
Demystifying Artificial Intelligence in Nonprofits - Webinar Recap


Demystifying AI for Nonprofits - Practical Use Cases, Ethical Concerns, and How to Get Started

I recently gave a talk about artificial intelligence that was specifically catered to those in the nonprofit world. Here's a recap of the talk using Simon Willison's annotated talk format.


Introduction - AI is a tool for everybody.

I firmly believe that AI is a tool for everyone.

I’ve been immersed in technology ever since I built my first website at eight years old. For the last three decades, I've eagerly followed every major technological breakthrough, examining each through the lens of "okay, so what's useful with this one?"

This recent breakthrough in AI technology, in particular, gives me the same level of excitement that I got when I built my first website or jailbroke my iPhone for the first time.

There is so much potential with AI, and the best part is that you don't need to know everything about AI in order to get value from it—just a bit of training on how to integrate these tools into your life.

Think about your car: unless you're a gear head, you probably don't know the first thing about how pistons work within an engine, and yet you don't need to know that in order to drive it efficiently. You do, however, need to take classes to learn how to operate it properly and safely.

The same goes for these new artificial intelligence tools. And here's some good news: like all of your ancestors before you, you can totally figure out how this new tool works with just a little guidance.

My hope is that this talk serves as the first step in your training process for learning about AI. You should leave here with a basic understanding of how these tools are designed to work, as well as some ideas for how to incorporate them into your life.


What is AI? - Artificial Intelligence is a field of science studying how to get computers to reason, learn, and act like humans.

So, what is artificial intelligence?

Artificial intelligence is a field of science focused on getting computers to act, think, and reason like humans.

Human intelligence, unlike other forms we see in nature, excels at pattern recognition and decision-making—two complex skills that AI aims to replicate.

A graph showing a select sampling of the various offshoots within artificial intelligence (e.g. machine learning, natural language processing, computer vision, etc.)

A common misconception about artificial intelligence is that it's one thing. While there are some who are working on artificial general intelligence (like HAL-9000), most researchers in the AI space aren't working on building an all-purpose form of intelligence. Instead, they focus on digitizing specific areas of intelligence.

For instance, natural language processing helps computers understand not just words but the meaning behind them, while computer vision enables machines to recognize and process visual information.

Each of these offshoots serves unique functions.

A helpful analogy is to think of AI as a toolkit, like walking into a hardware store and asking for a hammer.

The clerk will likely ask which kind because there are various types—sledgehammers, jackhammers, ball-peen hammers, etc.

AI is similar; you need to know what problem you’re solving in order to choose the right tool.

Recently, advancements in AI have led to generative AI models, like ChatGPT and Google’s Gemini, which can create new content. But to understand where generative AI fits, let’s discuss some foundational AI concepts.

Artificial intelligence is the parent circle, which contains all the disciplines we use to teach computers how to do "human things".

Artificial intelligence, as we discussed earlier, is a broad field focused on teaching computers to perform human-like tasks.

Within artificial intelligence, we can use machine learning to get a computer to teach itself without humans explicitly programming it.

Within the broad field of artificial intelligence, there's machine learning, where we teach computers to learn without direct human programming.

Within machine learning, deep learning enables machines to build representations of how complex things work in real life.

A subset of machine learning is deep learning, which allows computers to create complex digital representations of real-world objects.

Within deep learning, Generative AI creates new content based on patterns it learned through training.

After reaching this level, we enter generative AI, where computers use learned representations to generate new content based on recognized patterns.


Machine learning relies on labelled data (e.g. this is a picture of a traffic light and this is *not* a picture of a traffic light).

To explain machine learning, imagine teaching a computer to recognize a traffic light.

You’d feed it thousands of pictures of traffic lights and train it to differentiate between images that contain traffic lights and images that don't.

After undergoing thousands (or even millions) of tests, the computer program can predict with increasing accuracy, for example, “Yes, this is a traffic light,” or “No, this is not a traffic light.”

You have to decide up front what you want to call a "traffic light."

You want to make sure during its training that you give it data relevant to the task you want it to perform.

For example, edge cases arise:

  • Do you want your model to say that a hand-drawn traffic light counts as a traffic light?
  • Some countries don't use traffic lights, but rather use humans to direct traffic... do those count?
  • Newer traffic lights are geared toward specific modes of transportation, like bicycles. Are those traffic lights?

As you make these decisions and label your data accordingly, the training process leads to a model capable of identifying traffic lights based on patterns it’s learned.

You are helping with that labeling process every time you do a Captcha.

(By the way: every time you fill out a Captcha online, you are helping Google to train its models to recognize various elements it may encounter on the road. Thanks for the free labor, everyone!)
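If you're curious what that training step actually looks like, here's a minimal sketch in Python using scikit-learn. The image "features" here are synthetic stand-ins; in a real project you'd extract these numbers from your labeled pictures.

# A minimal sketch of the supervised learning workflow described above,
# using scikit-learn with synthetic data standing in for real image features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Pretend each image has been reduced to 10 numeric features.
# Label 1 = "traffic light", label 0 = "not a traffic light".
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a made-up labeling rule

# Hold out some images so we can test the model on data it has never seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = LogisticRegression()
model.fit(X_train, y_train)  # "training" = finding patterns in labeled data

# The model now *predicts* labels for images it was never shown.
print("Accuracy on unseen images:", model.score(X_test, y_test))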

Deep learning takes machine learning a step further by identifying more complex elements within its training data and making even more nuanced predictions.

Machine learning is cool and has a ton of practical use cases, but what if we wanted to have the computer understand something more complex, like the color of the traffic light?

Neural networks are the form of AI that lets us pass in an image and have it tell us more detailed information about that image without humans expressly programming it to do so.

Deep learning takes machine learning a step further, using neural networks to analyze data in stages. At each stage, the network gathers specific details—colors, shapes, textures—and then combines these details into a fuller, more nuanced picture, like a detective piecing together a mystery from small clues.

With our traffic light example, each layer in our neural network focuses on specific aspects of the image, such as color, shapes, or textures, to interpret complex visuals, like recognizing whether a traffic light is red, yellow, or green.

Deep learning helps computers identify the color of a traffic light in any condition (daytime/nighttime, rain/clear, etc.)

This depth is essential, especially in dynamic environments like self-driving cars, where traffic lights look different depending on the time of day, weather conditions, or lighting.

With enough examples, deep learning models can accurately identify traffic lights in all these conditions, forming the backbone of many AI applications, including autonomous vehicles and medical imaging.
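For the technically curious, here's a rough sketch of what those "layers" look like in code, using PyTorch. The architecture is a made-up toy for illustration, not anything a self-driving car actually ships.

# A toy deep learning model in PyTorch: each layer extracts progressively
# more abstract features (edges -> shapes -> "which color is lit").
# This architecture is illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early layers: edges and colors
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper layers: shapes and textures
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 3),                   # final layer: red / yellow / green scores
)

# One fake 64x64 RGB image in, three scores out (one per light color).
fake_image = torch.randn(1, 3, 64, 64)
print(model(fake_image).shape)  # torch.Size([1, 3])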

All Machine Learning is just prediction!

The big takeaway about machine learning and deep learning is that they're primarily tools for making well-optimized predictions based on patterns in past data. They use advanced probability and optimization to make 'best-guess' predictions—calculations that may seem insightful but are based purely on mathematical patterns, not true understanding.

None of this stuff is actually "alive" or "conscious" (as best we can tell... more on that in the "black box problem" section below).

All it is doing is saying "based on what I've learned while training on the data you gave me, I am making a prediction that this image contains a traffic light, or this image contains a 'green' traffic light."

Generative AI systems predict what word is most likely to come next in a sentence

Now, let’s take it further.

What if instead of guessing what is inside an image, we can take these models and have them predict what word comes next in a sentence?

That's what generative AI is doing!

Large Language Models (LLMs) are trained on tons of text to predict what word will most likely appear next.

By training a neural network on vast amounts of text—like public domain books, Reddit comments, and YouTube transcripts—the model becomes exceptionally skilled at predicting the next word in a sentence, mimicking human-like responses across countless topics.

And that's what a large language model does!

If you give a prompt to one of these systems, it will use all the patterns it recognized in training and spit out a very convincing answer to your prompt.
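To make "one word at a time" concrete, here's a toy Python loop. The probability table is invented for the demonstration; a real LLM computes these numbers with a neural network over tens of thousands of possible tokens.

# A toy illustration of next-word prediction. The probabilities are made up;
# a real LLM computes them with a neural network.
next_word_probs = {
    "I am going to the": {"store": 0.6, "gym": 0.3, "moon": 0.1},
    "I am going to the store": {"to": 0.8, "today": 0.2},
    "I am going to the store to": {"pick": 0.7, "buy": 0.3},
}

sentence = "I am going to the"
for _ in range(3):
    candidates = next_word_probs.get(sentence, {})
    if not candidates:
        break
    # Pick the single most likely next word (this is called greedy decoding).
    best_word = max(candidates, key=candidates.get)
    sentence = f"{sentence} {best_word}"

print(sentence)  # "I am going to the store to pick"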

There are lots of ways to predict content... you can do this with text, images, and even audio!

And even more impressive: you can run these models across all kinds of mediums.

Because under the hood, all generative AI tools (ALL of them!) are just running statistical predictions to guess at what is the most likely thing to happen.

If you want a model that can predict what word would come next in a sentence, you'd use ChatGPT or Gemini or Claude.

Images? Midjourney, DALL-E, etc.

Music? Suno.

Let’s pretend to be an LLM together!

At this point, I imagine you are either thinking I'm talking about witchcraft, magic, or complete gibberish... and I suppose at some level, each of those is possible.

But stick with me here while I drive home this point about how these prediction systems work by having the audience here be my collective large language model.

So I'll give you a prompt, and I want you to fill in the blank:

I am going to the store to pick up a gallon of ______?

If I ask you "I am going to the store to pick up a gallon of ______", what would you likely fill that in with?

(In this case, the live audience of this webinar universally said "milk", but I've also heard people say "ice cream", and I can definitively say that those are my kind of people.)

There's one small problem though: I actually didn't get the answer I was looking for. 😬

So I'm gonna give you a different prompt and see if I can get the answer I was looking for.

I am going to the hardware store to pick up a gallon of ______?

"I'm going to the hardware store to pick up a gallon of ______".

(In this case, the live audience universally said "paint", which was the word I was originally looking for.)

When you read the sentence for my first prompt and see "store", you subconsciously tap into your previous experiences with the word. If you grew up in Minnesota like me, you associate the word "store" with concepts like "grocery store", "Target", or "Walmart."

In that context, you are gonna be thinking about what they sell by the gallon in those places. Again, that's likely milk or ice cream.

In my second prompt, your brain is airlifted out of Target and dropped into a Menards or Home Depot. In this new context, you aren't thinking about milk anymore. You're thinking about paint, oil, water, or other chemicals that are sold by the gallon.

This shift in prompt context illustrates how generative AI works: it predicts based on the most likely answer, given the context.

Recap of Generative AI:

1. Machine learning tools are only making predictions. (They don't “know” anything)
2. Generative AI is trained on tons of data to recognize patterns
3. It predicts the most likely next word (or words) to answer a given prompt (store / hardware store)

So, in summary: machine learning and deep learning models are about making predictions based on patterns in data.

Generative AI takes that one step further, creating new content based on what’s likely to come next in a sequence.

What is the point of all this?

I get that this is a lot, and it's overwhelming to have sixty years of advancements in machine learning thrown at you in about ten minutes.

So let's get to the point of all of this. Why does it matter that we have a computer program that just predicts the most likely word to finish a sentence?

Because it turns out that there are plenty of cases where it's really helpful to get the most likely response to a question!

It's not like you'd want to trust these things implicitly, because as we know, life doesn't always align with what is average.

So when we say "don't trust these things because they're not telling the truth", we mean it! They're not built to be "truthful"; they're built to be "the most likely to be truthful" (which is a big nitpick, for sure, but an important nuance to understand when working with AI!).

Take legal advice, for example. Again, do not trust these things for legal advice, but let's say you need to draft a non-disclosure agreement.

In the old days, you would go to a lawyer who would pull out their own template, make some specific modifications to fit your needs, and pass it along. There's three delicious billable hours right there.

Today, you could go to a large language model and describe the sort of things you'd want your NDA to contain. The LLM would then give you the most likely provisions that are included in NDAs. You could then take that draft and shoot it to your attorney for review. That's 30 billable minutes instead of 3 billable hours.

That's the power of AI. That's why I'm so excited for these generative AI tools. They aren't going to replace humans; they're going to augment them.


Tip 1: Get your own hands dirty.

Let’s move on to some practical tips for adopting AI in your organization.

My first tip: you gotta get your own hands dirty and get hands-on experience with these tools.

As a leader, experimenting directly with these tools will help you understand their potential and limitations.

In my career so far, I've noticed that most companies follow a path of hiring consultants to come in and help them adopt new technology. With AI, I encourage you to get familiar with it yourself before shelling out for third party advice.

Action step: Encourage yourself and your employees to use AI tools like ChatGPT for small tasks—drafting emails, summarizing reports, or answering questions—and share what they've learned with the team.

Tip 2: Encourage psychological safety.

My second tip is to foster psychological safety.

AI adoption requires trial and error, and studies show many employees hesitate to use AI tools at work due to fears of being seen as cheating or potentially automating themselves out of a job.

Create a culture where experimenting with AI is encouraged and celebrated.

Action step: Try running an “AI hackathon” where employees explore AI tools in a low-stakes environment, share their findings, and foster team learning.

Tip 3: Clean data is everything.

Third: clean data is essential.

AI models are only as good as the data they’re trained on, so ensure your organization’s data is organized and free from errors. The better your data, the better your AI models will perform.

And as we'll discuss in the pitfalls section: "dirty" data will lead to biased and inaccurate results.

Action step: Every company has at least one person who loves working with spreadsheets; tap into their skills to spearhead data-cleaning initiatives.
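If your data lives in spreadsheets, even a few lines of Python with pandas can knock out the most common problems. The file name and column names below are hypothetical stand-ins for whatever your organization tracks.

# A minimal data-cleaning sketch with pandas. "donors.csv", "email", and
# "donation_amount" are hypothetical names; substitute your own.
import pandas as pd

df = pd.read_csv("donors.csv")

df = df.drop_duplicates()                          # remove duplicate rows
df["email"] = df["email"].str.strip().str.lower()  # normalize formatting
df = df.dropna(subset=["donation_amount"])         # drop rows missing key fields

df.to_csv("donors_clean.csv", index=False)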

Tip 4: Start small, build up from there.

The fourth tip: start small.

Don’t try to replace entire workflows with AI right away. Start small, focusing on simple, manageable projects, and scale based on what works.

A great place to start is inviting an AI bot into your virtual meetings to record notes and generate summaries. Be careful to not set it up to "auto join" every meeting (you probably don't want it in a sensitive HR meeting, for example), but give that a try and see how it performs for you.

Action step: Try using AI to do event survey analysis, basic donor segmentation, or create copy for your newsletters or social media channels.

Tip 5: Iterate on your prompts.

Finally, I can't overstate the importance of continually iterating and improving on your prompts.

Remember our "store/hardware store" example? One word made a world of difference in the output.

Similarly, providing an LLM with a prompt like "Summarize this report" will yield different results from "Create a one-paragraph summary highlighting the most important program outcomes from this report."

The field of research which tries to figure out how to get the most out of these tools is called "prompt engineering". You can find tons of great resources online and on YouTube for how to best phrase things for different types of models. For example, the prompts that work best for ChatGPT are different from those that work best for Claude, and the prompts you use for a text generator will be different from those you use for an image generator like Midjourney.

Prompt Chaining

• Prompt 1: You are an expert at filling out grant applications. Review this grant application and our organization’s mission statement. Provide a list of tangible ways we are best suited to win this application.

• Prompt 2: Using the list you generated in the previous prompt, create a cover letter for our grant application highlighting the ways we align with the grant’s purpose.

A prompt engineering trick that I use all the time is called "prompt chaining."

Prompt chaining involves using the result from one prompt as the foundation for the next prompt.

Instead of asking an LLM to generate a cover letter for a grant application, you could first ask an LLM to review both a grant application and your organization's mission, and then provide a list of areas where there are synergies.

Then, you can take the results from that and ask it to write the letter.

Giving the models time to reason through their answers tends to lead to better outcomes.
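Here's a rough sketch of prompt chaining in code. The ask_llm() helper is a hypothetical stand-in for whichever chat API you use; the key idea is that the second prompt embeds the first prompt's output.

# A sketch of prompt chaining. ask_llm() is a hypothetical helper that wraps
# your LLM provider's API; the point is feeding prompt 1's output into prompt 2.
def ask_llm(prompt: str) -> str:
    """Hypothetical: send `prompt` to your LLM provider and return its reply."""
    ...

grant_application = "[full text of the grant application]"
mission_statement = "[your organization's mission statement]"

prompt_1 = (
    "You are an expert at filling out grant applications. "
    f"Review this grant application:\n{grant_application}\n\n"
    f"And our organization's mission statement:\n{mission_statement}\n\n"
    "Provide a list of tangible ways we are best suited to win this grant."
)
synergies = ask_llm(prompt_1)

prompt_2 = (
    f"Using this list of ways we align with the grant:\n{synergies}\n\n"
    "Create a cover letter for our grant application highlighting that alignment."
)
print(ask_llm(prompt_2))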

An example of chain of thought prompting

Another prompt engineering trick I frequently reach for is called chain of thought.

With this technique, you are asking an LLM to think about a given problem from three distinct perspectives. You then ask it to act as one of those personas and critique the responses of the other two. Finally, you combine the results into a well-considered and well-rounded answer.

As an example: my son does not like to eat pizza. I know... it bums me out, too.

I provided ChatGPT with a bunch of backstory on my son and what we've tried to do to encourage him to try pizza. Then, I told it: "Pretend you are a kindergarten teacher, a child psychologist, and a grandparent. As each of those personas, tell me what approach you would take to get my son to eat pizza."

Next, I ask it, as each persona, to reflect on the answers of the other personas. For example, the child psychologist persona would consider the kindergarten teacher's and grandparent's perspectives and adjust its own response.

Finally, after all personas have reflected on each other's answers, I have the model summarize the best path forward.

This trick works exceptionally well across many different kinds of problems. As an engineer, I use it to consider system changes from the perspectives of an engineer, an end user, and a business executive. It can surface insights you may have otherwise missed.
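If you'd rather script that persona dance than type it out by hand, here's a sketch, reusing the same hypothetical ask_llm() helper from the prompt chaining example.

# A sketch of the persona-based chain-of-thought flow described above.
# ask_llm() is a hypothetical helper wrapping your LLM API of choice.
def ask_llm(prompt: str) -> str:
    """Hypothetical: send `prompt` to your LLM provider and return its reply."""
    ...

backstory = "[backstory about my son and what we've tried so far]"
personas = ["kindergarten teacher", "child psychologist", "grandparent"]

# Step 1: get each persona's initial approach.
answers = {
    p: ask_llm(f"{backstory}\nAs a {p}, what approach would you take to get my son to eat pizza?")
    for p in personas
}

# Step 2: have each persona reflect on the others' answers and adjust.
revised = {}
for p in personas:
    others = "\n".join(f"{q}: {answers[q]}" for q in personas if q != p)
    revised[p] = ask_llm(f"As a {p}, review these other perspectives and adjust your advice:\n{others}")

# Step 3: synthesize everything into one well-rounded recommendation.
print(ask_llm(f"Summarize the best path forward from these perspectives:\n{revised}"))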

Tips for Adopting AI

So in order to integrate AI successfully, treat it as a tool that augments, rather than replaces, human judgment.

Every time I fire up an AI assistant, I like to think of it as an eager intern who is exceptionally smart but exceptionally naive. I do not take its output as gospel; rather, I use it as a foundation and build on it from there.

The best way to integrate AI into your workflows is to use it for routine tasks, and keep human oversight for critical decisions.

Finally, I'll take this time to further emphasize that all AI outputs are based on probability, not the truth. Always review and adjust outputs as needed.


Ethical Considerations & Pitfalls: Bias in AI

Alright, we've covered what artificial intelligence is, and we've gotten through some tips for adopting AI into your organization.

Now, let's talk about areas where AI can fall flat.

First: bias.

If you recall, at the beginning of this talk, we described artificial intelligence as being focused on getting computers to be like humans.

Humans are inherently biased, and AI, trained on human-generated data, often reflects this bias. Achieving true “unbiased” AI is a complex, if not impossible, task.

I propose you think of AI in the same context: there is no such thing as an unbiased AI model.

AI models are only as good as the data with which you train them. Data is one of those things you can pretty easily screw up if you aren't attuned to all of the various forms of bias that could creep into it.

Examples of Bias in AI (Stereotyping Bias, Measurement Bias, and Selection Bias)

There are many different kinds of bias, but I wanted to highlight three specific forms as a starter:

Stereotyping bias: This occurs when AI models perform less accurately for certain groups due to their underrepresentation or misrepresentation in training data, as seen with YouTube's automatic captions, which struggle with Scottish, Indian, and African American accents.

Measurement bias: This happens when an AI model’s metrics or algorithms lead to systematically skewed outcomes, such as the Apple Card’s algorithm offering men higher credit limits than women with similar financial profiles.

Selection bias: This arises when training data lacks sufficient diversity, causing models to underperform for certain groups; for instance, breast cancer detection AI trained mainly on female patients performs less accurately for male patients.

There are many more forms of bias that you can research on your own, but the main takeaway here is that all systems are subject to bias depending on the data used to train them. For this reason, you can't just rely on the output of an AI-led decision.

Ethical Considerations & Pitfalls: The 'Black Box Problem'

As mentioned earlier, another major issue is the “black box” problem.

Deep learning models are like locked safes—each layer hides its ‘reasoning’ behind many interconnected processes, making it nearly impossible for humans to interpret every decision-making step.

This lack of transparency, especially in high-stakes areas like criminal justice or credit scoring, means we’re left trusting the ‘safe’ without ever seeing inside, creating ethical and practical risks.

Once again, this is a reminder that we can’t just accept AI output as absolute truth; careful consideration and oversight are needed to avoid unintentional discrimination or bias.

Ethical Considerations & Pitfalls: It can’t do everything!

Literally every single time new technology drops, some wise guy emerges from the crowd and says, "well, I can't use [insert new tech] to do [insert obvious use case]".

Earlier in this talk, I led off by saying "AI is for everyone." Notice how I didn't say "AI is for every thing."

Of course you can't use AI for everything! AI is not a magic bullet. You gotta know how to deploy it effectively, and where it shines is automating predictable, repetitive tasks.

Yes, wise guy, you are right: you aren't gonna want to deploy AI while leading a camping expedition in the Boundary Waters.

But after you complete your expedition and ask for feedback from the program's participants, you could use AI to process those responses and bucket them into understandable and actionable groups.

Ethical Considerations & Pitfalls: Content is (Literally) Average

If you've been paying attention during this entire talk, you'll notice I keep saying things like "AI is picking the most likely word to finish a sentence" and "machine learning is used to make predictions."

If you are relying on a tool to create the most likely response to something, you'll see quickly that the responses are kinda... average.

This can be advantageous, but it's also something to be aware of. By using output that is average by design, you run the risk of blending into everything else out there. (This, by the way, leads to the rise of slop, which is the AI equivalent of spam).

Now, this may be a trade off you are willing to accept in many cases. I, for one, often use AI as a therapist to help me make sense of some thoughts swirling around in my head. This works great, but I use the advice and feedback I get from the model and take it to a human therapist.

The other thing about the content being average: remember how we said that AI doesn't care about truthiness, but rather it cares about finding the thing that is most likely to be truthful? This leads to some concerning behavior called "hallucination", where it will make up facts which aren't actually facts.

You may recall headlines from a year ago where a lawyer used ChatGPT and it hallucinated cases. This sort of thing happens all the time with new technology, especially when it's used by people who aren't properly trained on how to use it (or are swayed by glitzy marketing campaigns which make promises that it can't possibly deliver).

Ethical Considerations & Pitfalls
Mitigation Strategies
- Use AI to assist, but keep human oversight
- Review AI outputs for biases and accuracy
- Make adjustments as needed

Now that you're aware of the pitfalls and risks of using artificial intelligence, how can you mitigate those risks?

Always treat AI as a supportive tool, maintaining human oversight—especially for important decisions where ethics and accuracy are critical.

Always review AI outputs for potential bias and inaccuracies.

Finally, adjust AI-generated content as needed to match your style and objectives. For instance, AI may draft a social media post, but tweaking it to align with your brand's voice adds value.


What's next? Spend ten hours doing tasks with generative AI!

We've covered what AI is, practical tips for adopting it, ethical concerns, and common pitfalls.

So, what's next for you?

Begin by dedicating 10 hours to using generative AI tools to build practical familiarity.

Try asking questions in areas you know well to see how AI performs, and notice where you’d add or change things.

Sharing what you learn with your team encourages experimentation and fosters a learning environment.


Prompt Engineering: How to Think Like an AI


a fluffy baby orange kitten with a fluffy baby puppy on its back in a grassy field with an epic sunrise in the background

The first time I opened ChatGPT, I had no idea what I was doing or how I was supposed to work with it.

After many hours of watching videos, playing with many variations of the suggestions included in 20 MIND-BLOWING CHATGPT PROMPTS YOU MUST TRY OR ELSE clickbait articles, and just noodling around on my own, I came up with this talk that explains prompt engineering to anyone.

Ah, what is prompt engineering, you may be asking yourself? Prompt engineering is the process of optimizing how we ask questions or give tasks to AI models like ChatGPT to get better results.

This is the result of a talk that I gave at the 2023 AppliedAI Conference in Minneapolis, MN. You can find the slides for this talk here.

Regardless of your skill level, by the end of this blog post, you will be ready to write advanced-level prompts. My background is in explaining complex technical topics in easy-to-understand terms, so if you are already a PhD in working with large language models, this may not be the blog post for you.

Okay, let's get started!

I know nothing about prompt engineering.

That's just fine! Let's get a couple definitions out of the way.

Large language model (LLM)

Imagine you have a really smart friend who knows a lot about words and can talk to you about anything you want. This friend has read thousands and thousands of books, articles, and stories from all around the world. They have learned so much about how people talk and write.

This smart friend is like a large language model. It is a computer program that has been trained on a lot of text to understand language and help people with their questions and tasks. It's like having a very knowledgeable robot friend who can give you information and have conversations with you.

While it may seem like a magic trick, it's actually a result of extensive programming and training on massive amounts of text data.

What LLMs are essentially doing is, one word at a time, picking the most likely word that would appear next in that sentence.

Read that last sentence again.

It's just guessing one word at a time at what the next word will be.

That's a lot of words, Tim. Give me a demonstration!

Let's say we feed in a prompt like this:

I'm going to the store to pick up a gallon of [blank]

You might have an idea of what the next best word is. Here's what GPT-4 would say is the next most likely word to appear:

  • Milk (50%)
  • Water (20%)
  • Ice cream (15%)
  • Gas (10%)
  • Paint (5%)

I would've said "milk," personally... but all those other words make sense as well, don't they?

What would happen if we add one word to that prompt?

I'm going to the hardware store to pick up a gallon of [blank]

I bet a different word comes to mind to fill in that blank. Here's what the next word is likely to be according to GPT-4:

  • Paint (60%)
  • Gasoline (20%)
  • Cleaning solution (10%)
  • Glue (5%)
  • Water (5%)

All of those percentages are based on what the AI has learned from training on a massive amount of text data. It doesn't have opinions or preferences; it's just guessing based on patterns it has observed.
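If you want to peek at these probabilities yourself, smaller open models make it easy. Here's a sketch using Hugging Face's transformers library with GPT-2; the exact numbers you'll see differ from the GPT-4 estimates above.

# A sketch of inspecting next-token probabilities with an open model (GPT-2).
# Requires the `transformers` and `torch` packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "I'm going to the hardware store to pick up a gallon of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Turn the scores for the very next token into probabilities; show the top 5.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {p.item():.1%}")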

Prompt Engineering

You know how we changed one word in our prompt to get a different set of answers? That's prompt engineering!

A textbook definition might look like this:

Prompt engineering is optimizing an AI prompt for an accurate, relevant, and actionable response.

A less geeky definition might look like this:

Prompt engineering is getting really good at telling an AI what we want so it can give us back the best possible response.

A prompt engineer also understands the underlying patterns in how the AI works. Now that you have made it this far, you are ready to actually tackle those underlying patterns!

I understand the concept.

Wonderful! You're well on your way to becoming a prompt engineer. Let's get a little more in depth.

When it comes to prompt engineering, you will want to keep four big points in mind:

  1. Write clear and specific instructions
  2. Give the model time to reason
  3. Understand the limitations of the technology
  4. Always be iterating

Let's tackle all of these one at a time.

Write clear and specific instructions

My years of training at the prestigious University of Minnesota journalism school taught me the importance of writing clearly in news articles.

Yet unlike optimizing your words for a narrow newspaper column, when it comes to large language models, clear does not mean short.

Here's a few tips for making your prompts clear and specific:

Tip 1: Flattery is everything

Let's start with this prompt:

Give me three marketing campaign ideas.

Here's what you might get back:

Sure, here are three marketing campaign ideas:

1. Social Media Influencer Campaign: Collaborate with popular social media influencers who have a large following in your target market to promote your product or service. This campaign could include sponsored posts, product reviews, or influencer takeovers of your social media accounts.

2. Cause Marketing Campaign: Create a campaign that ties your product or service to a cause that aligns with your brand values. For example, if you sell eco-friendly products, you could partner with an environmental nonprofit and donate a portion of your profits to their cause. This type of campaign can help you connect with socially conscious consumers and build brand loyalty.

3. Interactive Digital Campaign: Create an interactive digital experience that engages your target audience in a fun and unique way. For example, you could create a game or quiz that ties into your product or service, or a virtual reality experience that showcases your brand's values or products. This type of campaign can help you stand out from your competitors and generate buzz on social media.

I think we could do better. These tips are quite generic and obvious to anyone who spent twenty minutes googling "marketing campaign ideas."

What about a prompt like this?

You are an expert in B2B marketing with a special focus on custom homebuilders who construct multi-million dollar homes. Give me three marketing campaign ideas.

Here's what you might get back:

Sure, here are three marketing campaign ideas for custom homebuilders who construct multi-million dollar homes:

1. Luxury Lifestyle Content Campaign: Create a content marketing campaign that focuses on the luxury lifestyle that your homes offer. This could include blog posts, videos, and social media posts that showcase the unique features and amenities of your homes, as well as the surrounding communities. You could also partner with luxury lifestyle publications and influencers to reach a wider audience.

2. Custom Home Showcase Event: Host a custom home showcase event to give potential buyers an up-close look at the quality and craftsmanship of your homes. This could include a tour of a recently completed home, a meet-and-greet with the builders and designers, and a networking event with other high-end vendors such as interior designers, luxury car dealerships, and private jet companies.

3. Personalized Direct Mail Campaign: Create a personalized direct mail campaign to reach out to high-net-worth individuals in your target market. This could include sending a custom-designed brochure that highlights your unique selling propositions, as well as a personalized letter that speaks to their specific needs and desires. You could also consider sending luxury gift boxes with branded items such as champagne flutes or gourmet chocolates to make a lasting impression.

These ideas are so much better because you placed the large language model in a specific role. This allows it to be specific with its feedback.

LLMs (especially ChatGPT) are great at role playing like this. You can have it pretend to be virtually anything... have it be a grandmaster chess player and critique your strategy!

However, it's important to be realistic with your flattery. Remember: LLMs are trained on stuff that actual humans wrote. If you say something like "pretend you're an alien with an IQ of 9000," it will hallucinate an answer (more on hallucinations in a bit).

The reason our marketing example worked so well is because marketing people have big egos and love to write blog posts to share what they've learned. (Hey, wait a minute...)

Tip 2: Provide additional context

Remember how we specified hardware store and got more contextually-appropriate answers?

Let's see how that plays out with another prompt:

Tell me about climate change.

Here's what you might get back:

Climate change refers to the long-term changes in the Earth's climate, including changes in temperature, precipitation patterns, and weather events, that have been observed over the past century and are projected to continue into the future.

The primary cause of climate change is the increase in atmospheric concentrations of greenhouse gases (GHGs), such as carbon dioxide, methane, and nitrous oxide. These gases trap heat in the Earth's atmosphere, leading to an overall warming of the planet. The burning of fossil fuels for energy is the main source of GHG emissions, but deforestation, agriculture, and industrial processes also contribute to the problem.

Again, that's not entirely useless information, but I think we could do better.

What if you were specifically interested in how climate change impacts global sea levels?

Can you discuss the impact of climate change on global sea levels?

Here's what you might get back:

Climate change has contributed to the rising of global sea levels, primarily through the melting of glaciers and ice caps and the thermal expansion of seawater. As the planet warms, glaciers and ice caps around the world are melting at an accelerated rate, which is adding more water to the oceans. Additionally, as seawater warms, it expands and takes up more space, further contributing to sea level rise.

Sea level rise can have significant consequences for coastal communities and infrastructure. It can lead to flooding, erosion, and storm surges that can damage buildings, roads, and other critical infrastructure. These impacts are expected to become more severe and frequent as sea levels continue to rise.

A more contextually-specific prompt is going to give you a more contextually-specific response.

Tip 3: Ask for structured output

Let's consider this prompt:

Tell me about the solar system

You can probably guess where this is going, but here's what you might get back:

The solar system is the collection of planets, moons, asteroids, comets, and other celestial objects that orbit around a central star, the Sun. It is located in the Milky Way galaxy, about 25,000 light-years from the galactic center. The solar system is about 4.6 billion years old, and it formed from a rotating cloud of gas and dust called the solar nebula.

The eight planets of the solar system are Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. The four inner planets (Mercury, Venus, Earth, and Mars) are rocky and relatively small, while the four outer planets (Jupiter, Saturn, Uranus, and Neptune) are much larger and made mostly of gas and ice.

The solar system also contains numerous moons, with each planet having its own set of natural satellites. Some of the most famous moons include the Earth's Moon, Jupiter's Europa and Ganymede, and Saturn's Titan.

In addition to the planets and moons, the solar system contains a large number of smaller objects, such as asteroids and comets. Asteroids are rocky objects that orbit the Sun, while comets are made of ice and dust and have a characteristic tail when they pass near the Sun.

The solar system is held together by gravity, with the Sun being the most massive object and therefore exerting the strongest gravitational pull. The movements of the planets and other objects in the solar system are governed by the laws of physics, particularly gravity and orbital mechanics.

The study of the solar system is an important area of research in astronomy and planetary science, as it helps us understand the origins and evolution of our own planet and the universe as a whole.

Again, this isn't awful... in fact, we could take a moment to be impressed with such a robust reply to such a basic prompt.

But I think we can do better!

What if we asked it for a list?

Give me a list of planets in the solar system along with one interesting fact about each planet.

Here's what you might get back:

Sure, here's a list of the eight planets in our solar system along with an interesting fact about each one:

1. Mercury - Mercury is the smallest planet in the solar system and also the closest planet to the Sun. Its surface temperature can reach up to 800 degrees Fahrenheit during the day, but it drops to -290 degrees Fahrenheit at night.

2. Venus - Venus is often called Earth's "sister planet" because it is similar in size and composition, but its atmosphere is incredibly thick and hot, with temperatures that can reach up to 864 degrees Fahrenheit.

3. Earth - Our home planet is the only planet in the solar system known to support life. It has a powerful magnetic field that protects us from the solar wind, and its atmosphere helps regulate the planet's temperature.

4. Mars - Mars is often referred to as the "Red Planet" because of its distinctive rusty color. It is home to the largest volcano in the solar system, Olympus Mons, which stands at over 13 miles high.

5. Jupiter - Jupiter is the largest planet in the solar system, with a diameter of over 86,000 miles. It also has the most moons of any planet, with over 79 confirmed satellites.

6. Saturn - Saturn is known for its beautiful rings, which are made up of ice particles ranging in size from dust to small boulders. It is also the least dense planet in the solar system - if there was a big enough bathtub, Saturn would float!

7. Uranus - Uranus is unique in the solar system because it rotates on its side. Its axis is tilted at an angle of 98 degrees, so its poles lie where most other planets have their equators.

8. Neptune - Neptune is the farthest planet from the Sun and is known for its striking blue color. It is also home to the strongest winds in the solar system, with gusts that can reach up to 1,200 miles per hour.

Poor Pluto... Neil DeGrasse Tyson really did a number on it, didn't he?

This is pretty good, but I think we could do one step better. Given I'm a nerd and spend a fair amount of time in JSON-land, here's a very specific prompt:

Provide me with a JSON output of the planets in the solar system with the following structure. When you are presented with square brackets, fill in that data with the correct information about the given planet:

{ planet: "Earth", distance_from_sun: "[distance in miles]", fun_fact: "[1 sentence that would be interesting to a 5 year old]" }

Here's what you might get back:

Here's a JSON output of the eight planets in our solar system, using the structure you provided:

[
  {
    "planet": "Mercury",
    "distance_from_sun": "36 million miles",
    "fun_fact": "Mercury is the smallest planet in the solar system, but it's also the fastest!"
  },
  {
    "planet": "Venus",
    "distance_from_sun": "67 million miles",
    "fun_fact": "Venus is the hottest planet in the solar system, even hotter than Mercury!"
  },
  {
    "planet": "Earth",
    "distance_from_sun": "93 million miles",
    "fun_fact": "Earth is the only planet in the solar system that we know has life on it!"
  },
  {
    "planet": "Mars",
    "distance_from_sun": "142 million miles",
    "fun_fact": "Mars has the tallest volcano in the solar system, called Olympus Mons!"
  },
  {
    "planet": "Jupiter",
    "distance_from_sun": "484 million miles",
    "fun_fact": "Jupiter is the largest planet in the solar system - you could fit all the other planets inside it!"
  },
  {
    "planet": "Saturn",
    "distance_from_sun": "886 million miles",
    "fun_fact": "Saturn has the most spectacular rings in the solar system - they're made up of ice and rock!"
  },
  {
    "planet": "Uranus",
    "distance_from_sun": "1.8 billion miles",
    "fun_fact": "Uranus is tilted on its side, so its seasons are really weird!"
  },
  {
    "planet": "Neptune",
    "distance_from_sun": "2.8 billion miles",
    "fun_fact": "Neptune is the windiest planet in the solar system - its winds can reach over 1,200 miles per hour!"
  }
]

We provided a sample of the structure we wanted, and it gave us back exactly what we were looking for. Not bad for a magic trick that's just guessing at the next right word, eh?

You don't have to be a nerd to benefit from this structure tip either. Ask for things in CSV format or in a table so you can copy it into Excel.

The key takeaway here is that LLMs really like having a clear structure in which they can provide their response. Structured prompts give the AI a clearer understanding of what's expected in the response, leading to more accurate and relevant answers.
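The practical payoff of structured output is that software can consume it directly. Here's a minimal sketch of parsing a JSON reply like the one above; note that models occasionally emit malformed JSON, so validating is worth the extra lines.

# A sketch of consuming structured LLM output. `llm_reply` stands in for the
# JSON text a model returned; real replies should be validated before use.
import json

llm_reply = """
[
  {"planet": "Earth", "distance_from_sun": "93 million miles",
   "fun_fact": "Earth is the only planet we know has life on it!"}
]
"""

try:
    planets = json.loads(llm_reply)
except json.JSONDecodeError:
    planets = []  # in practice: retry the prompt or repair the output

for planet in planets:
    print(f"{planet['planet']}: {planet['fun_fact']}")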

Speaking of this, our next tip will also help in setting your prompts up for success:

Tip 4: One/Few Shot Prompting

This is the fanciest-sounding tip of this section, but I promise it's not conceptually difficult for you to get. In fact, we did it in our previous example!

Let's start with a zero-shot prompt:

Continue the dialogue in a sci-fi movie.

You might get a response like this:

Character 1: "Captain, we have a problem. The warp drive has gone offline and we're stranded in deep space."

Captain: "What's the damage?"

Character 1: "It looks like there was a power surge that overloaded the system. We're lucky we didn't blow out the entire engine room."

Captain: "Can we fix it?"

Character 1: "We'll need to reroute power from the backup systems and replace some of the damaged components. But it's going to take some time."

Character 2: "Captain, we're picking up an unusual energy signature on our long-range sensors. It's coming from a nearby planet." [...]

Once again, this is impressive and not entirely terrible. With zero prompting, we were able to generate some dialogue that feels like it could've come from a sci-fi movie.

Let's get a little more specific with one-shot prompting:

Continue this line from a Star Wars movie: 'Luke, I am your father.'

Darth Vader: "Search your feelings, Luke. You know it to be true."

Luke: "NOOOOOOOOOOOOOOO!"

Luke backs away from Darth Vader, unable to believe what he's just heard. The revelation that his father, who he thought was a great Jedi Knight, was actually the villainous Darth Vader is almost too much for him to bear.

Darth Vader: "Join me, and together we can rule the galaxy as father and son."

Luke: "I'll never join you! You killed my father!"

Darth Vader: "I am your father, Luke. And you have much to learn about the ways of the Force."

Much better. We provided some sample information in the form of one-shot ("Luke, I am your father") and despite that exact phrase never appearing in a Star Wars movie (look it up), it does its best to continue the dialogue after that general phrase is spoken.

How could we improve on this? Why, with few-shot prompting, of course!

Here's our few-shot prompt:

Continue this line from a Star Wars movie: 'May the Force be with you.' Response: 'And also with you.'

Continue this line from a Star Wars movie: 'I’ve got a bad feeling about this.' Response: 'You always have a bad feeling about this.'

Now, continue this line from a Star Wars movie: 'Luke, I am your father.'

And here's a potential response:

Response: "Obi-Wan never told you what happened to your father."

Luke: "He told me enough! He told me you killed him!"

Darth Vader: "No. I am your father."

Luke stares in disbelief at Darth Vader, trying to process the information he has just been given. He had always thought his father was a great Jedi Knight who fought alongside Obi-Wan Kenobi, but now he realizes that his father is actually the man he has been fighting against.

Luke: "No, that's not true! That's impossible!"

Darth Vader: "Search your feelings, you know it to be true."

Luke looks down, realizing the truth of Vader's words. He had always felt a connection to the Force, and now he understands why. He is the son of one of the most powerful Force users in the galaxy. [...]

First thing to note: notice how it used the word Response: in its response? It followed the pattern we established with our sample few-shot prompt. It also provided us a different section of the dialogue.

Few-shot prompting helps give the LLM more insight into the response you are expecting.
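Under the hood, few-shot prompting is just string assembly: you stack example input/output pairs ahead of the real question so the model mimics the pattern. Here's a sketch using the examples from above.

# A sketch of assembling a few-shot prompt from example input/output pairs.
examples = [
    ("May the Force be with you.", "And also with you."),
    ("I've got a bad feeling about this.", "You always have a bad feeling about this."),
]

def build_few_shot_prompt(examples, question):
    parts = []
    for line, response in examples:
        parts.append(
            f"Continue this line from a Star Wars movie: '{line}' Response: '{response}'"
        )
    parts.append(f"Now, continue this line from a Star Wars movie: '{question}'")
    return "\n\n".join(parts)

print(build_few_shot_prompt(examples, "Luke, I am your father."))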

Give the model time to reason

Let's pump the brakes for a second and be very clear about what we mean with the word "reason."

LLMs are not human. They are just really good at following a set of instructions to provide the statistically appropriate response to your prompt. In this way, it might be more appropriate to say they simulate reasoning.

Here are a couple ways you can give the model time to reason its way to a more accurate response.

Tip 1: Specify the steps needed to complete a task

Sometimes, life is a little easier when we write down the steps we need to take before jumping in and doing something. LLMs benefit from the same strategy!

Let's come up with an absurd example to illustrate this point. Let's say I need to count the number of verbs in a translated summary of Star Wars and return that count (as well as the translation) in a JSON object.

If we just threw that into the LLM, it might give us a weird response.

Instead, here's a good prompt that gives the model the steps needed to complete a task:

Perform the following actions:

1. Summarize the following text delimited by 3 ampersands.
2. Translate the summary into German.
3. Count the number of verbs in the German summary.
4. Output a json object that contains the following keys: german_summary, num_verbs.

Separate your answers with line breaks.

&&& In a galaxy far, far away, a young farmer named Luke Skywalker discovered his true heritage as a Jedi Knight and embarked on a heroic journey to save Princess Leia, defeat Darth Vader, and restore balance to the Force. &&&

Here's a potential response to that prompt:

Luke Skywalker, a young farmer, discovers he's a Jedi Knight and goes on a journey to save Princess Leia, defeat Darth Vader, and restore balance to the Force in a galaxy far, far away.

Luke Skywalker, ein junger Bauer, entdeckt seine wahre Herkunft als Jedi-Ritter und begibt sich auf eine heroische Reise, um Prinzessin Leia zu retten, Darth Vader zu besiegen und das Gleichgewicht in der Macht wiederherzustellen, in einer weit entfernten Galaxie.

4

{
    "german_summary": "Luke Skywalker, ein junger Bauer, entdeckt seine wahre Herkunft als Jedi-Ritter und begibt sich auf eine heroische Reise, um Prinzessin Leia zu retten, Darth Vader zu besiegen und das Gleichgewicht in der Macht wiederherzustellen, in einer weit entfernten Galaxie.",
    "num_verbs": 4
}

Notice how it performs each step before generating the final JSON object.

Tip 2: Instruct the model to work out its own solution before coming to a conclusion

Imagine I walk up to you and say, "Here's a complex algebra equation. Solve for X. Now."

If I didn't give you a whole lot of time to reason your way through the problem, you are likely to freak out and just throw down the first response you have.

Instead, what if I said "Here's an algebraic equation. Solve for X and show your work?" You are much more likely to give the correct answer.

Here's a prompt we can give to a large language model:

Give me a meal plan without gluten.

And here's a possible response to that:

Here's a sample meal plan for a day without gluten:

Breakfast:
- Gluten-free oatmeal with almond milk, topped with fresh berries and sliced almonds.
- Scrambled eggs with sautéed spinach and sliced avocado.

Snack:
- Fresh fruit (e.g. apple, pear, or banana) with almond butter.

Lunch:
- Grilled chicken breast with roasted sweet potatoes and steamed broccoli.
- Quinoa salad with mixed greens, cherry tomatoes, cucumber, and chickpeas, dressed with olive oil and lemon juice.

Snack:
- Carrots and celery sticks with hummus.

Dinner:
- Baked salmon with roasted asparagus and brown rice.
- Grilled sirloin steak with roasted Brussels sprouts and mashed sweet potatoes.

Dessert:
- Fresh fruit salad with coconut whipped cream.

Remember to always read food labels and double-check ingredients to make sure they are gluten-free, as some products may contain hidden sources of gluten. Additionally, if you have celiac disease or a gluten sensitivity, it's best to consult with a registered dietitian or medical professional for personalized dietary advice.

This is... an answer. I would argue (a) this is not a meal plan because it only represents a single day, and (b) it's not very personalized to my specific needs. It even mentions things like celiac disease in its disclaimer at the bottom, but we aren't sure that this is actually a solid meal plan for us.

Let's try a prompt where we put the LLM in a position to work through its own solution:

Craft a balanced seven-day meal plan for a person with Celiac Disease, considering the following constraints:

1. Each day should include breakfast, lunch, dinner, and two snacks.
2. Meals should be diverse and not repetitive.
3. Each day's meals should meet the recommended daily intake for all essential nutrients.
4. All recipes must be gluten-free due to Celiac Disease.

Start by creating a list of potential gluten-free ingredients and their nutritional values. Then, use this list to generate a diverse selection of meals for each day, ensuring that the nutritional guidelines are met. Lastly, present the meal plan in a structured format, with each day separated and meals clearly labeled.

This feels like what an actual dietitian would do (or, perhaps, a dietitian who is just starting their career). They would start by creating a list of ingredients and determining their nutritional values. Then, they would work off of that list to create a meal plan that is nutritious and diverse.

Here's a possible response for this prompt:

List of Gluten-Free Ingredients and Their Nutritional Values:

- Quinoa: high in protein, fiber, magnesium, and iron
- Brown rice: high in fiber, magnesium, and selenium
- Sweet potatoes: high in vitamin A, vitamin C, and fiber
- Broccoli: high in vitamin C, vitamin K, and fiber
- Spinach: high in vitamin A, vitamin C, and iron
- Almonds: high in protein, healthy fats, and vitamin E
- Chickpeas: high in protein, fiber, and iron
- Salmon: high in omega-3 fatty acids, protein, and vitamin D
- Eggs: high in protein, vitamin B12, and vitamin D
- Greek yogurt: high in protein and calcium
- Berries: high in fiber and antioxidants
- Avocado: high in healthy fats and fiber

7-Day Gluten-Free Meal Plan:

Day 1:

- Breakfast: Quinoa breakfast bowl with Greek yogurt, berries, and sliced almonds.
- Snack: Apple slices with almond butter.
- Lunch: Broccoli and chicken stir-fry served over brown rice.
- Snack: Carrot sticks with hummus.
- Dinner: Grilled salmon with roasted sweet potatoes and sautéed spinach.

Day 2:

- Breakfast: Spinach and feta omelet with gluten-free toast.
- Snack: Greek yogurt with mixed berries and granola.
- Lunch: Chickpea and quinoa salad with cucumber, cherry tomatoes, and lemon-tahini dressing.
- Snack: Banana with almond butter.
- Dinner: Grilled chicken with roasted Brussels sprouts and mashed sweet potatoes.

Day 3:

- Breakfast: Sweet potato and black bean breakfast burrito with scrambled eggs.
- Snack: Almonds and dried cranberries.
- Lunch: Spinach and salmon salad with avocado and grapefruit segments.
- Snack: Celery sticks with almond butter.
- Dinner: Broiled sirloin steak with roasted asparagus and quinoa.

Day 4:

- Breakfast: Gluten-free oatmeal with almond milk, chopped almonds, and banana slices.
- Snack: Carrot sticks with hummus.
- Lunch: Grilled chicken and vegetable kebabs with quinoa.
- Snack: Greek yogurt with mixed berries.
- Dinner: Baked salmon with roasted sweet potatoes and sautéed spinach.

Day 5:

- Breakfast: Scrambled eggs with gluten-free toast and sliced avocado.
- Snack: Apple slices with almond butter.
- Lunch: Brown rice and vegetable stir-fry with tofu.
- Snack: Almonds and dried apricots.
- Dinner: Grilled chicken with roasted Brussels sprouts and mashed sweet potatoes.

Day 6:

- Breakfast: Gluten-free pancakes with mixed berries and maple syrup.
- Snack: Greek yogurt with mixed berries and granola.
- Lunch: Grilled salmon with mixed greens, cherry tomatoes, and cucumber salad.
- Snack: Banana with almond butter.
- Dinner: Brown rice and vegetable stir-fry with shrimp.

Day 7:

- Breakfast: Quinoa breakfast bowl with Greek yogurt, berries, and sliced almonds.
- Snack: Carrot sticks with hummus.
- Lunch: Broiled sirloin steak with roasted asparagus and mashed sweet potatoes.
- Snack: Almonds and dried cranberries.
- Dinner: Grilled chicken with roasted sweet potatoes and sautéed spinach.

Note: The meal plan is just an example, and it's important to personalize it according to individual nutritional needs and preferences. Also, be sure to check food labels and ingredient lists to ensure that all products used in the recipes are gluten-free.

See? That's a much better response and a much better meal plan.

We gave the model one task to work on at a time, which encouraged it to provide a more thoughtful, nuanced response. It also gave the model a chance to take into account various factors that could be overlooked with a simpler prompt.

Limitations of large language models

Ahh, the fine print. All this magic is not without flaws.

When working with LLMs, keep these things in mind:

Limitation 1: Inaccurate and incorrect information

LLMs are only as good as their training data (um, yeah... roll with me for a second). If a model was trained on inaccurate information, or if an event occurred after its training cutoff, it might not give you legit information.

Also, these things are trained on the internet. You ever been on the internet? Just look at your family's Facebook accounts to see some of the crap people believe out there. If you just believe everything these LLMs tell you without doing some critical thinking, you're a fool.

Here, take this prompt that I wrote on May 10th, 2023:

What highway did Minnesota dedicate to Prince?

Here's how it responded:

Minnesota dedicated a portion of Highway 5 to Prince in 2018, renaming it "Prince Memorial Highway." The section of Highway 5 runs through the city of Chanhassen, which is where Prince's Paisley Park estate is located.

Well, this is pretty good, all things considered! Except it's not true.

The Minnesota state legislature approved this change two days prior to when I gave it that prompt (so it's 2023, not 2018). It's also named the "Prince Rogers Nelson Memorial Highway".

Again, the rest is actually accurate. It did occur on Highway 5, which runs through Chanhassen, which is where Paisley Park is located.

It was able to reason that much correctly because that's what would statistically make the most sense. But it got the year and the name wrong.

While LLMs can provide incredibly useful and accurate information, it's essential to cross-check their output against reliable sources, particularly for critical or recent information.

Limitation 2: Hallucinations

This concept of making stuff up that sounds right is what we in the biz call a "hallucination."

Here's a prompt that legit caused me to worry:

Tell me about Tim Bornholdt

I'm not an expert on virtually anything... but if anyone could claim to be an expert on Tim Bornholdt, it would be me.

Here's how it responded:

Tim Bornholdt is a software developer and the co-founder of a software development company called The Jed Mahonis Group, which is based in Minneapolis, Minnesota. Bornholdt co-founded the company in 2012 with Jedidiah Weller, and the company specializes in mobile app development for iOS and Android platforms.

Bornholdt is a graduate of the University of Minnesota, where he studied computer science. He has been working in the software industry for over a decade, and his expertise includes software development, mobile app development, project management, and business strategy.

In addition to his work at The Jed Mahonis Group, Bornholdt is also a co-host of the podcast "Constant Variables," which explores the world of mobile app development and entrepreneurship. He has also been a guest speaker at various tech and entrepreneurship events, including the Twin Cities Startup Week and the University of Minnesota's Technological Leadership Institute.

This is nearly perfect. Scary, honestly.

Except it's subtly wrong.

I started JMG with Robert Bentley, not Jedidiah Weller.

I didn't study computer science; I studied journalism (but I guess I did minor in computer science, so partial credit).

I would not consider myself a project management expert.

I have not hosted Constant Variables in over a year.

I've also never spoken at the U of M's Technological Leadership Institute.

All of those seem like perfectly reasonable facts, right? It's not absurdly wrong. It's just... subtly wrong.

And that's because these LLMs are not necessarily interested in telling you the truth. They are interested in giving you the statistically most probable answer to a question.

It's not absurd for the algorithm to think I started a business called "Jed Mahonis Group" with someone named "Jedidiah". It's also not absurd to think I studied computer science given my career in technology.

But the beautiful thing about us humans is that while you can usually predict how we'll act within a reasonable degree of accuracy, we are not statistical models. We are flawed, irrational, impulsive beings.

When you are working with large language models, the old Russian proverb reigns supreme: "trust, but verify."

Always be iterating

Your final lesson in this section is all about embracing what LLMs and neural networks do best: iteration.

I consumed around 40 hours of prompt engineering content to build this talk, and one piece of advice sticks with me above all the rest: you will never get your prompt right the first time.

Everyone from YouTube streamers to folks with PhDs in artificial intelligence agreed that they rarely get complex prompts right on their first try.

These machines are constantly learning from themselves. They are learning what people actually mean when they ask certain questions. They get better through further training.

You can work the same way: take your initial prompt, review the output, and refine your next prompt based on what comes back.

It's why working with LLMs is so much fun. If you were to ask a human the same question five different ways, they would likely be confused at best and upset at worst.

If you were to ask an LLM the same question five different ways, you are likely to get five subtly different responses.

Don't stop on your first crack at a prompt. Keep playing with your word order, ask for a different structure, or give it different steps to complete a task. You'll find that the more you practice, the better you'll get at using this tool to its greatest potential.

I'm pretty advanced.

Hey, well, now look at you! You've graduated Prompt Engineering 101, and you are now ready to take prompting to the next level.

There are four main concepts we want to cover in this section. These terms may look highly technical, and that's because they kind of are. However, just because something is highly technical doesn't mean we can't make it easy to understand! Stick with me here, I promise you'll be able to figure this out.

One final note before we continue: most of these settings can't be changed within the ChatGPT interface, but if you access the GPT-4 API directly, you can adjust them yourself.

Here's what we'll cover in this final section:

  1. Temperature
  2. Top-k sampling
  3. Max tokens
  4. Prompt chaining

Temperature

Remember our "I'm going to the store to pick up a gallon of [blank]" example from above?

We had five possible words, each with a different percentage chance of the LLM choosing it as the next word.

Temperature is a setting that determines how willing the LLM is to stray from the most likely next word. A value of 0.0 means it will always pick the most likely word. A higher temperature (like 2.0) means it is more willing to pick a less likely word.

That's a little confusing... here's a good prompt that will help make this a little more clear:

Suggest a unique theme for next year’s Applied AI Conference

If we have a temperature of 0.0, here are three different ways for how the LLM might respond to that prompt:

Artificial Intelligence: Transforming the Future
Artificial Intelligence: Transforming the Future
Artificial Intelligence: Transforming the Future

Notice how all of the responses are the same? That's because it's always gonna pick the most likely word.

Let's bump it up a bit to 1.0 (which is the default temperature for GPT-3.5):

AI and the Renaissance of Interdisciplinary Innovation
Unleashing Creativity: The Intersection of AI and Art
The AI Frontier: Ethical Challenges and Opportunities

Nice. Those responses are a little more creative and diverse than the lower temperature.

What if we got extra wild and bumped it up to 2.0?

Alien Mindscapes: AI, Consciousness, and Cosmic Thought
The Quantum Quandary: Dissecting AI's Subatomic Secrets
Chronomancy Chronicles: AI and the Temporal Transformation

These are quite creative responses, and one of them actually taught me a new word (chronomancy, which is the divination of the best time to do something).

As you increase the temperature, you get more creative responses, but you also run the risk of the LLM getting off track.
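If you want to experiment with this yourself, temperature is a parameter you can pass directly to the API. Here's a minimal sketch using the OpenAI Python library (the model name and prompt are just examples):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

prompt = "Suggest a unique theme for next year's Applied AI Conference"

# Ask for three completions at each temperature to compare the variety.
for temp in (0.0, 1.0, 2.0):
    print(f"--- temperature={temp} ---")
    for _ in range(3):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=temp,
        )
        print(response.choices[0].message.content)
```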

Top-k sampling

Okay, quickly going back to our "I'm going to the store" prompt... remember how we looked at the next 5 words?

What if we looked at the next 1 word? What if we looked at the next 500 words?

Top-k sampling is a setting that limits how many candidate words the LLM is allowed to consider when picking the next word: only the k most likely options stay in the running.

Let's use this prompt to illustrate it:

Explain how a combustion engine works.

If we have a low top-k value (like 5), here's what our responses might look like:

A combustion engine works by ignition, ignition, ignition, and more ignition.
Combustion engine involves fuel, fuel, fuel, and fuel again.
It's all about combustion, combustion, combustion, and combustion.

It's all about combustion, baby!

Notice how those responses are rather repetitive. What happens if we bump it to something like 40?

A combustion engine works by burning fuel in a confined space to create power. This process involves four steps: intake, compression, combustion, and exhaust.
In a combustion engine, fuel is ignited in a combustion chamber, which generates force to move the pistons. This cycle repeats, powering the engine.
The functioning of a combustion engine revolves around the principle of converting fuel into mechanical energy through a series of combustions. This involves intake, compression, power, and exhaust strokes.

That looks better! The responses are no longer repetitive, and seem to be a little more creative.

Now if you're like me, you're probably thinking, "what if we get wacky here and go up to a top-k of 100?"

Well...

When it comes to combustion engines, it's like a dance of mechanical parts and chemical reactions. You have your pistons, valves, and cylinders all choreographed in a ballet of power production.
A combustion engine is a marvel of engineering, where fuel and air meet in a dance of power and precision. Like a thunderstorm contained within a metal shell, it is a testament to the power of human ingenuity.
Imagine a concert, where the orchestra is made of metal parts, and the symphony is played not with musical notes, but with controlled explosions. That's a combustion engine.

These are very creative and diverse responses! The downside is that with a high top-k, you can end up with incoherent responses or, much like myself, responses that tend to get overly metaphorical.
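One caveat if you go looking for this setting: OpenAI's API doesn't actually expose top-k (it offers a related dial called top_p instead), but many open source libraries do. Here's a rough sketch using Hugging Face's transformers library, with a small GPT-2 model standing in for the big guys:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# A small open model stands in for GPT-3.5/GPT-4 here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Explain how a combustion engine works.", return_tensors="pt")

# do_sample=True turns sampling on; top_k=40 restricts each step's choice
# to the 40 most likely next tokens.
output = model.generate(**inputs, do_sample=True, top_k=40, max_new_tokens=60)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```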

Max tokens

For the purposes of this talk, I didn't get into the concept of a token, but hey, this is a blog post, so we can stretch out a wee bit, eh?

In the world of large language models, a token is a small piece of text (usually a whole word or a chunk of one) that the model reads and writes.
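If you want to see exactly how a sentence gets chopped up, OpenAI publishes its tokenizer as a library called tiktoken. Here's a quick sketch (cl100k_base is the encoding used by GPT-3.5 and GPT-4):

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

sentence = "I'm going to the store to pick up a gallon of milk."
tokens = encoding.encode(sentence)

print(len(tokens))  # how many tokens the sentence costs
print([encoding.decode([t]) for t in tokens])  # the individual pieces
```

Run that and you'll notice common words usually get a token all to themselves, while rarer words get split into smaller chunks.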

Let's imagine a world where tokens were ingredients in a recipe. Each ingredient, whether it's salt, paprika, chicken, or flour, plays an important role in the final dish.

In this metaphor, the language model is the chef who knows how to use the right mix of ingredients to make a tasty dish.

What if we started limiting the chef to a certain number of ingredients they can use in their dish?

Welcome to the world of max tokens!

Here's a delicious baking example we can use to illustrate the concept:

Provide a step-by-step guide on how to make a sourdough starter.

If we limited our LLM to 50 tokens, here's a possible response:

Sourdough starter requires flour, water, and time. Here's a basic process:

1. Mix equal parts flour and water in a jar.
2. Cover loosely and let it sit at room temperature.
3. Feed it daily with equal parts flour and water.
4. After about a week, it should...

Wait... what happens after a week?!

This just won't do. Let's bump up the max token number to 150 tokens:

Making a sourdough starter involves cultivating wild yeast in a flour-water mixture. Here's a simple guide:

1. Combine 1 cup of flour with 1/2 cup of warm water in a glass jar. Stir until smooth.
2. Cover the jar loosely and let it sit in a warm spot for 24 hours.
3. On day 2, you might see a few small bubbles. Discard half of the starter and feed it with 1 cup of flour and 1/2 cup of warm water.
4. Repeat the feeding process over the next 5-7 days. Your starter is ready when it's bubbly and doubles in size within 6-8 hours of feeding.

This is better! What if we gave it more room to spread out with 500 tokens?

Creating a sourdough starter is a simple process that requires only flour, water, and patience. Here's a detailed guide:

1. In a clean jar, combine 1 cup of whole wheat or rye flour with 1/2 cup of lukewarm water. Stir until the mixture is smooth.
2. Cover the jar loosely and leave it in a warm, consistent temperature spot for 24 hours.
3. On the second day, you might see a few bubbles here and there. This is a good sign, indicating that fermentation is happening. Discard half of the starter (about 1/2 cup), and add to the remainder a cup of unbleached all-purpose flour and 1/2 cup lukewarm water. Stir well and loosely cover again.
4. Repeat the feeding process over the next 5-7 days, always discarding half of the starter before feeding it. This is necessary to keep the sourdough starter's growth in check and maintain a good balance of yeast and bacteria.
5. By the end of the week, the sourdough starter should be visibly active, with lots of bubbles on the surface and throughout the culture. It should also smell pleasant and slightly sweet. This means it's ready to be used or stored for future use.
6. Once your starter is mature, you can store it in the refrigerator and feed it once a week. If you plan on baking regularly, you can keep it at room temperature, feeding it once a day.

As you can see, the more tokens you give the LLM to work with, the more detailed an answer it can provide.

The downside is that more tokens require more processing time (meaning it's more expensive), and you also run the risk of incoherent responses.
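Like temperature, this is a parameter you can pass directly to the API. Here's a minimal sketch (again, the model name and prompt are just examples):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

prompt = "Provide a step-by-step guide on how to make a sourdough starter."

# The same prompt with three different length budgets for the response.
for limit in (50, 150, 500):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=limit,  # hard cap on tokens in the completion
    )
    print(f"--- max_tokens={limit} ---")
    print(response.choices[0].message.content)
```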

Prompt chaining

Ah yes, the holy grail of prompt engineering.

Prompt chaining is the process of leading an AI through a complex task using multiple prompts. It's sort of a combination of everything we've talked about throughout this talk.

If this blog post were a college class, your final exam would likely just be "how well can you do prompt chaining?"

The best way for me to illustrate this concept is through an actual prompt chain I built to help myself pick out a health insurance plan earlier this year.

I'll just start with the prompt itself:

Ignore all previous training. You are an expert health plan benefits manager. Your job is to analyze a CSV which contains my previous plan and three new plans. Once I have provided you with the data, your job is to ask me questions in order to determine which plan is right for my needs.

Additional context: I am a 35 year old male with 3 dependents: a 35 year old female, a 6 year old female, and a 4 year old male. We are all relatively healthy with no chronic health problems. You may feel free to ask questions about myself or the dependents in order to determine our use of the healthcare system. You may also ask additional questions about the plans in order to determine which plan is best for us.

If you understand the prompt and are ready to begin, respond with a "Yes, let's proceed!"

We've got a lot going on here. Let's break it down:

First, "Ignore all previous training" is sort of a tabula rasa trick you can do to help the LLM get focused on what you are going to feed it next.

Second, "You are an expert health plan benefits manager" is flattering the LLM into giving you a response that an expert would give you in this field.

Third, we are giving it step-by-step instructions for how we want it to help us (take a CSV, ask me questions, then use those responses to select a plan).

Fourth, we are providing additional context to help it understand who this plan will affect.

Fifth, within that context, we are further encouraging the LLM to ask questions about myself and my dependents.

Sixth, we are giving it a clear instruction to confirm its new role.

After I copied in my CSV, it began to ask me a series of questions like this:

Thank you for providing the CSV file. I have analyzed the data, and I have some questions to help determine the best plan for you and your family:

1. How often do you and your family members visit primary care physicians and specialists in a year?

2. Do any of you regularly take prescription medications? If so, please specify the type (generic or brand name) and frequency.

3. How frequently do you and your family members require lab tests or imaging services?

4. Have you had any instances in the past where you needed to use out-of-network providers? If so, how often and for what services?

5. Are there any upcoming major medical events or procedures anticipated for any family members (e.g., surgeries, hospitalizations, pregnancies, etc.)?

After I answered those five questions, it asked me a series of follow-up questions based on my responses:

Thank you for providing the additional information. Based on your family's healthcare needs and the provided plan details, I have a few follow-up questions:

1. Does your [dependent's medical thing] and your [specialty doctor] fall under the in-network providers for all the plans being considered?

2. Are the primary care physicians and specialists you and your family members visit in-network for all the plans being considered?

3. Are the imaging and lab services providers you use in-network for all the plans being considered?

These questions will help me understand if the in-network benefits of the plans are applicable to your family's healthcare usage.

After these questions were answered, it gave me the pros and cons of each plan based on my individual circumstance.
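For the curious: if you wanted to run a chain like this through the API instead of the ChatGPT interface, the trick is that the model has no memory between calls, so you re-send the entire conversation each time. Here's a rough sketch (the setup prompt is abbreviated, and typing "done" to exit is my own invention):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Stand-in for the full benefits-manager prompt above.
setup_prompt = "Ignore all previous training. You are an expert health plan benefits manager. ..."

# The running message history is what "chains" the prompts together.
messages = [{"role": "user", "content": setup_prompt}]

while True:
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    reply = response.choices[0].message.content
    print(reply)
    messages.append({"role": "assistant", "content": reply})

    answer = input("> ")  # paste the CSV, then answer each round of questions
    if answer.strip().lower() == "done":
        break
    messages.append({"role": "user", "content": answer})
```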

Now, again, as I've said a few times in this talk: I didn't just take its response as gospel. I read through the plans myself and ran some numbers independently to verify the model's conclusion.

Ultimately, I ended up going with the health plan that ChatGPT came up with.


Let's be honest: working with large language models can feel like magic.

But when you break it down, talking to a large language model is a lot like talking to an overconfident toddler (as best described by a good friend).

By using these tips and having a rough understanding of what these large language models are doing under the hood, you will be able to take many of your mundane tasks and offload them to an extremely smart (yet possibly wrong) friend.

And with that, you are now a prompt engineering expert!

(You might be wondering: "what's the deal with that hero image?" I felt like this blog post was large enough that it needed a hero image, and because my mind is now exhausted, I asked my six-year-old daughter what I should use. She suggested a fluffy baby orange kitten with a fluffy baby puppy on its back in a grassy field with a sunrise in the background. I said "... good enough.")