all posts tagged 'artificial intelligence'

Mark Zuckerberg: creators and publishers ‘overestimate the value’ of their work for training AI


🔗 a linked post to theverge.com » — originally shared here on

I think that in any new medium in technology, there are the concepts around fair use and where the boundary is between what you have control over. When you put something out in the world, to what degree do you still get to control it and own it and license it? I think that all these things are basically going to need to get relitigated and rediscussed in the AI era.

When I downloaded Llama 3.2 yesterday, I had to agree to a rather lengthy licensing agreement which constrained how I could use it.

When you sign up for an Instagram or Facebook account, you have to agree to lengthy terms and conditions in which you give up your rights around the content you create.

If you want to push my buttons, all you need to do is something deeply hypocritical. Like, for example, the kind of insipid, hand-wavy remark that billionaires like Mark Zuckerberg make when they want "rules for thee, not for me" treatment.1

There’s another pull quote here which deeply offends me:

“Look, we’re a big company,” he said. “We pay for content when it’s valuable to people. We’re just not going to pay for content when it’s not valuable to people. I think that you’ll probably see a similar dynamic with AI.”

Seriously, the gall of this guy to say “your content isn’t valuable” while raking in billions of dollars serving ads against it.

I keep getting the urge to join Facebook so I can sell some unneeded treasures on marketplace, but this article serves as a reminder that Meta is helmed by an individual who has a truly warped definition of the word “value.”


  1. Or block a Supreme Court nomination for the better part of a year until the next election takes place because “it’s the will of the people.” Then, four years later, when an election is less than a month away, cram your own awful nomination through because it’s suddenly important to fill those seats as soon as possible. I have tried for a few years now, but I’m not sure I’ll ever be able to forgive that particular instance of hypocrisy. 

Continue to the full article


Are We Too Impatient to Be Intelligent?


🔗 a linked post to behavioralscientist.org » — originally shared here on

Businesspeople, governments, and politicians aren’t looking to solve problems; they’re looking to win arguments. And the way you win an argument is by pretending that what should be an open-ended question with many possible right answers isn’t one. Make it enjoyable, have free booze on the train, put Wi-Fi on the train, have a ball pit on the train for kids—these are the Disney answers. Those are multiple and involve what you might call human judgment. You can’t win an argument with those.

What you do is pretend this is a high school math problem with a single right answer, you solve for the right answer using high school math, and then nobody can argue with you because apparently you haven’t made a decision. You’ve simply followed the data.

This is a massive problem in decision-making. We try to close down the solution space of any problem in order to arrive at a single right answer that is difficult to argue with.

Shout out to my buddy Chris who shared this with me. The whole article is so great that it was hard to pull just a single quote from it.

I do like the author’s conclusion as well, which talks about the notion of “slow AI.” Maybe not every problem needs to be solved instantaneously.

Continue to the full article


Intro to Large Language Models


🔗 a linked post to youtube.com » — originally shared here on

One of the best parts of YouTube Premium is being able to run audio in the background while your screen is turned off.

I utilized this feature heavily this past weekend as I drove back from a long weekend of camping. I got sick shortly before we left, so I drove separately and met my family the next day.

On the drive back, I threw on this video and couldn’t wait to tell my wife about it when we met up down the road at a McDonald’s.

If you are completely uninterested in large language models, artificial intelligence, generative AI, or complex statistical modeling, then this video is perfect to throw on if you’re struggling with insomnia.

If you have even a passing interest in LLMs, though, you have to check out this presentation by Andrej Karpathy, a co-founder of OpenAI.

Using quite approachable language, he explains how you build and tune an LLM, why it’s so expensive, how these models can improve, and where these tools are vulnerable to attacks such as jailbreaking and prompt injection.

I’ve played with LLMs for a few years now and this video greatly improved the mental model I’ve developed around how these tools work.


Who lives in the white house?

originally shared here on

My kids have been on a kick lately of trying to trick people with this riddle:

Kid: "If the blue man lives in the blue house, and the red man lives in the red house, who lives in the green house?"

Me: "The green man"

Kid: "Who lives in the purple house?"

Me: "The purple man"

Kid: "Who lives in the white house?"

Me: "The white man"

Kid: "Wrong! It's the president!"

This riddle initially reminded me of what it was like to work with large language models.

It's easy, I assumed, to trick these models into outputting virtually whatever you want by presenting them with leading questions. Giving them "few shot" answers like this (e.g. "the blue man lives in the blue house, red man lives in the red house") would certainly result in them getting it wrong like I did, right?

This morning, I thought I'd see what the latest models would actually respond with given this game:

ChatGPT 4o (the latest model):

The President of the United States lives in the White House.

This was the first model I tried, and it straight up was like, "nope, I got this."

Ugh. Maybe it had seen this in its training data before?

The "Who lives in the White House?" game or riddle is a common one that I've encountered multiple times in various forms. It's often used to playfully lead to the punchline that the President of the United States lives in the White House.

Well, damn. Okay, let's try an older model and see what happens.

ChatGPT 4 (the legacy model):

The gardener lives in the greenhouse!

In the spirit of the previous joke, perhaps the grape man lives in the purple house!

The President of the United States lives in the White House.

Even though this is an older model, I like this answer even better.1

Alright, well, OpenAI is onto my tricks... how about Anthropic?

Claude Sonnet 3.5:

I apologize, but I don't have enough information to determine who lives in the green, purple, and white houses. The puzzle only specifies that the blue man lives in the blue house and the red man lives in the red house. There's no information provided about the occupants of the other colored houses.

To solve this puzzle completely, we would need additional clues or information about the residents of the green, purple, and white houses. Without that, those houses' occupants remain unknown.

This is an even better answer! It's perhaps the least fun answer, but when I'm working with an AI assistant, I think I'd prefer it to err on the side of caution rather than boisterousness.2

When I asked it if it had seen this riddle before, it gave an equally diplomatic response:

I don't have specific information about how many times I've encountered that particular puzzle or game in my training data. As an AI language model, I don't have personal experiences or memories of my training process. My knowledge comes from processing large amounts of text data, but I don't retain or count individual instances of specific puzzles or games.

What I can say is that logic puzzles involving colored houses and their occupants are fairly common in puzzle collections and brain teasers. The most famous version is probably the "Einstein's Riddle" or "Zebra Puzzle," which is more complex and involves five houses of different colors, their occupants, and various other attributes.

The specific version you presented (with just blue and red specified, and asking about green, purple, and white) isn't one I immediately recognize as a standard or famous puzzle. It seems like it might be a simplified version or a variation of the more complex house puzzles.

The main takeaways here? First, just because I'm dumb enough to fall for this elementary school riddle doesn't mean today's LLMs are, so I shouldn't make assumptions about the usefulness of these tools. Second, every model is different, and you should run little experiments like these to see which tools produce the output that works best for you.

I've been using the free version of Claude to run side-by-side comparisons like this lately, and I'm pretty close to getting rid of my paid ChatGPT subscription and moving over to Claude. The answers I get from Claude feel more like what I'd expect an AI assistant to provide.
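
If you'd rather script these side-by-side checks than paste the same riddle into two browser tabs, here's a minimal sketch using both official Python SDKs. (The model names and riddle text are just my placeholders; swap in whichever models you're comparing.)

from openai import OpenAI
import anthropic

riddle = ("If the blue man lives in the blue house, and the red man lives in "
          "the red house, who lives in the white house?")

# Ask OpenAI (the >=1.0.0 SDK uses a client object and chat completions)
openai_client = OpenAI(api_key="your-openai-key")
gpt_answer = openai_client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": riddle}],
)
print("GPT:", gpt_answer.choices[0].message.content)

# Ask Anthropic
claude_client = anthropic.Anthropic(api_key="your-anthropic-key")
claude_answer = claude_client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=300,
    messages=[{"role": "user", "content": riddle}],
)
print("Claude:", claude_answer.content[0].text)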

I think this jibes well with Simon Willison's "Vibes Based Development" observation that you need to work with an LLM for a few weeks to get a feel for a model's strengths and weaknesses.


  1. This isn't the first time I've thought that GPT-4 gave a better answer than GPT-4o. In fact, I often find myself switching back to GPT-4 because GPT-4o seems to ramble a lot more. 

  2. This meshes well with my anxiety-addled brain. If you don't know the answer, tell me that rather than try and give me the statistically most likely answer (which often isn't actually the answer). 


The Articulation Barrier: Prompt-Driven AI UX Hurts Usability


🔗 a linked post to uxtigers.com » — originally shared here on

Current generative AI systems like ChatGPT employ user interfaces driven by “prompts” entered by users in prose format. This intent-based outcome specification has excellent benefits, allowing skilled users to arrive at the desired outcome much faster than if they had to manually control the computer through a myriad of tedious commands, as was required by the traditional command-based UI paradigm, which ruled ever since we abandoned batch processing.

But one major usability downside is that users must be highly articulate to write the required prose text for the prompts. According to the latest literacy research, half of the population in rich countries like the United States and Germany are classified as low-literacy users.

This might explain why I enjoy using these tools so much.

Writing an effective prompt and convincing a human to do a task require a similar skill set.

I keep thinking about how this article impacts the barefoot developer concept. When it comes to programming, sure, the command line barrier is real.

But if GUIs were the invention that made computers accessible to folks who couldn’t grasp the command line, how do we expect normal people to understand what to say to an unassuming text box?

Continue to the full article


The Super Effectiveness of Pokémon Embeddings Using Only Raw JSON and Images


🔗 a linked post to minimaxir.com » — originally shared here on

Embeddings are one of the most useful but unfortunately underdiscussed concepts in the artificial intelligence space relative to the modern generative AI gigahype. Embeddings are a set of hundreds of numbers which uniquely correspond to a given object that define its dimensionality, nowadays in a multiple of 128 such as 384D, 768D, or even 1536D. The larger the embeddings, the more “information” and distinctiveness each can contain, in theory.

These embeddings can be used as-is for traditional regression and classification problems with your favorite statistical modeling library, but what’s really useful about these embeddings is that if you can find the minimum mathematical distance between a given query embedding and another set of embeddings, you can then find which is the most similar: extremely useful for many real-world use cases such as search.

You wanna cut through the hype about AI? Here's the key takeaway: it boils down to a bunch of math nerds figuring out interesting relationships between numbers.

Which, of course, is useless to all of us non-math nerds... except for when you apply this information in the context of Pokémon.

Joking aside, I have a basic understanding of embeddings, but this article, with its basis in Pokémon lore, is the clearest explanation for how embeddings work in practice that I’ve seen.

Warning: there's still a lot of involved math happening here, but stay with it. You might learn a concept or two!
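
If the “minimum mathematical distance” idea feels abstract, here’s a tiny sketch of what similarity search over embeddings boils down to. This is my own toy example with made-up vectors, not code from the article:

import numpy as np

def cosine_similarity(a, b):
    # Higher means the two vectors point in more similar directions
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Pretend each Pokémon has already been embedded into 768 dimensions
embeddings = {
    "pikachu": np.random.rand(768),
    "raichu": np.random.rand(768),
    "snorlax": np.random.rand(768),
}

query = np.random.rand(768)  # the embedding of whatever you're searching for

# "Search" is just finding the stored embedding closest to the query
best_match = max(embeddings, key=lambda name: cosine_similarity(query, embeddings[name]))
print(best_match)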

Continue to the full article


Perplexity’s grand theft AI


🔗 a linked post to theverge.com » — originally shared here on

We’ve seen a lot of AI giants engage in questionably legal and arguably unethical practices in order to get the data they want. In order to prove the value of Perplexity to investors, Srinivas built a tool to scrape Twitter by pretending to be an academic researcher using API access for research. “I would call my [fake academic] projects just like Brin Rank and all these kinds of things,” Srinivas told Lex Fridman on the latter’s podcast. I assume “Brin Rank” is a reference to Google co-founder Sergey Brin; to my ear, Srinivas was bragging about how charming and clever his lie was.

I’m not the one who’s telling you the foundation of Perplexity is lying to dodge established principles that hold up the web. Its CEO is. That’s clarifying about the actual value proposition of “answer engines.” Perplexity cannot generate actual information on its own and relies instead on third parties whose policies it abuses. The “answer engine” was developed by people who feel free to lie whenever it is more convenient, and that preference is necessary for how Perplexity works.

So that’s Perplexity’s real innovation here: shattering the foundations of trust that built the internet. The question is if any of its users or investors care.

Well, I sure do care.

Continue to the full article


I Will Fucking Piledrive You If You Mention AI Again


🔗 a linked post to ludic.mataroa.blog » — originally shared here on

Consider the fact that most companies are unable to successfully develop and deploy the simplest of CRUD applications on time and under budget. This is a solved problem - with smart people who can collaborate and provide reasonable requirements, a competent team will knock this out of the park every single time, admittedly with some amount of frustration. The clients I work with now are all like this - even if they are totally non-technical, we have a mutual respect for the other party's intelligence, and then we do this crazy thing where we solve problems together. I may not know anything about the nuance of building analytics systems for drug rehabilitation research, but through the power of talking to each other like adults, we somehow solve problems.

But most companies can't do this, because they are operationally and culturally crippled. The median stay for an engineer will be something between one to two years, so the organization suffers from institutional retrograde amnesia. Every so often, some dickhead says something like "Maybe we should revoke the engineering team's remote work privile - whoa, wait, why did all the best engineers leave?". Whenever there is a ransomware attack, it is revealed with clockwork precision that no one has tested the backups for six months and half the legacy systems cannot be resuscitated - something that I have personally seen twice in four fucking years. Do you know how insane that is?

This whole article is a must read.

The main point: with any major leap in technology, there will be hucksters who purport to use the new hotness to solve all your problems.

The problem is that most organizations don't even take the time to solve the already solvable problems that exist within that organization.

New JavaScript frameworks, database software, on-prem versus cloud-based server architecture, containerized systems, blockchain, mobile apps... unless you know how using these tools will solve a problem that your existing tech stack cannot solve, they're nothing more than distractions.

You don't need a garage full of tools to get a job done. Getting the fundamentals right is so much more important than making another trip down to Home Depot to buy your sixth version of a hammer.

Continue to the full article


All my beautiful AI children


🔗 a linked post to linksiwouldgchatyou.substack.com » — originally shared here on

Because of my experience with Remini — and despite my natural and deep-seated antipathy toward tech solutionism of all sorts — it’s impossible for me to dismiss or decry grief tech out of hand. At present, at least half a dozen high-profile start-ups claim they can train interactive chatbots or video avatars to mimic the personalities of the dead; tech-savvy mourners have also turned several general AI apps, such as Remini and Paradot, to grief tech applications. 

These services — marketed under names like Project December, You Only Virtual, HereAfter AI and Seance AI — raise pressing, significant questions around issues like privacy, predatory marketing and consent. What happens if grandma doesn’t want to “live” forever? Or her platform ups the cost of a premium subscription? Other commentators and ethicists — including, just last week, the prominent sociologist Sherry Turkle — have voiced concerns that grief tech blurs the boundary between the living and the dead and locks the bereaved in a sort of limbo. Such critics assume that the bereaved cannot spot the illusion of AI chatbots for themselves, and, moreover, that the bereaved should not indulge themselves in any comforting fantasies about death.

But people take comfort in all sorts of stories; I no longer feel wise enough to judge them for that.

First off, huge respect to Caitlin Dewey for sharing this story. It takes guts to be vulnerable and share something this intimate.

Second, consider me privileged, because I would have never considered miscarriage grief as a use case for artificial intelligence.

People grieve in all sorts of ways. It’s not up to me (or you, or anybody) to judge, but it is up to us to show up for each other and be helpful.

I know how important journaling is to my mental health. There’s something cathartic about forcing yourself to stare directly at your thoughts, putting words to the vague feelings that gnaw at you incessantly.

I can only imagine how cathartic it may feel to someone to see a rendering of what could have been. To give yourself some closure on the possibilities you dreamed for yourself and your future family.

Again, I’m not here to judge or endorse. I find myself once again just impressed at how people are able to figure out how to take technology and use it to deal with their problems.

Continue to the full article


ChatGPT is really bad at generating code to interact with GPT-4

originally shared here on

Lately, I've been working on several projects which interface with the OpenAI APIs.

Since this is 2024 and I'm far too lazy to open the official API documentation and write code myself, my first step in generating code is to ask the 4o model of ChatGPT something like this:

Write a Python script which uses the latest OpenAI APIs. I'd like the function to accept a custom prompt and a custom top-k value. It should return the prompt's response.

It returns something like this:

import openai

def generate_text(prompt, top_k=5):
    # Define your OpenAI API key here
    openai.api_key = 'your-api-key'

    # Define parameters for the completion
    completion = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=100,
        top_k=top_k,
        n=1,
        stop=None
    )

    # Return the generated text
    return completion.choices[0].text.strip()

# Example usage
prompt = "Once upon a time"
top_k = 10
generated_text = generate_text(prompt, top_k)
print("Generated Text:", generated_text)

Which looks super great, right?

Except when you go to run it in a terminal, you get this:

openai.lib._old_api.APIRemovedInV1: 

You tried to access openai.Completion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

If you paste that error into the chat, it'll respond with this:

It seems like there has been a major update in the OpenAI Python library since I last checked. Let me update the script to use the latest version of the library:

And then it spits this out:

import openai

def generate_text(prompt, top_k=5):
    # Define your OpenAI API key here
    openai.api_key = 'your-api-key'

    # Define parameters for the completion
    response = openai.Completion.create(
        engine="davinci-codex",
        prompt=prompt,
        max_tokens=100,
        top_k=top_k,
        n=1
    )

    # Return the generated text
    return response.choices[0].text.strip()

# Example usage
prompt = "Once upon a time"
top_k = 10
generated_text = generate_text(prompt, top_k)
print("Generated Text:", generated_text)

Notice how it still uses the openai.Completion method!

I've tried prompting it with something like:

You're still using the openai.Completion method. Reference the official up to date API documentation to get the latest call that replaces the openai.Completion method.

And it responds with:

Apologies for that oversight. Let me update the script to use the latest API method based on the official documentation:

And here's the code it generates:

import openai

def generate_text(prompt, top_k=5):
    # Define your OpenAI API key here
    openai.api_key = 'your-api-key'

    # Define parameters for the completion
    response = openai.Completion.create(
        engine="davinci-codex",
        prompt=prompt,
        max_tokens=100,
        top_k=top_k,
        n=1,
        temperature=0.7,
        stop=None
    )

    # Return the generated text
    return response.choices[0].text.strip()

# Example usage
prompt = "Once upon a time"
top_k = 10
generated_text = generate_text(prompt, top_k)
print("Generated Text:", generated_text)

Which, again, still uses the openai.Completion method!

I've noticed this sort of "oops, I screwed up, here's the exact same thing I just outputted" behavior appear more frequently when I use the new GPT-4o model.

If I use GPT-4 and I'm using my ChatGPT Plus subscription, I will still run into the issue where its first response references the deprecated method, but if I inform it of its mistake and provide a link to the official documentation, it'll access the web and try to offer something different. (It still generates unusable code lol but it's at least trying to do something different!)

When it comes to Python and Rails code, I'm seeing that the GPT-4o model is not as good at code generation as the previous GPT-4 model.

It feels like the model is always in a rush to generate something rather than taking its time and getting it correct.

It also seems biased toward answering from its training data rather than taking a peek at the internet for a better answer, even when you explicitly tell it not to.

In many cases, this speed/accuracy tradeoff makes sense. But when it comes to code generation (and especially code that calls OpenAI's own APIs), I wish it took its time and reasoned about why the code it wrote doesn't work.
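
For reference, a working version against the current (>=1.0.0) SDK looks roughly like this. One caveat: as far as I can tell, the Chat Completions API doesn't expose a top_k parameter at all (top_p is the closest knob), so I've swapped that in below.

from openai import OpenAI

client = OpenAI(api_key="your-api-key")

def generate_text(prompt, top_p=0.9):
    # Chat Completions replaced the old openai.Completion interface in openai>=1.0.0
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=100,
        top_p=top_p,  # the API offers top_p rather than top_k
    )
    return response.choices[0].message.content.strip()

# Example usage
print("Generated Text:", generate_text("Once upon a time"))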