Spoiler Alert: It's All a Hallucination
🔗 a linked post to community.aws »
LLMs treat words as referents, while humans understand words as referential. When a machine “thinks” of an apple (such as it does), it literally thinks of the word apple, and all of its verbal associations. When humans consider an apple, we may think of apples in literature, paintings, or movies (don’t trust the witch, Snow White!) — but we also recall sense-memories, emotional associations, tastes and opinions, and plenty of experiences with actual apples.
So when we write about apples, of course humans will produce different content than an LLM.
Another way of thinking about this problem is as one of translation: while humans largely derive language from the reality we inhabit (when we discover a new plant or animal, for instance, we first name it), LLMs derive their reality from our language. Just as a translation of a translation begins to lose meaning in literature, or a recording of a recording begins to lose fidelity, LLMs’ summaries of a reality they’ve never perceived will likely never truly resonate with anyone who’s experienced that reality.
And so we return to the idea of hallucination: content generated by LLMs that is inaccurate or even nonsensical. The idea that such errors are somehow lapses in performance is on a superficial level true. But it gestures toward a larger truth we must understand if we are to understand the large language model itself — that until we solve its perception problem, everything it produces is hallucinatory, an expression of a reality it cannot itself apprehend.
This is a helpful way to frame some of the fears I’m feeling around AI.
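To make the "verbal associations" point a little more concrete for myself, here's a toy sketch in Python. None of this reflects how any real model is built, and the embeddings are numbers I invented for illustration; the point is just that in a space learned purely from text, "apple" can only ever be close to other words.

```python
# Toy illustration (not any real model): a language model's "apple" is just a
# point in a vector space, and its closest associations are other words --
# never a taste, a smell, or a memory of an actual apple.
from math import sqrt

# Hypothetical 3-dimensional word embeddings, made up for this sketch.
embeddings = {
    "apple":  [0.9, 0.1, 0.3],
    "orange": [0.8, 0.2, 0.35],
    "pie":    [0.7, 0.0, 0.5],
    "witch":  [0.2, 0.9, 0.4],   # don't trust her, Snow White
    "laptop": [0.1, 0.3, 0.9],
}

def cosine(a, b):
    """Cosine similarity: the only notion of 'meaning' available here."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# "Thinking of an apple" reduces to ranking every other word by similarity.
query = embeddings["apple"]
neighbors = sorted(
    ((word, cosine(query, vec)) for word, vec in embeddings.items() if word != "apple"),
    key=lambda pair: pair[1],
    reverse=True,
)
for word, score in neighbors:
    print(f"{word}: {score:.3f}")
```

Run it and the nearest neighbors to "apple" come out as "orange" and "pie": exactly the kind of association an LLM has, and exactly the kind that a bite of an actual apple is not.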
By the way, this came from a new newsletter called VectorVerse that my pal Jenna Pederson launched recently with David Priest. You should give it a read and consider subscribing if you’re into these sorts of AI topics!