What's going on in those vast oceans of GPUs that enables providers to give you a 10x discount on input tokens? What are they saving between requests? It's not a case of saving the response and reusing it if the same prompt is sent again; it's easy to verify through the API that this isn't happening. Write a prompt, send it a dozen times, and notice that you get a different response each time even when the usage section shows cached input tokens.
Not satisfied with the answers in the vendor documentation, which do a good job of explaining how to use prompt caching but sidestep the question of what is actually being cached, I decided to go deeper. I went down the rabbit hole of how LLMs work until I understood the precise data providers cache, what it's used for, and how it makes everything faster and cheaper for everyone.
After reading the Joan Westenberg article I posted yesterday, I decided I’m going to read more technical articles and focus my attention on them.
This post from the ngrok blog was very helpful in explaining how LLMs work up through the attention phase, which is where prompt caching happens.
It also sent me down a rabbit hole to remember how matrix multiplication works. I haven’t heard the phrase “dot product” since high school.
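To make it concrete for myself, I put together a toy example. This is my own numpy sketch, not code from either article: scaled dot-product attention where the key and value vectors computed for the prompt tokens are stashed and reused. Per-token keys and values like these are, as I understand it, the data providers keep between requests.

```python
import numpy as np

def attention(q, K, V):
    """Scaled dot-product attention for a single query vector q."""
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)          # one dot product per token we attend to
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax over all tokens so far
    return weights @ V                   # weighted mix of the value vectors

rng = np.random.default_rng(0)
d_model = 8

# "Prompt" phase: each prompt token gets a key and a value vector, computed once.
# (Real models derive these from learned weight matrices; random stand-ins here.)
prompt_K = rng.normal(size=(5, d_model))
prompt_V = rng.normal(size=(5, d_model))
kv_cache = (prompt_K, prompt_V)          # the reusable per-prompt state

# "Generation" phase: a new token only needs its own q/k/v plus the cached K, V.
new_q = rng.normal(size=(d_model,))
new_k = rng.normal(size=(1, d_model))
new_v = rng.normal(size=(1, d_model))

K = np.vstack([kv_cache[0], new_k])
V = np.vstack([kv_cache[1], new_v])
print(attention(new_q, K, V))            # attends over prompt tokens + the new one
```

Reusing the cached keys and values skips recomputing the prompt prefix, which squares with the observation above: cached input tokens don’t mean a cached response, because sampling still happens fresh on every request.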
Continue to the full article →
Here’s the paradox that makes this pattern particularly poignant. We’ve made extraordinary progress in software capabilities. The Apollo guidance computer had 4KB of RAM. Your smartphone has millions of times more computing power. We’ve built tools and frameworks that genuinely make many aspects of development easier.
Yet demand for software far exceeds our ability to create it. Every organization needs more software than it can build. The backlog of desired features and new initiatives grows faster than development teams can address it.
This tension—powerful tools yet insufficient capacity—keeps the dream alive. Business leaders look at the backlog and think, “There must be a way to go faster, to enable more people to contribute.” That’s a reasonable thought. It leads naturally to enthusiasm for any tool or approach that promises to democratize software creation.
The challenge is that software development isn’t primarily constrained by typing speed or syntax knowledge. It’s constrained by the thinking required to handle complexity well. Faster typing doesn’t help when you’re thinking through how to handle concurrent database updates. Simpler syntax doesn’t help when you’re reasoning about security implications.
Continue to the full article →
I’m coming to terms with the high probability that, going forward, AI will write most of the code I ship to prod. It already does it faster, with results similar to what I’d have typed out myself. For languages and frameworks I’m less familiar with, it does a better job than I do.
It feels like something valuable is being taken away, and suddenly. It took a lot of effort to get good at coding and to learn how to write code that works, to read and understand complex code, and to debug and fix when code doesn’t work as it should.
It’s been a love-hate relationship, to be fair, given the amount of focus needed to write complex code. Then there are all the conflicts that time estimates caused: time passes differently when you’re locked in and working on a hard problem.
Now, all that looks like it will be history.
Early in my career, I helped start a company that conducted autonomous vehicle research. As increasingly complex driving tasks were automated, I’d think about how this technology would one day render truck drivers useless. That quickly turned into wondering when this tech would make me useless.
There’s no sitting still when it comes to software engineering. Every ten years or so, a new breakthrough comes along and requires folks to make a decision: do I evolve my engineering practice to keep up with the times, or do I double down on my current practice and focus on the fundamentals?
The choice comes down to what you value. Are you someone who enjoys artisanally crafting code, painstakingly optimizing each line to result in a beautiful tool? Are you someone who smashes things until they make the shape of a tool that helps someone accomplish a task?
When it comes to our economic structure, however, it doesn't matter what you value; it matters what someone is willing to pay you to solve their problem.
Some employers will value bespoke, artisanal ("clean") code, but I bet most will not care about what the code looks like. They will want whoever can quickly smash something into the shape of the tool that gets the job done.
As they say: don't hate the player, hate the game.
Continue to the full article →
The programming language is called "cursed". It's cursed in its lexical structure, it's cursed in how it was built, it's cursed that this is possible, it's cursed in how cheap this was, and it's cursed through how many times I've sworn at Claude.
Absolutely dying at this.
Continue to the full article →
I’ve cut social media almost entirely out of my life (10/10 recommend), but I still drop into LinkedIn every so often. And honestly? I get exhausted fast by all the heavy, depressing posts.
Yes, there’s a lot of real suffering and injustice in the world. If you’re in the thick of it right now, I hope you’re able to keep hanging in there.
But if you’d like a little break from the bleak hellscape that is 21st-century journalism, check out the latest issue of Fix the News. Or, if you just want the highlights, here are a few that stood out to me:
Billions of people have gained clean water, sanitation, and hygiene in the last nine years. (Billions with a B.)
In the 12 months prior to June, Africa imported over 15GW of solar panels. Sierra Leone alone imported enough to cover 65% of its entire generating capacity.
Google estimates the median LLM prompt uses 0.24 Wh (about nine seconds of TV; see the quick check after this list), emitting 0.03 g of CO₂ and consuming about five drops of water. (How many of you leave the TV on while doing chores?)
Wildfires are terrifying, but between 2002 and 2021, global burned area actually fell 26%.
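Out of curiosity, I sanity-checked the TV comparison. A rough sketch assuming a ~100 W television, a number I’m supplying, not one from the article:

```python
# Back-of-the-envelope check on "0.24 Wh ≈ nine seconds of TV".
prompt_energy_wh = 0.24   # Google's median-prompt estimate
tv_power_w = 100          # assumed TV power draw (my number, not the article's)
seconds_of_tv = prompt_energy_wh / tv_power_w * 3600
print(f"about {seconds_of_tv:.1f} seconds of TV")  # about 8.6 seconds
```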
A gentle reminder: news and social media are designed to keep you engaged by stoking fear, outrage, and anxiety. That cycle is hard to break, and a lot of my friends worry that looking away even for a moment means we will collectively slide into totalitarianism and ruin.
That’s a lot of weight to carry alone. Yes, we need to stay vigilant and hold leaders accountable, but we can’t live paralyzed by fear. There are countless good people stepping up, trying to make the world better (including many of you). Try to hold onto that truth alongside the bleak!
Continue to the full article →
I think we’ve barely scratched the surface of AI as intellectual partner and tool for thought. Neither the prompts, nor the model, nor the current interfaces – generic or tailored – enable it well. This is rapidly becoming my central research obsession, particularly the interface design piece. It’s a problem I need to work on in some form.
When I read Candide in my freshman humanities course, Voltaire might have been challenging me to question naïve optimism, but he wasn’t able to respond to me in real time, prodding me to go deeper into why it’s problematic, rethink my assumptions, or spawn dozens of research agents to read, synthesise, and contextualise everything written on Panglossian philosophy and Enlightenment ethics.
In fact, at eighteen, I didn’t get Candide at all. It wasn’t contextualised well by my professor or the curriculum, and the whole thing went right over my head. I lacked a tiny thinking partner in my pocket who could help me appreciate the text; a patient character to discuss, debate, and develop my own opinions with.
I can’t agree more. I would love to help as well in this area of research. It sounds extremely rewarding.
Continue to the full article →
The old timers who built the early web are coding with AI like it's 1995.
Think about it: They gave blockchain the sniff test and walked away. Ignored crypto (and yeah, we're not rich now). NFTs got a collective eye roll.
But AI? Different story. The same folks who hand-coded HTML while listening to dial-up modems sing are now vibe-coding with the kids. Building things. Breaking things. Giddy about it.
We Gen X'ers have seen enough gold rushes to know the real thing. This one's got all the usual crap—bad actors, inflated claims, VCs throwing money at anything with "AI" in the pitch deck. Gross behavior all around. Normal for a paradigm shift, but still gross.
The people who helped wire up the internet recognize what's happening. When the folks who've been through every tech cycle since Gopher start acting like excited newbies again, that tells you something.
Really feels weird to link to a LinkedIn post, but if it’s good enough for Simon, it’s good enough for me!
It’s not just Gen Xers who feel it. I don’t think I’ve been as excited about any new technology in years.
Playing with LLMs locally is mind-blowingly awesome. There’s not much need to use ChatGPT when I can host my own models on my own machine without fearing what’ll happen to my private info.
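If you want to try it, here’s one way among many; a minimal sketch using Hugging Face’s transformers library, with the model name chosen purely as an example of a small open model:

```python
# Minimal local text generation; everything runs on your own machine.
# The model name is just an example; swap in any small open chat model.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
result = generator("Explain prompt caching in one sentence.", max_new_tokens=80)
print(result[0]["generated_text"])
```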
Continue to the full article →
Put simply, writing with AI reduces the maximum strain required from your brain. For many commentators responding to this article, this reality is self-evidently good. “The spreadsheet didn’t kill math; it built billion-dollar industries. Why should we want to keep our brains using the same resources for the same task?”
My response to this reality is split. On the one hand, I think there are contexts in which reducing the strain of writing is a clear benefit. Professional communication in email and reports comes to mind. The writing here is subservient to the larger goal of communicating useful information, so if there’s an easier way to accomplish this goal, then why not use it?
But in the context of academia, cognitive offloading no longer seems so benign. In a learning environment, the feeling of strain is often a by-product of getting smarter. To minimize this strain is like using an electric scooter to make the marches easier in military boot camp; it will accomplish this goal in the short term, but it defeats the long-term conditioning purposes of the marches.
I wrote many a journal entry in college complaining about this exact point, except we were still arguing about graphing calculator and laptop use.
Now that I’m older, I understand the split that Cal talks about here.
When I’m writing software to accomplish a task for work, then it’s more important for me to spend my brain energy on building the context of the problem in my head.
When I’m writing an essay and trying to prove that I understand a concept, then it’s more important for me to get the words out of my head and onto paper. Then, I can use tools to help me clean it up later.
Maybe this points to a larger problem I’ve had with our education system. Imagine a spectrum of the intent of college. The left end of the spectrum represents “learning how to critically think about ideas”. The right end represents “learning skills that will help you survive in the real world”.
When someone makes fun of a film studies major, it’s because their evaluation of the spectrum is closer to the right end.
When someone makes fun of students using ChatGPT for writing their essays for them, it’s because their evaluation is closer to the left.
Continue to the full article →
The real threat to creativity isn’t a language model. It’s a workplace that rewards speed over depth, scale over care, automation over meaning. If we’re going to talk about what robs people of agency, let’s start there. Let’s talk about the economic structures that pressure people into using tools badly, or in ways that betray their values. Let’s talk about the lack of time, support, mentorship, and trust. Not the fact that someone ran a prompt through a chatbot to get unstuck. Where is the empathy? Where is your support for people who are being tossed into the pit of AI and instructed to find a way to make it work?
So sure, critique the tools. Call out the harm. But don’t confuse rejection with virtue. And don’t assume that the rest of us are blind just because we’re using the tools you’ve decided are beneath you.
(via Jeffrey)
Today, quite suddenly, billions of people have access to AI systems that provide augmentations, and inflict amputations, far more substantial than anything McLuhan could have imagined. This is the main thing I worry about currently as far as AI is concerned. I follow conversations among professional educators who all report the same phenomenon, which is that their students use ChatGPT for everything, and in consequence learn nothing. We may end up with at least one generation of people who are like the Eloi in H.G. Wells’s The Time Machine, in that they are mental weaklings utterly dependent on technologies that they don’t understand and that they could never rebuild from scratch were they to break down.
Before I give a counterpoint, I do want to note the irony that even now people do not understand how this stuff works. It’s math, all the way down. It shouldn’t work, frankly… but it does!
I think that is so beautiful. We don’t really understand much about our universe, like dark matter, gravity, and any number of naturally occurring phenomena.
But just because we don’t understand it doesn’t mean we can’t harness it to do amazing things.
As far as the students using ChatGPT… I mean, yeah, it’s painfully obvious to most teachers I chat with when their kids use the tech to get by.
I would posit, though, that this is the history of education in general. We teach students truths about the world, and they go out and show us how those truths are not entirely accurate anymore.
Sure, some kids will certainly use ChatGPT to compose an entire essay, which circumvents the entire point of writing an essay in the first place: practicing critical thinking skills. That’s bad, and an obvious poor use of the tool.
But think of the kids who are using AI to punch up their thoughts, challenge their assumptions with unconsidered angles, and communicate their ideas with improved clarity. They’re using the tool as intended.
That makes me so excited about the future. That’s what I hope teachers lean into with artificial intelligence.
(via Simon)
Continue to the full article →