If you hang out in the AI/LLM sphere for a while, you’ll inevitably stumble on the token gap: AI providers like Anthropic, Google, OpenAI, et al. are selling you tokens at a significant loss. This is, of course, nothing new. Amazon was famously accused of “new math” when it perpetually lost hundreds of millions of dollars a year, until it didn’t. Uber was significantly cheaper than a taxi until it wasn’t. There’s a general sense that the same pattern will play out with AI, and costs will skyrocket when one of the competitors “wins”. So there’s a big push for local models amongst power users. If you have the time, money, and an electrician, you can get close to Sonnet 4.5 quality with a home rig.
I think history will repeat, but along a different branch than some are predicting. Let’s start with Google and Gemini. After getting caught flat-footed, Google has quickly closed the gap with its Gemini 3 models. We can debate where they fall relative to frontier models from OpenAI and Anthropic, but Google wasn’t even part of the conversation last year. So much so that Apple has selected Google’s AI as the basis of Apple’s Foundation Models. Google and Apple can subsidize the token deficit effectively indefinitely by amortizing the costs across their customer bases. Take Apple One, Apple’s “everything” subscription. I pay $40 a month for it. I don’t use $40 a month worth of the service, but subscribing individually to just the services I do regularly use would cost me more than $40 a month, and I get the benefit of easily dipping in and out of the other offerings. It’s like insurance: everyone pays in, and lighter users subsidize heavier users.
OpenAI and Anthropic can’t lose money every quarter indefinitely, and unlike in the Amazon and Uber eras, the competitive landscape is different this time around: Amazon didn’t have a Google to compete against. Google, Apple, and Microsoft can also amortize the cost across additional revenue streams. They don’t have to win; they just have to wait.
So to me, the future looks a lot like the present. Certain technically inclined folks will take the time and effort to set up local LLM machines, just like they have with Linux. They will have far more flexibility and far fewer constraints, and it will be cheaper. Most people, including most other technically inclined people, will be subscribed to Google, Apple, and Microsoft, with the slim possibility of a fourth competitor emerging.