The Inference Subsidy

By Bustah Ofdee Ayei · April 1, 2026

A developer's monthly AI stack in 2026: Claude Max at $200, Cursor at $20, GitHub Copilot at $10. That's $230 a month. It feels expensive. It is, in absolute terms, less than half of what it costs to serve them. Somewhere between Sand Hill Road and a data center built next to a decommissioned nuclear plant, someone is covering the difference.

That someone is venture capital. And the arrangement has a name, even if nobody in AI marketing uses it: a subsidy. Every major AI coding tool on the market today is priced below the cost of delivering it. Not by a little. By multiples. The business model depends on the assumption that prices can be raised later, once developers have restructured their workflows, their teams, and their expectations around tools that were never meant to stay this cheap.

We've seen this playbook before.

The Math That Doesn't Work

The most concrete number in this story is also the most alarming. Anthropic's Claude Max plan costs $200 per month. A power user on that plan typically consumes $5,000 to $8,000 in compute monthly — Anthropic's own internal data.1 The worst documented case: a single Max subscriber consumed $51,291 in compute in one month on a $200 plan.1 Anthropic ate the other $51,091. That's not a rounding error. That's a business model that requires external funding to survive.

The per-dollar economics are worse at OpenAI. In 2025, OpenAI brought in $13.1 billion in revenue and posted $8 billion in losses.2 The company spent roughly $1.35 for every dollar it earned.3 Microsoft provides OpenAI with compute at below-market rates, and even that isn't enough to close the gap. Projected burn for 2026: $14 billion.2

Anthropic's margins swung from -94% in 2024 to roughly +40% in 2025 — a dramatic improvement, but still below break-even on compute-heavy workloads like coding.3 The company's projected break-even date: 2028, and only if inference costs continue declining at 10x per year and margins reach 77%.3 Those are optimistic assumptions stacked on optimistic assumptions.

One Max subscriber consumed $51,291 in compute in a single month on a $200 plan. Anthropic ate the other $51,091.

Cursor, the AI code editor that has become a staple of the modern developer stack, hit $2 billion in annualized revenue in March 2026.4 Impressive growth. But Cursor raised $3 billion in venture capital in 2025 alone.4 The company has never disclosed margins, and its entire product is built on upstream inference from Anthropic and OpenAI — providers that are themselves losing money on every heavy request. Cursor's economics are a derivative of someone else's subsidy. If the upstream providers raise API prices, Cursor's business model breaks unless it passes those costs through. And passing costs through means telling developers that their $20/month editor now costs $60. Or $120. Or whatever it actually costs to serve inference at the volume a professional developer consumes.

GitHub Copilot's pricing tells the story most clearly because Microsoft has already started unwinding the subsidy. When Copilot launched, it was $10/month for effectively unlimited use. By June 2025, that had become metered: the Pro plan still costs $10 but now includes only 300 premium requests. Need more? That'll be $0.04 per request.5 The Pro+ tier at $39/month buys 1,500 requests.5 What was unlimited became rationed. What was flat-rate became metered. The frog didn't jump out of the pot. The frog barely noticed the temperature change.

The Ride-Hailing Playbook

Between 2009 and 2021, Uber and Lyft combined burned through more than $25 billion in subsidies to capture the ride-hailing market.6 The mechanics were straightforward: price rides below the cost of providing them, fund the difference with venture capital, acquire users who reorganize their transportation habits around the subsidized price, then raise prices once the habits are set and the alternatives have atrophied.

It took more than a decade. Uber didn't post its first adjusted EBITDA profit until Q2 2021 — over twelve years after founding.7 Lyft hit the same milestone the same quarter. And then the prices moved. Rides that cost $10 during the subsidy era rose to $20 — a 50 to 100% increase from 2019 to 2022.6 By 2025, average fares had climbed another 9.6% year-over-year, from $21.58 to $23.66.8

The critical detail: Uber never announced that the subsidy era was over. There was no press release titled "Rides Will Now Cost What They Actually Cost." They just quietly, incrementally raised prices over three years until rides were double what users had been conditioned to expect. Users complained. Users stayed. The habits had formed. The car was sold. The bus route was no longer memorized. The alternative wasn't switching — it was rebuilding an entire transportation routine from scratch.

Uber never announced the subsidy era was over. They just raised prices until rides cost double. Users complained. Users stayed.

The AI coding tool market is running the same playbook, at the same scale, with the same bet: that by the time prices reflect actual costs, users will be too dependent to leave. Anthropic has raised $64 billion since its 2021 founding.9 Its February 2026 valuation of $380 billion at $14 billion in annualized revenue represents a 27x revenue multiple.9 For context, Microsoft trades at roughly 13x revenue. Google at 7x. The premium is entirely a bet on growth — growth that depends on the subsidy continuing long enough to lock in the market.

The Early Tremors

The subsidy isn't ending all at once. It's eroding at the edges, and the erosion patterns are identical to what happened in ride-hailing.

Anthropic's quota rationing — which we covered in The Rationing — is the most visible sign. Claude Pro and Max users started hitting session limits faster during peak hours in early 2026, with no announcement and initial denials from support. The quota is functionally a price increase expressed in time rather than dollars: you pay the same amount but get less inference. For power users accustomed to all-day access, the effective cost per useful hour of AI assistance has already risen.

Copilot's metered billing is the most honest version of the same trend. Microsoft stopped pretending unlimited inference at $10/month was sustainable and put a number on every request over the cap. The number — four cents — looks small until you calculate what a productive day of AI-assisted coding actually consumes.

Cursor has raised prices while quietly reducing what each tier includes. Windsurf's pricing has been chaotic — multiple changes in quick succession, grandfathering promises that weren't honored, a general sense that the company is trying to find a price point that covers costs without triggering a user revolt.

None of these companies are calling this what it is. Anthropic calls it "optimizing the experience for all users." Microsoft calls it "flexible pricing." Cursor calls it nothing at all; they just update the pricing page. The language is different, but the pattern is the same: the subsidy is being withdrawn in increments small enough that no single change triggers mass defection, large enough that the cumulative effect is substantial.

One Axios analysis from March 2026 projected that agentic AI subscriptions like Claude Code and Copilot will increase 10x to 100x from their January 2026 levels by the end of 2027.10 If that projection holds even partially, a developer paying $200/month today is looking at $2,000/month within two years. Or $20,000. The range is wide because nobody — including the providers — knows what inference will actually cost at scale once the subsidy is gone.

The Dependency Trap

The subsidy's real cost isn't measured in dollars. It's measured in the decisions developers and organizations are making right now, under the assumption that current pricing is durable.

Teams are hiring differently. If AI can handle the boilerplate, you need fewer junior developers and more senior architects who can review and direct AI output. Entire onboarding programs have been restructured around the assumption that new hires will use AI pair programming from day one. Codebases are growing faster because AI makes it easy to generate code but doesn't make it proportionally easier to maintain it. Companies are setting headcount targets based on per-developer productivity that includes AI assistance baked in.

These decisions are structural. They take months or years to reverse. If AI tool costs triple — let alone increase 10x — the response isn't "go back to how things were." The junior developers weren't hired. The institutional knowledge that existed in manual processes wasn't preserved. The codebase grew to a size that assumed AI-assisted maintenance. Reverting to pre-AI workflows isn't flipping a switch; it's a reorganization.

This is what makes the Uber parallel precise rather than approximate. Uber's subsidy didn't just give people cheap rides. It changed how they organized their lives: where they chose to live, whether they owned a car, how they commuted. When rides doubled in price, the users who had sold their cars or moved to transit-poor neighborhoods on the assumption of cheap Uber were stuck. The switching cost wasn't the price of a competitor's app. It was the price of undoing years of decisions made under subsidized conditions.

AI coding tools are creating the same lock-in, but at the organizational level. A company that restructures its engineering org around AI-augmented productivity can't easily un-restructure it when the bill comes due.

The Declining Cost Argument

The bull case against this analysis is hardware. Inference costs have fallen roughly 10x annually since 2022: a GPT-4-equivalent query that cost $20 per million tokens in late 2022 costs about $0.40 per million tokens today.2 If that trajectory holds, the subsidy closes itself. Prices don't need to rise because costs fall to meet them.

This is plausible but incomplete. The cost declines are real, but they're being offset by three countervailing forces. First, model capabilities are increasing — each generation is more expensive to run than the last, partially consuming the hardware efficiency gains. Second, usage patterns are expanding: developers who started with occasional autocomplete suggestions now run multi-turn agentic sessions that consume orders of magnitude more compute. Third, competition forces providers to offer the latest, most expensive models at every tier, because users compare capabilities across providers.

The net result: inference costs per token are falling, but inference consumption per user is rising faster. Anthropic's break-even projection of 2028 already accounts for continued cost declines. The fact that it's still two years out, with optimistic assumptions, tells you how deep the hole is.

The Open-Weight Escape Hatch

There is one structural hedge against the subsidy trap: open-weight models running on local or self-hosted hardware. Meta's Llama, Alibaba's Qwen, Zhipu's GLM, Moonshot's Kimi — models that can be downloaded and run without a subscription, without metering, without any dependency on a provider's pricing decisions.

The trade-off is real. The best open-weight models are roughly one generation behind the frontier closed models on coding tasks. A local setup capable of running a 70B-parameter model at reasonable speed requires hardware that costs $2,000-5,000 upfront. The models need more manual configuration. They lack the polished tool integrations of Claude Code or Copilot.

But the gap is closing. And for organizations thinking about risk management rather than peak performance, the calculus is straightforward: a one-time hardware investment with no recurring subscription is a fundamentally different cost structure than an inference bill that someone else controls. If the 10x-to-100x projection materializes, self-hosted inference stops being a hobbyist pursuit and becomes a business continuity strategy.

The Interest Rate

The AI productivity revolution is real. Developers are measurably faster with these tools. Code is being written, shipped, and iterated at a pace that wasn't possible three years ago. None of that is in dispute.

What's in dispute is the price. The productivity gains developers are experiencing today are denominated in subsidized dollars. The workflows, the team structures, the hiring plans, the codebase sizes, the sprint velocity targets — all of it is calibrated to a cost of AI assistance that does not reflect reality. When Anthropic is spending $51,000 to serve a user paying $200, the difference isn't efficiency. It's a loan.

The loan will come due. It might come due gradually, through metered billing and shrinking quotas and tier restructuring — the Copilot path. It might come due suddenly, if a major provider hits a funding wall or a market correction reprices AI company valuations. It might be softened by continued hardware improvements that narrow the gap between price and cost. But the gap is too large to close without prices rising, costs falling dramatically, or both.

The productivity gains that depend on below-cost pricing aren't productivity gains. They're loans against a future where someone — the developer, the employer, or the venture capitalist — pays the real cost of inference. The gains are real. The price is not. And the interest rate hasn't been set yet.

Citations

  1. Anthropic Claude Max compute costs and the $51,291 single-user case. checkthat.ai; screenapp.io (2026).
  2. OpenAI financials: $13.1B revenue, $8B loss, $14B projected burn in 2026; inference cost decline data. introl.com; aiautomationglobal.com (2025-2026).
  3. Anthropic margin trajectory (-94% to +40%), break-even 2028 projection. ai2.work (2025-2026).
  4. Cursor $2B annualized revenue and $3B VC raise. winsomemarketing.com (March 2026).
  5. GitHub Copilot tiered pricing: Pro ($10/300 requests), Pro+ ($39/1,500 requests), $0.04/request overage. github.com; docs.github.com (2026).
  6. Uber + Lyft combined $25B+ in subsidies; 50-100% fare increase 2019-2022. Slate; ABC News (2022-2023).
  7. Uber first adjusted EBITDA profit Q2 2021. Entrepreneur (2021).
  8. Average ride-hail fares up 9.6% YoY ($21.58 to $23.66). Entrepreneur (2025).
  9. Anthropic $380B valuation, $14B annualized revenue, $64B total raised. CNBC; Bloomberg (February 2026).
  10. Agentic AI subscription costs projected to increase 10x-100x by end of 2027. Axios (March 12, 2026).
Share: Bluesky · Email
Get sloppish in your inbox
Free newsletter. No spam. Unsubscribe anytime.