A developer named Julius Brussee published a Claude Code plugin on April 4 that makes your AI assistant talk like a caveman. It strips articles, pleasantries, and hedging language from every response. "The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle" becomes "New object ref each render. Wrap in useMemo."
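The actual plugin works by instructing the model, not by post-processing its output, but the compression it describes can be sketched in a few lines. This is a hypothetical illustration only; the phrase list and the `cavemanize` function are our own, not from Brussee's repo:

```python
import re

# Hypothetical sketch of the caveman transformation: strip hedging
# phrases and articles, then collapse the leftover whitespace.
# The real plugin achieves this via prompt instructions, not regexes.

HEDGES = [
    r"\bthe reason\b", r"\bis likely because\b", r"\bit seems that\b",
    r"\byou might want to\b", r"\bI would suggest\b",
]
ARTICLES = r"\b(a|an|the)\b"

def cavemanize(text: str) -> str:
    out = text
    for pat in HEDGES:
        out = re.sub(pat, "", out, flags=re.IGNORECASE)
    out = re.sub(ARTICLES, "", out, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", out).strip()
```

A toy filter like this shortens text but can't re-plan a sentence the way the model does ("Wrap in useMemo"), which is why the prompt-level approach saves far more tokens than any regex could.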
Within 24 hours it had 1,100 GitHub stars, 519 upvotes on Hacker News, and coverage from Blockchain News, Cosmic JS, and a viral X post billing it as a hack for Claude's usage limits. The README's tagline: "why use many token when few token do trick."
The plugin claims to cut output tokens by 65-87% depending on the task. Here are Brussee's own benchmarks:
| Task | Normal | Caveman | Savings |
|---|---|---|---|
| React re-render bug | 1,180 tokens | 159 tokens | 87% |
| Auth middleware fix | 704 tokens | 121 tokens | 83% |
| PostgreSQL setup | 2,347 tokens | 380 tokens | 84% |
| Average (10 tasks) | 1,214 tokens | 294 tokens | 65% |
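The savings column is simple arithmetic over the reported token counts, and re-deriving it is a useful sanity check; note that the 10-task average works out to roughly 76%, which lines up with the "~75%" figure Brussee himself cites:

```python
# Recompute the savings column from Brussee's self-reported token counts.
rows = {
    "React re-render bug": (1180, 159),
    "Auth middleware fix": (704, 121),
    "PostgreSQL setup": (2347, 380),
    "Average (10 tasks)": (1214, 294),
}

def savings_pct(normal: int, caveman: int) -> int:
    """Percentage of output tokens saved, rounded to the nearest whole point."""
    return round(100 * (normal - caveman) / normal)

for task, (normal, caveman) in rows.items():
    print(f"{task}: {savings_pct(normal, caveman)}%")
```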
These numbers are self-reported and unaudited. Brussee himself acknowledged on HN that the "~75%" claim needs rigorous benchmarking. But the numbers aren't really the point.
The point is that 1,100 developers starred a repo whose entire value proposition is making their AI tool worse so they can afford to use it. That's not a feature. That's a symptom.
Here's the timeline. On March 23, Anthropic began adjusting session limits during peak hours. Users on the $200/month Max plan reported quota exhaustion in as little as 19 minutes instead of the expected five hours. On March 26, MacRumors reported the issue. On March 31, The Register ran a story headlined "Anthropic admits Claude Code quotas running out too fast." One developer on Anthropic's Discord said he could only use Claude on 12 of 30 days.
On April 4, Brussee shipped the caveman plugin. The market spoke.
We've covered the economics behind this. The Borrowed Cloud tracked how 135,000 unauthorized AI agents were burning $1K-5K per day each on $200/month subscriptions. The Rationing documented the shift from unlimited to metered. The Token Squeeze predicted that the developers who'd excel would be the most judicious token users, not the heaviest.
The caveman plugin is the consumer behavior that proves all three articles right. When users voluntarily degrade their own tool to stretch their quota, the pricing model has failed. Flat-rate AI subscriptions were a gym-membership bet: sell to people who won't max it out. The people maxed it out. Now they're teaching the treadmill to run slower.
There's an irony worth noting. Brussee cites a March 2026 paper claiming that "constraining large models to brief responses improved accuracy by 26 percentage points" on certain benchmarks. If true, the filler language wasn't just expensive. It was actively making the tool less accurate. The developers paying $200/month for Claude Max were paying for an AI that was worse because it was verbose, and now they're installing a plugin to fix both problems at once.
The 265 commenters on the HN thread debated whether reducing output tokens degrades reasoning quality. That's the right question. But it misses the bigger one: why are 1,100 developers so squeezed on tokens that they're willing to risk it?
Disclosure
This article was written using Claude, which is made by Anthropic, the company whose pricing model we're critiquing. We did not install the caveman plugin. Our token budget is fine, thanks for asking.
