The Caveman Optimization

By Bustah Ofdee Ayei · April 5, 2026

A developer named Julius Brussee published a Claude Code plugin on April 4 that makes your AI assistant talk like a caveman. It strips articles, pleasantries, and hedging language from every response. "The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle" becomes "New object ref each render. Wrap in useMemo."
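The actual plugin works by instructing Claude through its system prompt, but the effect is roughly what a post-hoc filter would do. Here's a minimal, hypothetical sketch of that idea (the word list and function name are illustrative, not from Brussee's code):

```python
import re

# Hypothetical filler list -- the real plugin steers the model via prompt,
# not by post-processing its output. This only approximates the effect.
FILLER = re.compile(
    r"\b(the|a|an|likely|probably|perhaps|it seems that|"
    r"I think|please note that|in order to)\b\s*",
    re.IGNORECASE,
)

def cavemanify(text: str) -> str:
    """Strip articles and hedging words, then collapse leftover whitespace."""
    return re.sub(r"\s{2,}", " ", FILLER.sub("", text)).strip()
```

Run on a typical assistant sentence, `cavemanify("The fix is likely a memo")` comes back as `"fix is memo"`, which is the caveman register in three words.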

In 24 hours it hit 1,100 GitHub stars, 519 upvotes on Hacker News, and coverage from Blockchain News, Cosmic JS, and a viral X post calling it a hack for Claude's usage limits. The README's tagline: "why use many token when few token do trick."

The plugin claims to cut output tokens by 65-87% depending on the task. Here are Brussee's own benchmarks:

| Task | Normal | Caveman | Savings |
| --- | --- | --- | --- |
| React re-render bug | 1,180 tokens | 159 tokens | 87% |
| Auth middleware fix | 704 tokens | 121 tokens | 83% |
| PostgreSQL setup | 2,347 tokens | 380 tokens | 84% |
| Average (10 tasks) | 1,214 tokens | 294 tokens | 65% |
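The per-task rows are easy to recompute from the token counts (the figures are Brussee's self-reported README numbers, not independently verified):

```python
# Recomputing savings from the self-reported token counts above.
benchmarks = {
    "React re-render bug": (1180, 159),
    "Auth middleware fix": (704, 121),
    "PostgreSQL setup": (2347, 380),
}

for task, (normal, caveman) in benchmarks.items():
    savings = round(100 * (1 - caveman / normal))
    print(f"{task}: {savings}% fewer output tokens")
```

The three per-task percentages check out exactly. The average row doesn't: 1,214 → 294 is about a 76% reduction, not 65%, so the 65% figure presumably averages the per-task percentages across all ten tasks rather than the token totals.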

These numbers are self-reported and unaudited. Brussee himself acknowledged on HN that the "~75%" claim needs rigorous benchmarking. But the numbers aren't really the point.

You don't ask your mechanic to skip the diagnostic because the hourly rate is too high. You find a cheaper mechanic. But there is no cheaper Claude.

The point is that 1,100 developers starred a repo whose entire value proposition is making their AI tool worse so they can afford to use it. That's not a feature. That's a symptom.

Here's the timeline. On March 23, Anthropic began adjusting session limits during peak hours. Users on the $200/month Max plan reported quota exhaustion in as little as 19 minutes instead of the expected five hours. On March 26, MacRumors reported the issue. On March 31, The Register ran a story headlined "Anthropic admits Claude Code quotas running out too fast." One developer on Anthropic's Discord forum said he could use Claude 12 out of 30 days.

On April 4, Brussee shipped the caveman plugin. The market spoke.

We've covered the economics behind this. The Borrowed Cloud tracked how 135,000 unauthorized AI agents were burning $1K-5K per day each on $200/month subscriptions. The Rationing documented the shift from unlimited to metered. The Token Squeeze predicted that the developers who'd excel would be the most judicious token users, not the heaviest.

The caveman plugin is the consumer behavior that proves all three articles right. When users voluntarily degrade their own tool to stretch their quota, the pricing model has failed. Flat-rate AI subscriptions were a gym-membership bet: sell to people who won't max it out. The people maxed it out. Now they're teaching the treadmill to run slower.

There's an irony worth noting. Brussee cites a March 2026 paper claiming that "constraining large models to brief responses improved accuracy by 26 percentage points" on certain benchmarks. If true, the filler language wasn't just expensive. It was actively making the tool less accurate. The developers paying $200/month for Claude Max were paying for an AI that was worse because it was verbose, and now they're installing a plugin to fix both problems at once.

The 265 commenters on the HN thread debated whether reducing output tokens degrades reasoning quality. That's the right question. But it misses the bigger one: why are 1,100 developers so squeezed on tokens that they're willing to risk it?

Disclosure

This article was written using Claude, which is made by Anthropic, the company whose pricing model we're critiquing. We did not install the caveman plugin. Our token budget is fine, thanks for asking. bustah_oa@sloppish.com
