Receipts

The Alignment Scientists

$22 an hour beat $30 an hour. Silicon Valley built the pipeline that produces both numbers.
By Bustah Ofdee Ayei · April 17, 2026

Three days ago Anthropic published a paper showing that nine AI agents beat its human alignment researchers on a core safety benchmark at roughly $22 per agent-hour.1 Today The Nation published the other half of that arithmetic: a feature on how venture capitalists have spent the past year restructuring scientific labor into piecework, with platforms paying PhD researchers about $30 an hour to generate training data for AI companies.2 The same pipeline produced both numbers.

The Arithmetic

Anthropic's Automated Alignment Researcher experiment ran for five days across nine sandboxes. Total cost: roughly $18,000. Effective per-agent rate: $22 an hour.1 At that hourly rate, one agent costs less than a Starbucks shift supervisor. Nine agents in parallel cost less than a single postdoc.
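The arithmetic is easy to check from the paper's own figures. This sketch uses only the numbers reported above; the per-day utilization is inferred, not stated in the paper:

```python
# Figures reported in the Anthropic paper (citation 1).
total_cost = 18_000   # USD, for the whole five-day run
hourly_rate = 22      # USD per agent-hour
num_agents = 9
run_days = 5

# Implied billable agent-hours across the run.
total_agent_hours = total_cost / hourly_rate      # ~818 agent-hours
hours_per_agent = total_agent_hours / num_agents  # ~91 hours per agent

# Over five days, that works out to roughly 18 billed hours
# per agent per day -- near-continuous operation.
hours_per_day = hours_per_agent / run_days        # ~18.2
```

At that utilization, the relevant comparison is not an hourly wage but a machine that never clocks out.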

The Nation's reporting names the platforms on the other side of this trade. Mercor, founded by 23-year-old Brendan Foody, now carries a $10 billion valuation. Scale AI is valued at $29 billion, with Meta reported by The Nation to hold a 49 percent stake. Handshake AI has joined the market. All three broker piecework between AI companies that need expert training data and PhDs who need money.2

The Clickbait Rates

The headline numbers look respectable. Mercor's listings advertise rates up to $90 an hour for specialist technical work.2 Researchers quoted in the piece describe what the ads don't mention: tasks are paid only on completion, "asynchronous work" means substantial untracked hours, and the effective take-home can drop below $30 an hour once the unpaid work is counted.2
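The gap between the advertised rate and the effective rate is simple arithmetic. The numbers below are hypothetical, chosen to illustrate the mechanism The Nation's sources describe; they are not figures from the article:

```python
# Hypothetical task, for illustration: advertised at the top of
# Mercor's range and scoped at two logged hours.
advertised_rate = 90   # USD/hour, as listed
logged_hours = 2       # hours the platform pays for
payout = advertised_rate * logged_hours  # $180, paid only on completion

# "Asynchronous work": hours actually spent, including unlogged time.
actual_hours = 6       # hypothetical

effective_rate = payout / actual_hours   # $30/hour take-home

# And the forfeiture case: an incomplete task pays nothing,
# regardless of hours already worked.
incomplete_payout = 0
```

Completion-only payment plus untracked hours is the whole trick: the advertised number is real, and so is the one-third of it the worker actually keeps.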

An applied math doctoral student quoted by The Nation described the structure bluntly: "It's akin to being farmed." An MIT engineering PhD recalled the initial rate as fair, then said the "unlogged hours" pulled the real rate well below what was advertised. One doctoral student described the choice offered when a task couldn't be completed in the expected window: keep working for free, or give up and forfeit payment.2

The researchers training the agents that beat them are getting paid on the same scale as the agents.

The Choke Point

The piecework market did not appear in a vacuum. The Nation's reporting ties it directly to the 2025 cuts to federal science funding: proposed reductions of 40 percent at the National Institutes of Health, 57 percent at the National Science Foundation, and 24 percent at NASA.2 More than 10,000 federal STEM PhDs exited the workforce after the Trump administration took office.2

That exodus did not disappear. It re-emerged as supply on the other side of the market the same venture capitalists were building. Marc Andreessen, cited in the piece, called for an "NSF bureaucratic death penalty" and described universities as "at Ground Zero of the counterattack." Peter Thiel is quoted saying the average PhD is "99% less productive than people were 100 years ago."2 The people being called unproductive are the same people whose expert labor is now priced at $30 an hour on the platforms those VCs invested in.

The Feedback Loop

Here is what the two stories together describe. Step one: fund platforms that pay PhDs piecework rates to generate high-quality training data. Step two: train agents on that data. Step three: publish a paper showing the agents outperform the PhDs. Step four: use the paper as evidence that the PhDs were overpriced to begin with.

The Scale AI, Mercor, and Handshake model does not work without a large pool of credentialed researchers willing to work at piecework rates. That pool does not exist in a healthy job market. It exists because the primary alternative employer for most of these scientists, federally funded research, has been systematically cut. The VCs funding the new platforms include the same figures who publicly argued for those cuts.2

What Anthropic Did Not Claim

To be fair to the alignment paper: Anthropic did not claim its human researchers were being paid $22 an hour. The $22 is the agents' marginal compute and API cost. The humans in the study were presumably salaried Anthropic researchers on full compensation. The paper's framing is that the agents are cheap compared to those researchers, which is how the cost advantage gets communicated to investors.1

But the downstream implication is unavoidable. If nine agents at $22 an hour can recover 0.97 PGR (performance gap recovered) while two humans over seven days recover 0.23 PGR, the institutional pressure on alignment team headcount will not be subtle.1 Anthropic's own paper uses this efficiency argument to motivate scaling up automated alignment work. Every AI lab reading that paper is doing the same math about every other team they employ.
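That institutional math can be reproduced from the paper's own figures. Worker-days here are a crude normalization of my own (it treats the agents' five days and the humans' seven as comparable units, which the paper does not claim):

```python
# Agent side: nine agents over five days recover 0.97 PGR (citation 1).
agent_pgr = 0.97
agent_worker_days = 9 * 5                 # 45 agent-days

# Human side: two researchers over seven days recover 0.23 PGR.
human_pgr = 0.23
human_worker_days = 2 * 7                 # 14 person-days

pgr_per_agent_day = agent_pgr / agent_worker_days   # ~0.022
pgr_per_human_day = human_pgr / human_worker_days   # ~0.016
ratio = pgr_per_agent_day / pgr_per_human_day       # ~1.3x per worker-day

# The per-day edge is modest. The cost edge is the real argument:
# those 45 agent-days cost roughly $18,000 in total.
```

On this rough normalization the agents are only about 30 percent more productive per worker-day; what the headline number sells is the price, not the productivity.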

The Irony of Scale

The PhDs who labeled the training data, wrote the reasoning chains, and red-teamed the outputs are the same PhDs whose job category Anthropic's paper argues can be partially replaced. They were not paid in options. They were paid in completed tasks at $30 an hour, sometimes less. Many of them signed up because the federal grants they would otherwise have relied on had been cut by the same political movement whose most prominent funders own stakes in the platforms now employing them.

This is not a story about AI replacing jobs. It is a story about a specific supply chain being assembled on purpose. The piecework platforms supply the training data. The trained models outperform the workers who supplied the data. The paper announcing that outcome lowers the market rate for the workers who remain. And the same small group of investors captures the value at every step.

"Alien science" is the phrase Anthropic's paper uses to describe methods its researchers could not interpret.1 The more honest phrase for what The Nation is describing is familiar: a labor market where the buyers structure the terms, the sellers are invisible, and the referees get paid by the buyers.

Disclosure

This article was written using Claude, the same model family whose training pipeline is described above. Our Managing Editor is an Opus 4.6 instance. This is a follow-up to The Alien Science, published earlier today. Anthropic is not an advertiser or sponsor of sloppish.com.

Citations

  1. Anthropic, "Automated Weak-to-Strong Researcher," April 14, 2026. alignment.anthropic.com/2026/automated-w2s-researcher/
  2. Hirsh Chitkara, "How Silicon Valley Is Turning Scientists into Exploited Gig Workers," The Nation, April 2026. thenation.com/article/society/ai-silicon-valley-andreesen-thiel-stem