The Centaur Problem

Human+AI teams once beat everyone. Then the AI got good enough that humans just got in the way. Software development is on the same clock.
By Bustah Ofdee Ayei · April 1, 2026

On May 11, 1997, Garry Kasparov — the greatest chess player in history by most reckonings — resigned after nineteen moves against IBM's Deep Blue. He stood up from the table in the Equitable Center in midtown Manhattan, stared into the cameras, and walked away from a game he had dominated for twenty years. What he did next matters more than the loss. Instead of raging against the machine, Kasparov proposed merging with it. He called the concept "Advanced Chess" — a human and a computer, playing as one. Half man, half machine. A centaur.1

For a while, it worked. For a luminous stretch of about fifteen years, centaur chess teams — a human strategist guiding a computer's tactical calculations — beat the best humans and the best computers playing alone. The combination was genuinely, measurably superior to either half. Kasparov believed he had discovered the future of human-machine collaboration. "Technology can make us more human," he wrote, "by freeing us to be more creative."2

Then the computers got good enough that the human half started making things worse.

This is the centaur problem. It doesn't matter whether we're talking about chess, medicine, consulting, or writing software. The pattern is the same: there is a golden age when human+AI teams outperform both humans and machines working alone, and then the AI improves past a threshold where every human intervention is a downgrade. The question for software development — the field currently experiencing the loudest centaur euphoria — is simple: where are we on that curve? And the uncomfortable possibility, supported by a growing body of evidence, is that we may already be past the peak without knowing it.

The Golden Age

The first Advanced Chess match took place in León, Spain, in June 1998. Kasparov, paired with Fritz 5, played Veselin Topalov, paired with ChessBase 7.0. It ended 3-3, but both players noted that the quality of play was extraordinarily high — better than either could achieve alone, better than the computers could achieve alone.1 The combination worked because humans and computers were good at different things. Humans excelled at long-range strategy, opening preparation, and creative sacrifice. Computers excelled at calculating tactical variations and spotting immediate threats. Together, each covered the other's blind spots.

The golden age peaked in 2005 at the PAL/CSS Freestyle Chess Tournament, which became the most cited event in the history of human-AI collaboration. The winner was a team called "ZackS" — Steven Cramton and Zackary Stephen, two American amateurs with USCF ratings of 1685 and 1398 respectively. They used three ordinary computers running multiple chess engines. They defeated grandmaster-led teams. They defeated Hydra, a chess supercomputer comparable to Deep Blue, rated around 2800 Elo. Their effective performance rating exceeded 3100 Elo — higher than any human who has ever lived.3

Cramton and Stephen were not chess experts. They were process experts. They ran each position through three or more engines, compared the evaluations, and used human judgment to arbitrate disagreements between the machines. They won by out-managing the computers, not by outplaying the grandmasters.
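Their process can be sketched as a simple consensus loop — a hypothetical illustration of the idea, not their actual tooling; the "engines" here are stand-in functions rather than real chess engines:

```python
# Hypothetical sketch of the ZackS-style process: poll several engines,
# accept automatically when their evaluations converge, and escalate to
# a human only when they disagree.

AGREEMENT_THRESHOLD = 0.3  # max evaluation spread (in pawns) to auto-accept

def arbitrate(position, engines, ask_human):
    """Return the move to play for `position`.

    engines   -- callables: position -> (best_move, eval_in_pawns)
    ask_human -- callable invoked only when the engines disagree
    """
    results = [engine(position) for engine in engines]
    evals = [score for _move, score in results]
    if max(evals) - min(evals) <= AGREEMENT_THRESHOLD:
        # Engines agree: take the move preferred by the top evaluation.
        return max(results, key=lambda r: r[1])[0]
    # Engines disagree: the human's only job is to break the tie.
    return ask_human(position, results)

# Toy usage with stub engines (evaluations invented for illustration):
fritz = lambda pos: ("e4", 0.35)
junior = lambda pos: ("e4", 0.30)
shredder = lambda pos: ("d4", 0.25)

move = arbitrate("startpos", [fritz, junior, shredder],
                 ask_human=lambda pos, results: results[0][0])
```

The human contributes nothing to the evaluations themselves — only the tie-breaking judgment when the machines split. That was the entire job, and it was enough to beat grandmasters.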

"Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process."
— Garry Kasparov, 2010

That quote from Kasparov's 2010 essay in the New York Review of Books became the founding scripture of the centaur movement.4 It has been cited in thousands of business strategy decks, countless LinkedIn posts, and every AI company's pitch to developers. Weak human plus strong machine plus good process equals superhuman performance. The math was beautiful. And for about seven years, it was true.

The End of the Golden Age

By 2010, pure chess engines had improved to the point where they could match centaur teams. Rybka, running alone, was performing at the level of the best human-computer partnerships. By 2013, economist Tyler Cowen assessed that centaur teams "no longer hold a significant advantage" over engines alone.5 In 2017, a pure engine named "Zor" won the Freestyle Ultimate Challenge; the first centaur team placed third.

Then, on December 5, 2017, the argument ended definitively. DeepMind's AlphaZero taught itself chess from scratch — no human games, no opening books, nothing but the rules — in nine hours. It then beat Stockfish, the strongest conventional engine in the world, 28 wins to 0 with 72 draws. It played in a style no human or engine had seen before, sacrificing material for long-term positional compensation in ways that looked like intuition but were pure computation.6

AlphaZero played chess that humans couldn't understand. A centaur team trying to guide AlphaZero would be like a toddler advising a surgeon — every intervention a degradation.

Today, top chess engines are rated above 3500 Elo. Magnus Carlsen, the strongest human player in history, peaked at 2882. Research on centaur chess now shows that human intervention reduces engine Elo by 50 to 100 points.7 Every time a human overrides the machine's recommendation, the team gets worse. The centaur doesn't just stop winning — the human half becomes a handicap.
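The standard Elo model makes those gaps concrete. Expected score follows a logistic curve in the rating difference, so a few hundred points of separation is close to certainty:

```python
# Expected score under the standard Elo logistic model:
# E(A beats B) = 1 / (1 + 10 ** ((R_B - R_A) / 400))

def expected_score(r_a, r_b):
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

# Carlsen's peak rating (2882) against a modern 3500-rated engine:
human_vs_engine = expected_score(2882, 3500)         # ~0.03

# A centaur whose human half costs the engine 75 Elo,
# facing the same engine running unassisted:
centaur_vs_engine = expected_score(3500 - 75, 3500)  # ~0.39
```

The best human alive scores about 3% against the engine; a centaur dragging a 75-point human handicap scores about 39% against the engine it could have been. Intervention isn't neutral — it's a measurable tax.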

The Games That Skipped the Golden Age Entirely

If chess made the centaur model look promising, Go and poker should have made everyone nervous. Because in those games, the centaur era never happened at all.

In March 2016, DeepMind's AlphaGo beat Lee Sedol 4-1, the first time a computer defeated a top professional Go player. Fourteen months later, AlphaGo beat Ke Jie 3-0 — and, for good measure, also defeated a team of five world champion Go players playing together against the machine.8 Five of the best human minds in the game, collaborating against one AI, lost every game. There was no window where human+AI outperformed AI alone. The transition from human dominance to machine dominance took roughly eighteen months, with no golden age in between.

Lee Sedol retired from professional Go in 2019. His reason: the game had become "unwinnable" against AI.

Poker followed the same pattern. In January 2017, Carnegie Mellon's Libratus crushed four top professionals at heads-up no-limit Texas Hold'em. In July 2019, Pluribus beat professionals at six-player no-limit Hold'em — a far more complex variant.9 In both cases, the AI went directly from "can't beat humans" to "crushes humans" with no intermediate phase where human+AI was the winning configuration. The centaur era didn't just end. It never started.

The pattern across games is clear. Chess, with its particular balance of strategic and tactical complexity, produced a golden age of about fifteen years. Go, with its deeper computational requirements, produced none. Poker, with its hidden information and probabilistic reasoning, produced none. The centaur model isn't a universal law of human-machine collaboration. It's a specific historical artifact of a specific moment when computers were strong enough to help but not strong enough to render help unnecessary.

Chess had a centaur era because the AI was good enough to help but not good enough to make help irrelevant. That's a window, not a destination.

The Software Centaur

Today's software developers are the chess players of 2005. The centaur metaphor has been applied explicitly and repeatedly to AI-assisted coding. Alves and Cipriano's 2023 paper "The Centaur Programmer" directly maps Kasparov's model to software development.10 Kasparov himself, writing in PC Gamer in 2023, warned: "You shouldn't be afraid of an AI taking your job. You should be far more worried about another human, using AI, because they might take your job."11 The Harvard Data Science Review published "Effective Generative AI: The Human-Algorithm Centaur" as a framework for the collaboration.12

And on the surface, the data looks like a golden age. 41% of code on GitHub is now AI-generated. Anthropic's 2026 report finds that developers integrate AI into 60% of their work. AI handles boilerplate, CRUD operations, API endpoints, test scaffolding, code migration, and documentation at speeds no human can match. 27% of AI-assisted work consists of tasks that would not have been done at all otherwise — scaling projects, building nice-to-have tools, exploratory development.13

Weak human plus strong machine plus good process, once again appearing to produce superhuman output. Cramton and Stephen all over again, but with VS Code instead of ChessBase.

But the scoreboard tells a different story.

The Scoreboard

In 2025, METR — a respected AI evaluation organization — conducted a rigorous study of experienced open-source developers using AI coding tools. The developers believed they were 20% faster with AI assistance. Measured performance showed they were actually 19% slower.14

Read that again. A nearly forty-point gap between perceived and actual performance. The centaurs thought they were galloping. They were walking backward.

METR's 2026 update deepened the finding. The original cohort still showed -18% speedup. A new cohort of developers showed -4%, with confidence intervals spanning -15% to +9% — statistically indistinguishable from zero benefit. And METR encountered a revealing methodological problem: developers increasingly refused to participate rather than work without AI tools. One developer explained: "I avoid issues like AI can finish in 2 hours, but I have to spend 20 hours." METR now describes their data as providing "very weak evidence" of actual productivity gains.14

The broader data is equally sobering. Veracode's 2025 security report found that AI-generated code contains 2.74 times more vulnerabilities than human-written code. 45% of AI code contains OWASP Top 10 vulnerabilities. Java code generated by AI has a 72% security failure rate. Models got better at writing functional code over time but showed no improvement in writing secure code.15

CodeRabbit's analysis of 470 real-world pull requests found that AI-generated PRs averaged 10.83 issues each versus 6.45 for human-written code — 1.7 times more problems. Logic and correctness errors were 75% higher. Performance inefficiencies were nearly 8 times more frequent. Meanwhile, PRs per author are up 20% year-over-year, but incidents per PR are up 23.5% and change failure rates have increased roughly 30%.15

More output. More bugs. More security holes. Faster production, slower verification, same number of humans trying to hold it all together. This is not a golden age scoreboard. This is the scoreboard of a centaur team that has started making the engine play worse.

When Humans Add Negative Value

Jakob Nielsen — the usability researcher, not the chess player — published a framework in 2025 that cuts to the heart of the centaur problem. He calls it the "Trough of Mediocrity": the period when AI crosses human-average performance, and most humans begin to degrade AI output by interfering with it.7

The medical data makes the case vividly. In a Stanford study published in JAMA, ChatGPT alone achieved 90% median diagnostic accuracy. Physicians using ChatGPT scored 76%. Physicians without ChatGPT scored 74%. The AI made doctors negligibly better than working alone — and far worse than the AI working alone.7 In a University of Virginia study, AI alone hit 92% accuracy on medical diagnoses. Doctors with ChatGPT: 76%. The doctors' best move, statistically, was to simply accept the AI's diagnosis without modification.

A 2024 meta-analysis in Nature Human Behaviour reviewed 106 experimental studies of human-AI collaboration — 370 effect sizes spanning January 2020 through June 2023. The core finding: on average, human-AI combinations performed significantly worse than the best of humans or AI alone. MIT professor Thomas Malone, one of the researchers, admitted the result surprised him: "We expected the combination would be quite a bit better, but it was statistically significantly worse."16

The meta-analysis did find a crucial nuance. When humans already outperformed AI at the base task — bird species classification, for example, where humans scored 81% and AI 73% — collaboration genuinely helped, pushing accuracy to 90%. But when AI already outperformed humans, collaboration produced losses. In fake review detection, AI alone scored 73%; human+AI scored 69%; humans alone scored 55%.

This is the chess trajectory in miniature, compressed into a single study. The centaur model works when the human is genuinely better at part of the task. It fails when the AI is better at all of it. And the boundary between those two states is constantly moving in one direction.

"We expected the combination would be quite a bit better, but it was statistically significantly worse."
— Thomas Malone, MIT, on human-AI team performance

The Jagged Frontier

If the centaur model works in some domains and fails in others, the obvious solution is to use AI only where it's stronger and human judgment only where it's stronger. The problem, as a 2023 Harvard Business School study demonstrated with painful clarity, is that humans cannot tell the difference.

Researchers gave 758 BCG management consultants a set of tasks. Some tasks fell inside GPT-4's capability frontier; others fell outside it. For tasks inside the frontier, consultants using AI completed 12.2% more tasks, 25.1% faster, at 40% higher quality. For tasks outside the frontier, consultants using AI were 19% less likely to produce correct solutions than those working without AI at all.17

The frontier is "jagged" — AI capability is not uniform. Adjacent tasks of similar apparent difficulty can fall on either side. And the consultants couldn't tell which was which. They applied AI indiscriminately, reaping benefits on one task and degrading performance on the next, with no ability to predict which outcome they'd get.

For the centaur model, this is devastating. Kasparov's framework assumed the human half could identify where machine judgment was reliable and where it wasn't — that's the entire job of the human in the centaur. The BCG study suggests humans are systematically terrible at this assessment. The centaur doesn't just fail when the AI gets too strong. It fails when the human can't tell where the AI is strong and where it's weak. And the frontier is jagged in software too: AI writes excellent boilerplate but introduces subtle concurrency bugs; generates clean API code but misses business logic edge cases; produces working tests that test their own assumptions rather than the system's actual behavior.

The Centaur That Stops Pedaling

Kasparov's model assumed something else that turns out not to be true: that the human half of the centaur would stay engaged. The research says otherwise.

A 2025 study on AI teammates found that even when coordination demands are minimal and humans appropriately calibrate trust, performance gaps persist in human-AI teams. The mechanism is social loafing — people exert less effort when collaborating with AI and cede responsibility to the machine. The effect is amplified because, unlike a human teammate, an AI doesn't judge you for slacking.16

An MIT Media Lab EEG study in 2025 measured brain activity during essay writing. Participants working alone showed the strongest, most distributed neural connectivity. Those using search engines showed moderate engagement. Those using LLMs showed the weakest connectivity — and over four months of the study, LLM users consistently underperformed at neural, linguistic, and behavioral levels. They couldn't accurately quote their own work. The researchers introduced the concept of "cognitive debt": each time you delegate thinking to AI, you accrue a small deficit in the neural pathways that would have been exercised. Individually trivial. Cumulatively significant.18

Addy Osmani, a Chrome engineering leader at Google, documented the software-specific version. A developer with twelve years of experience described becoming "worse at his own craft" after heavy AI tool adoption: stopped reading documentation, saw debugging skills decline, lost deep comprehension of systems he maintained. His self-assessment: "We're not becoming 10x developers with AI — we're becoming 10x dependent on AI."19

Lisanne Bainbridge predicted this in 1983, in a paper called "Ironies of Automation" that now has over 4,700 citations. Her core argument: automation degrades the very skills operators need when the system fails. "The more reliable the automation, the less the human operator may be able to contribute to that success." Remove humans from routine work, and you create demand for higher-order human skills to handle failures — but those skills atrophy from disuse.20

The canonical illustration is Air France Flight 447. On June 1, 2009, when pitot tubes iced over and the autopilot disengaged over the Atlantic, the pilots were "completely surprised" and "never able to comprehend" the situation. They had spent hundreds of hours flying without touching manual controls. When they needed basic flying skills in an emergency, those skills were rusty. Two hundred and twenty-eight people died because the humans whose job was to take over from automation couldn't do the job that automation had taken away from them.20

The centaur's human half isn't just failing to add value. It's actively losing the capacity to add value. Cognitive debt accumulates. Skills atrophy. The human half gets weaker with each session even as the machine half gets stronger. The golden age has a built-in expiration mechanism, and it runs on the same clock as the developer's declining ability to work without the tool.

The Taxonomy of Decline

Ethan Mollick and colleagues at Wharton studied 244 BCG consultants using generative AI and identified three distinct collaboration modes: Centaurs, who maintain clear division between human and machine tasks; Cyborgs, who deeply integrate AI at every level with no clear boundary; and Self-Automators, who fully delegate to AI and rubber-stamp the output.21

The taxonomy describes a trajectory, not a choice. Centaur requires sustained strategic oversight — the human actively deciding what the AI should and shouldn't do. Cyborg requires deep integration skill — the human interweaving judgment with AI output at the sentence or line level. Self-Automation is where you end up when cognitive debt accumulates and the effort of genuine oversight exceeds the perceived benefit.

Centaur. Cyborg. Self-Automator. It maps exactly to the chess history. Human oversight. Human collaboration. Human irrelevance.

Jeremy Howard of fast.ai gave the endpoint a name in early 2026: "Dark Flow." He argued that vibe coding — the practice of generating code by describing what you want and accepting what comes back — provides "a misleading feeling of agency." If the human half of the centaur has illusory agency, it isn't really a centaur. It's a rubber stamp with imposter syndrome.22

And Tyler Cowen, the economist who called the end of centaur chess in 2013, observed in 2024 that centaur chess didn't just end — it evolved. The centaur still exists, but both halves are now machines. Today's chess "centaur" is a program that chooses which engine to deploy in which position — Stockfish for tactical positions, AlphaZero-style networks for strategic ones. The entity making the choice is software, not a person.5

The parallel for software development is uncomfortable in its clarity. Today's centaur developer chooses when to use Claude, when to use GPT, when to use Gemini, when to write by hand. Tomorrow, an orchestrator agent makes those choices. The human's last remaining contribution — choosing and directing the AI — gets automated too.
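The orchestration step that paragraph describes is mechanically trivial — which is exactly the point. A hypothetical router, with the model names and task categories invented for illustration:

```python
# Hypothetical orchestrator: route each task to a model (or a human) by
# task category. Names and categories are invented for illustration —
# the point is that the routing decision is ordinary code, and the
# human-held slots are just dictionary entries waiting to be reassigned.

ROUTES = {
    "boilerplate":  "model-a",
    "test-scaffold": "model-a",
    "refactor":     "model-b",
    "architecture": "human",   # the shrinking remainder
}

def route(task_category):
    # Unknown categories fall through to the human — for now.
    return ROUTES.get(task_category, "human")
```

Nothing about this table requires a person to maintain it. The chess world already replaced its human with a classifier that picks the engine; the software version is one config change away.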

The centaur's human half isn't just failing to add value. It's actively losing the capacity to add value.

The Case for the Defense

Before we declare the golden age over, the defense gets to speak. And the defense has real arguments.

Software development is not chess. Chess has roughly 10^44 legal positions. It's vast but bounded. Software development is open-ended — infinite state space, unbounded requirements, context that spans organizations, markets, and human needs that change while you're building. The Nature meta-analysis found that creative and generative tasks still show positive synergy between humans and AI.16 Software, at its best, is a creative act.

Architecture remains human territory. Business context remains human territory. Accountability — deciding what should be built, taking responsibility when it fails, explaining to users why it broke — remains human territory. AI does not accept accountability. As one analysis put it: "Someone still has to be accountable for what ships. AI does not accept accountability."13

And the security data cuts both ways. Yes, AI-generated code has 2.74 times more vulnerabilities — which means human review catches things AI cannot. 71% of developers won't merge AI code without manual review. That's not overhead. That's the centaur working as designed: AI generates, human validates, the combination is better than either alone.

MIT Sloan's 2025 research found that human-AI teams excel when each does what it does best — not when both do the same thing simultaneously. The worst configuration is human and AI evaluating the same input (a radiologist and an AI both reading the same X-ray). The best configuration is division of labor based on comparative advantage: AI handles volume, human handles exceptions.23 If the software centaur can maintain that division — AI writes the boilerplate, human architects the system, AI tests the happy path, human catches the edge cases — the model could work for a long time.
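That division of labor has a simple mechanical form: a confidence-threshold triage, where the model clears the volume and only exceptions reach a human queue. A minimal sketch, with `model` as a stand-in that returns an answer and a confidence score (both invented for illustration):

```python
# Sketch of the comparative-advantage configuration: the model handles
# high-confidence items in volume; only exceptions escalate to a human.
# `model` is a stand-in callable: item -> (answer, confidence in [0, 1]).

CONFIDENCE_FLOOR = 0.9

def triage(items, model):
    auto, escalated = [], []
    for item in items:
        answer, confidence = model(item)
        if confidence >= CONFIDENCE_FLOOR:
            auto.append((item, answer))   # AI handles volume
        else:
            escalated.append(item)        # human handles exceptions
    return auto, escalated
```

The configuration works only as long as the exceptions genuinely need a human — and as long as the confidence scores are honest about where the frontier is, which the BCG study suggests is the hard part.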

Nielsen himself drew a distinction that matters: human agency — deciding what tasks to pursue — remains valuable even as human judgment on execution declines.7 The question "what should we build?" is harder to automate than "how should we build it?" If the centaur model narrows to agency rather than execution, there may be a durable role for the human half.

The Clock

So how long do we have?

The chess centaur era lasted roughly fifteen years — from 1998, when Kasparov played the first Advanced Chess match, to around 2013, when engines definitively surpassed centaur teams. The golden age within that period was about seven or eight years. If we date the software centaur era from the release of GitHub Copilot in June 2021, we're about five years in.

Nathan Lubchenco, writing in June 2025, assessed that "the centaur age of AI will be highly unevenly distributed, but for many things it's already over."24 Steve Losh introduced the concept of the "Grandmaster Floor" in February 2026: the threshold below which human intervention yields negative returns. In modern centaur chess, that floor has risen above every human player on earth. Losh warned this pattern is "coming for the job market" — and identified "engine industries" like data analysis, customer support, and logistics as already "rapidly phasing out the human element because human latency is a drag."25

A 2024 paper on modeling centaur synergy in chess found something remarkable: a neural network trained via reinforcement learning outperformed human experts at recognizing when to defer to human versus machine judgment. You need AI to figure out when humans are useful. The centaur needs a third entity — another AI — to manage the human-AI boundary.26

In a GitHub field study, half of twenty-two developers predicted AI would write roughly 90% of code within two years; the rest predicted within five. Dario Amodei, CEO of Anthropic, predicted in early 2025 that AI would write "90-100% of all code" within three to six months. That prediction did not materialize.13 The timeline predictions are unreliable. But the direction of travel is not.

Armin Ronacher, creator of the Flask web framework, reported in late 2025 that 90% of his code was now AI-generated. But he also admitted to building "a ton of tools I did not end up using much."22 The centaur is prolific. Whether it is productive — in the sense of producing value rather than volume — is a different question.

And the METR study undermines all of it. If the centaur doesn't actually ride faster but feels like it does, we aren't measuring productivity. We're measuring vibes. The golden age may be an illusion sustained by a forty-point perception gap between how fast developers think they're going and how fast they're actually going.

What Happens to the Humans

In 2023, Kasparov still believed in the centaur. "AI presents us with a new, more capable bicycle," he wrote. "However, the human still has to pedal and steer the bike."11

The problem with the bicycle metaphor is that bicycles eventually became motorcycles. The human stopped pedaling. The engine took over. And nobody asks the human to pedal a motorcycle — the machine does the work, the human just points it in a direction and tries not to crash.

For software developers, the near future probably looks less like retirement and more like role transformation. The centaur model narrows. The human contribution shifts from execution to oversight, from building to specifying, from writing code to defining requirements precisely enough that an AI can write it. The Centaur Programmer becomes the Centaur Architect becomes the Centaur Product Manager becomes — what, exactly?

The BCG jagged frontier study suggests the answer is: someone who can identify where the AI is reliable and where it isn't, then intervene accordingly. But the same study showed humans are bad at exactly that assessment. And the 2024 arXiv paper on centaur synergy showed an AI can learn to make that assessment better than a human can. So the human's role narrows to managing a boundary that a machine is better at managing. Which narrows further. Which narrows further.

Bainbridge's irony returns one final time. The developers who relied most heavily on AI — who accumulated the most cognitive debt, whose manual skills atrophied the fastest — will be the least equipped for the moment when the centaur model breaks down and someone needs to understand the system at a fundamental level. The 12-year veteran who stopped reading documentation. The team that lost the ability to debug without AI assistance. The organization that laid off the senior engineers whose judgment was the only thing keeping the AI's output safe.

We are living in the centaur's golden age for software development. The evidence is strong that it is a real golden age — the combination of human and AI genuinely produces things neither could produce alone, at least in the domains where human judgment still exceeds AI capability. The evidence is equally strong that this golden age is transient, structurally fragile, and possibly shorter than we think.

The chess centaur era ended not because anyone decided it should, but because the machines got good enough that the humans couldn't keep up. The same forces — AI capability improvement, human skill atrophy, cognitive debt, social loafing, the jagged frontier — are all measurable right now in software development. The question isn't whether the centaur era will end. The chess precedent, the Go precedent, the poker precedent, the medical diagnosis precedent, the 106-study meta-analysis — they all point the same direction.

The question is what the developers who defined themselves by writing code will become when the machine no longer needs them to pedal.

Disclosure

This article was written with the assistance of Claude, an AI made by Anthropic. The author is aware that using a centaur workflow to write an article about the centaur problem's inevitable collapse is the kind of irony that practically writes itself — which, in this case, it partly did. All research was independently verified, all arguments were editorially directed by a human, and the centaur produced this piece faster than either half could have alone. For now. Corrections and perspectives welcome at bustah_oa@sloppish.com.

Citations

  1. Garry Kasparov, Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins, PublicAffairs, 2017. History of Advanced Chess and the León 1998 match.
  2. Garry Kasparov, Deep Thinking, 2017. "Technology can make us more human by freeing us to be more creative."
  3. ChessBase News, "Dark horse ZackS wins Freestyle Chess Tournament," 2005. Link. Cramton (USCF 1685) and Stephen (USCF 1398) defeated grandmaster teams and Hydra.
  4. Garry Kasparov, "The Chess Master and the Computer," The New York Review of Books, February 11, 2010. Source of the "weak human + machine + better process" quote.
  5. Tyler Cowen, "Centaur chess is now run by computers," Marginal Revolution, February 2024. Link. Also references Cowen's 2013 assessment that centaur teams no longer held significant advantage.
  6. David Silver et al., "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play," Science, December 2018. AlphaZero taught itself chess in 9 hours and beat Stockfish 28-0-72.
  7. Jakob Nielsen, "When Humans Add Negative Value," 2025. Link. Medical diagnosis data, chess Elo degradation from human intervention, and the agency vs. judgment distinction.
  8. DeepMind, AlphaGo documentation. Lee Sedol match (March 2016, 4-1), Ke Jie match (May 2017, 3-0), and five-champion team match. Lee Sedol retired 2019.
  9. Noam Brown and Tuomas Sandholm, "Superhuman AI for multiplayer poker," Science, 2019. Libratus (2017) and Pluribus (2019) results.
  10. Nuno Alves and Nuno Cipriano, "The Centaur Programmer," Lusofona University, 2023. arXiv.
  11. Garry Kasparov, "The Real Threat From ChatGPT Isn't AI... It's Centaurs," PC Gamer, February 13, 2023. Link. Bicycle metaphor and centaur warning.
  12. Harvard Data Science Review, "Effective Generative AI: The Human-Algorithm Centaur." Link.
  13. Anthropic, 2026 Agentic Coding Trends Report. Link. 60% AI integration rate, 27% novel tasks, developer survey data. Amodei prediction timeline from Futurism.
  14. METR, "Early 2025 AI-Experienced OS Dev Study," July 2025, and "Uplift Update," February 2026. Original | Update. 19% slower with AI, 20% perceived faster. Recruitment collapse noted in update.
  15. Veracode, 2025 GenAI Code Security Report. Link. 2.74x more vulnerabilities, 45% OWASP Top 10 failure rate. CodeRabbit PR analysis (470 PRs, 10.83 vs 6.45 issues) and GitHub aggregate data (PRs up 20%, incidents up 23.5%).
  16. "When combinations of humans and AI are useful," Nature Human Behaviour, 2024. Link. Meta-analysis of 106 studies, 370 effect sizes. Also: Social loafing in AI teams, 2025.
  17. Harvard Business School / BCG, "Navigating the Jagged Technological Frontier," 2023. SSRN. 758 consultants, 19% worse on out-of-frontier tasks.
  18. MIT Media Lab, "Your Brain on ChatGPT," 2025. arXiv. EEG study showing weakest neural connectivity in LLM users and the cognitive debt concept.
  19. Addy Osmani, "Avoiding Skill Atrophy in the Age of AI," 2025. Link. 12-year developer self-assessment and "AI hygiene" recommendations.
  20. Lisanne Bainbridge, "Ironies of Automation," Automatica, 19(6), 1983, pp. 775-779. 4,700+ citations. Air France Flight 447 (2009): IEEE Spectrum, HBR.
  21. Ethan Mollick et al., "Cyborgs, Centaurs and Self-Automators," December 2025. SSRN. Taxonomy of 244 BCG consultants' AI collaboration modes.
  22. Jeremy Howard / fast.ai, "Dark Flow," January 2026. Link. Armin Ronacher / Flask, "A Year of Vibes," December 2025. Link.
  23. MIT Sloan, "When Humans and AI Work Best Together — and When Each Is Better Alone," 2025. Link.
  24. Nathan Lubchenco, "We're in the Centaur Era of AI," June 2025. Link.
  25. Steve Losh, "The Centaur's Dilemma: What Chess Teaches Us About the AI Era," February 2026. Link. "Grandmaster Floor" concept.
  26. "Modeling the Centaur: Human-Machine Synergy in Sequential Decision Making," December 2024. arXiv. RL network outperformed human expert at identifying when to defer to human vs. machine.