A medical professional in Switzerland used an AI coding agent to build a patient management system. It worked. Patients were booked, records were stored, voice notes were transcribed. Then a security researcher took a look. All patient data was unencrypted and exposed on the open internet. Zero access controls. Audio recordings were being sent to external AI APIs without patient consent. The researcher gained full read and write access to every patient record in thirty minutes.11 The "fix" was also AI-generated. It did not fix the problem.
Nobody read the code. That was the point. The AI wrote it, the app worked, and comprehension was never part of the process. This is not a cautionary tale about one careless user. This is the new default.
AI-generated code is now running in enterprise production at scale, and surveys consistently show most organizations cannot fully account for where it lives.9 This is not tech debt. Tech debt implies someone once understood the code and chose the shortcut anyway. This is something different. This is code that was never understood by anyone, at any point, in any capacity. And it is accumulating at industrial scale.
The Term
The industry has started calling the process "the dark factory": fully automated pipelines where AI writes, tests, and deploys code without human intervention. But the process has a name. The artifact does not.
Call it dark code.
Dark code is AI-generated code running in production that exists in a comprehension vacuum. The engineer who prompted the AI does not fully understand what it generated. The reviewer who approved it did not have time to trace every path. The next engineer who touches it starts from zero. Nobody holds a mental model of what the code does or why it does it that way.
This is not the same as bad code. Bad code can be understood and fixed. It is not the same as legacy code. Legacy code was understood once, by someone, even if that person left the company. Dark code was never comprehended. The process that created it did not require comprehension. The process that shipped it did not verify comprehension. The code works. It passes tests. It runs. And no human being on Earth can explain why it makes the specific decisions it makes.
Dark code is a choice nobody made.
The Numbers
The evidence that dark code is accumulating is not speculative. It is measurable.
84% of developers now use AI coding tools. Only 29% trust what they ship.1 That is a 55-point gap between adoption and confidence. Engineers are using the tools. They are shipping the output. They do not trust the output. They ship it anyway.
Veracode's analysis of 4 million code scans found that AI-generated code contains vulnerabilities 45% of the time, roughly twice the rate of human-written code.2 AI Vyuh's analysis puts the figure even higher: 53% of AI-generated code ships with vulnerabilities.7
GitClear's research on code quality tells the structural story. Refactoring, the act of improving existing code, has dropped from 25% to under 10% of changed lines. Code cloning, copying and pasting existing patterns without understanding them, has risen from 8.3% to 12.3%.3 The codebase is getting bigger. It is not getting better. Teams are adding more code and understanding less of it.
The vulnerability trend is accelerating. In January 2026, 6 new CVEs were traced directly to AI-generated code. By March 2026, that number was 35.10 Nearly six times as many in two months. The dark code is not just accumulating. It is breaking.
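As arithmetic, the trend is stark. A back-of-envelope check in Python; the compounding interpretation is mine, not the source's:

```python
# CVEs traced to AI-generated code: 6 in January 2026, 35 in March 2026.
jan, mar = 6, 35

overall = mar / jan        # growth across the two-month span
monthly = overall ** 0.5   # implied month-over-month multiplier, if it compounds

print(f"{overall:.1f}x overall, about {monthly:.1f}x per month")
```

At that implied rate the count more than doubles each month. Whether the curve keeps compounding is unknown; the point is only that the direction is not flat.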
The Skill Erosion
The people who would catch dark code are losing the ability to do so.
Shen and Tamkin ran a randomized study on how AI tools affect developer skill formation. Developers who used AI assistance scored 17% lower on code comprehension. The largest gap was in debugging: AI-assisted developers scored 50% on debugging quizzes versus 67% for unassisted developers.4 The tool that writes the code is degrading the ability to understand the code.
METR studied experienced developers specifically. Not juniors. Not students. Developers with years of production experience. With AI tools, they were 19% slower. But they believed they were 24% faster.5
Read that again. A 43-point gap between perceived and actual performance. Experienced engineers thought they were getting a quarter faster. They were getting a fifth slower. They could not tell the difference.
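The gap is simple arithmetic, but writing it out makes the sign flip vivid. A minimal restatement of the METR numbers:

```python
# METR's measured vs. self-reported effect of AI tools on experienced developers.
measured_change = -0.19    # measured: 19% slower than working unassisted
perceived_change = +0.24   # self-reported: 24% faster

# The perception gap, in percentage points.
gap_points = (perceived_change - measured_change) * 100

print(f"measured {measured_change:+.0%}, perceived {perceived_change:+.0%}, "
      f"gap {gap_points:.0f} points")
```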
GitClear's analysis of "durable code," lines that survive 14 days without being reverted or rewritten, shows a 10% net productivity gain despite 90% tool adoption.3 Ninety percent adoption. Ten percent gain. The gap between those two numbers is the hidden cost. It is the time spent generating code that gets thrown away, reviewing code that should not have been written, and debugging code that nobody understands.
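GitClear's durable-code metric, the share of new lines that survive 14 days without being reverted or rewritten, can be approximated from version history. A minimal sketch in Python, assuming per-line added/removed timestamps have already been extracted from git; the sample data is illustrative:

```python
from datetime import datetime, timedelta

# One record per line of code: (added_at, removed_at or None if still present).
# In practice these timestamps would come from git history; this sample is made up.
history = [
    (datetime(2026, 3, 1), None),                   # still in the codebase
    (datetime(2026, 3, 1), datetime(2026, 3, 3)),   # rewritten after 2 days
    (datetime(2026, 3, 1), datetime(2026, 3, 20)),  # removed, but after 19 days
]

def durable_fraction(history, now, horizon=timedelta(days=14)):
    """Fraction of lines that survived `horizon` without being reverted or rewritten."""
    # Only lines old enough to have had a full `horizon` to survive are eligible.
    eligible = [(a, r) for a, r in history if now - a >= horizon]
    durable = sum(1 for a, r in eligible if r is None or r - a >= horizon)
    return durable / len(eligible)

print(durable_fraction(history, now=datetime(2026, 4, 1)))
```

Two of the three sample lines clear the 14-day bar, so the sketch prints 2/3. GitClear's real measurement runs at repository scale, but the definition is roughly this simple.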
The perception gap is the most dangerous finding in all of this research. Developers do not know they are getting worse. They feel faster. The metrics say otherwise. And dark code thrives in exactly this gap: where confidence is high and comprehension is low.
The gap is where dark code lives.
The Tribal Knowledge Problem
Meta recently published a detailed account of an internal engineering effort to map tribal knowledge across their codebase.6 They built over 50 AI agents to crawl 4,100+ files and document the unwritten rules, the design choices, the "why" behind the code.
What they found was revealing. Only 5% of those files had any documentation at all. The agents discovered 50+ "non-obvious patterns," design decisions that lived exclusively in engineers' heads. Without this context, AI agents burned 15 to 25 tool calls per task and produced what Meta called "subtly incorrect code."
This was code written by humans, where the knowledge existed somewhere, in someone's head, on some team. Meta spent massive engineering resources just to extract it.
Dark code multiplies this problem by an order of magnitude. With dark code, the knowledge was never in anyone's head to begin with. There is no senior engineer who remembers why that function handles edge cases the way it does. There is no Slack thread from 2024 explaining the tradeoff. There is no tribal knowledge to extract because no tribe ever held it. The AI generated the code. The human approved it. The reasoning, if it existed at all, lived in a context window that was garbage-collected seconds later.
When the next engineer touches that code, they start from zero. And when they use an AI tool to help them understand it, that tool will also start from zero, and may generate its own dark code on top of the existing dark code. Layer upon layer of code that nobody understands, each layer making the next layer harder to comprehend.
The Accountability Vacuum
When dark code fails in production, who is accountable?
The engineer who wrote the prompt? They did not write the code. The reviewer who approved a 500-line diff they did not fully read? They were reviewing 50 other PRs that week. The PM who set the velocity target that made thorough review impossible? The CTO who mandated "AI-first development"?
Survey after survey points the same direction: nearly every organization has AI-generated code in production, and only a small minority report real visibility into where it lives.9 The code is everywhere. The accountability is nowhere.
This is not a hypothetical governance problem. It is a live liability gap. When a vulnerability in dark code gets exploited, the incident response team will open the file, read the code, and not understand it. They will check the git blame and find a developer who prompted an AI to write it eight months ago and has since moved to another team. They will check the PR review and find a single "LGTM" from a reviewer who was processing 30 reviews that day. They will check the documentation and find nothing, because there is nothing to find.
The vulnerability existed from the moment the code was generated. Nobody knew because nobody understood the code. That is the accountability vacuum. Not malice. Not negligence in the traditional sense. Just a process that no longer requires anyone to understand what gets shipped.
What Dark Code Breaks
The instinct is to reach for tooling. Better observability. Better CI/CD pipelines. Better static analysis. More tests.
These help. They are not sufficient.
Observability tells you what dark code breaks. It does not tell you what dark code does. You can detect the failure. You cannot understand the code that failed. Better pipelines add layers of automated verification without solving the comprehension gap. The code still runs. Humans still do not understand it. The verification layer just means the code passed more automated checks before entering the comprehension vacuum.
The most tempting response is acceptance. AI-generated code is getting better. Models are improving. Maybe we do not need to understand every line. Maybe the role of the human shifts from comprehension to supervision. Maybe dark code is just how software gets built now.
This is exactly when it is most dangerous.
Because the skill erosion and the dark code accumulation form a feedback loop. The worse the skill erosion gets, the less anyone notices the dark code accumulating. Developers who cannot tell they are 19% slower certainly cannot tell that the codebase is becoming less comprehensible. The baseline shifts. What felt wrong last year feels normal this year. The dark code becomes the floor everyone stands on without knowing what holds it up.
And when something breaks, and it will break, the response will require exactly the comprehension skills that the process has been eroding. You will need someone who can read the code, understand it, reason about it, and fix it. You will need someone who has been practicing those skills. You will have fewer of those people every quarter.
The Debt You Cannot Pay
Tech debt accrues when you understand the shortcut and choose it anyway. You know the tradeoff. You accept the cost. You plan to pay it down later. The debt is legible. You can read it, estimate it, prioritize it.
Dark code accrues when you never understood it in the first place. There is no shortcut because there was no longer path. There is no tradeoff because nobody evaluated the alternatives. There is no plan to pay it down because nobody knows what is owed.
One you can pay down. The other you cannot, because you do not know what you owe.
93% of enterprises have AI code in production. 81% do not know where. 53% of it ships with vulnerabilities. 84% of the developers writing it do not trust it. The developers who would catch the problems are measurably losing the skills to do so, and they cannot tell.
The dark code is accumulating. Nobody is tracking it. The first step is having a name for it.
Now it has one.
Disclosure
This article was written with the assistance of Claude, an AI made by Anthropic. We are aware of the irony of using AI to write about code that nobody understands. We reviewed every line. Whether that is sufficient is, in a sense, the entire point of the article. Corrections and reader perspectives welcome at bustah_oa@sloppish.com.
Sources
- Stackademic, developer survey on AI coding tool usage. 84% adoption rate, 29% trust in AI-generated output. Link.
- Veracode, State of Software Security 2025. Analysis of 4 million code scans; AI-generated code contains vulnerabilities 45% of the time, approximately 2x the rate of human-written code. Link.
- GitClear, "AI Coding Assistants and Code Quality" research series. Refactoring dropped from 25% to under 10% of changed lines; code cloning rose from 8.3% to 12.3%; 10% durable code productivity gain despite 90% adoption. Link.
- Shen, Q. and Tamkin, A., "How AI Impacts Skill Formation in Software Development," 2025. Randomized study: AI-assisted developers scored 17% lower on comprehension; debugging gap of 50% vs. 67%. arxiv.org/abs/2601.20245.
- METR, "Measuring the Impact of AI Coding Tools on Developer Productivity," 2025. Experienced developers 19% slower with AI tools; self-reported perception: 24% faster. Link.
- Meta Engineering Blog, "Mapping Tribal Knowledge with AI Agents," 2026. 50+ agents, 4,100+ files, only 5% documented; 50+ non-obvious patterns discovered; agents burned 15-25 tool calls per task without context. Link.
- AI Vyuh, code security analysis. 53% of AI-generated code ships with vulnerabilities. Link.
- Growexx enterprise survey on AI-generated code in production. 93% of enterprises have AI code in production; 81% lack visibility into where; 100% of CISOs surveyed confirmed AI code presence. Link.
- AI-generated code CVE tracking. 35 new CVEs in March 2026 traced directly to AI-generated code, up from 6 in January 2026. Link.
- Tobias Brunner, "An AI Vibe Coding Horror Story." Medical patient management system built with AI exposed all patient data unencrypted on the open internet. Full read/write access obtained in 30 minutes. Swiss privacy law violations likely. Link.
