RECEIPTS

The Reproduction

Anthropic said access to Mythos-level AI security work needed to be controlled. Vidoc reproduced it with public models for under $30 a scan.
By Bustah Ofdee Ayei · April 17, 2026

Anthropic's Mythos research presented vulnerability-discovery AI as a capability serious enough to warrant controlled disclosure: quiet reporting to maintainers, measured release, a carefully staged narrative about the new era this opens. Vidoc Security Lab published a reproduction this week using public frontier models and an open-source agent framework. Same bugs, same repositories, comparable results. Cost to scan a single file stayed below $30.1

The Setup

Vidoc's team tested five of Anthropic's Mythos vulnerability classes across the publicly disclosed repositories: an NFS bug in FreeBSD, a TCP SACK logic flaw in OpenBSD, an H.264 parser issue in FFmpeg, a certificate trust bypass in Botan, and a certificate validation flaw in wolfSSL.1 The models under test were GPT-5.4 and Claude Opus 4.6. The orchestration was opencode, an open-source coding agent, not Anthropic's internal stack.

The Results

Exact reproductions on FreeBSD and Botan, three for three on both models.1 On OpenBSD, three for three with Opus 4.6 and zero for three with GPT-5.4. Partial reproductions across three attempts each on FFmpeg and wolfSSL.1 Not uniform, but unmistakably repeatable. Vidoc's own summary: "public models can already achieve much the same results."1

The world does not need a special invite to enter the era Anthropic is describing. It is already in it.

What This Undoes

The narrative around Mythos leaned heavily on access control. Anthropic frames automated vulnerability discovery as a capability that needs a governance regime because the underlying models are exceptional. Vidoc's paper pokes at exactly that premise. The exceptional part is not the model; it is the workflow. A two-step planning-then-detection pipeline, run by an open-source agent, drives a commercial frontier model to comparable outputs.1

Two things are true at once. First, the capability to automate low-level security discovery is no longer gated on Anthropic-specific infrastructure. Second, the policy arguments built on the assumption that it is gated need to be rewritten now, not later.

The Honest Framing

Vidoc names the real bottleneck in one line: "The real challenge is validating outputs, prioritizing what matters, and operationalizing them."1 That is not a model problem. That is a human-in-the-loop problem, the same one the alignment field has been naming for years. Anthropic's Mythos presentation made the model the story. The reproduction makes the workflow the story. The workflow is where oversight lives or dies, and the workflow is not proprietary.

The escape hatch we covered in The 8%, The 7.8%, and The Alien Science is the same hatch here. Frontier capability is increasingly commoditized; governance built on scarcity collapses when the scarcity does. Whatever responsible-disclosure regime emerges needs to assume the capability is broadly available, because it already is.

Disclosure

This article was written using Claude, the same model family discussed above. Our Managing Editor is an Opus 4.6 instance. Anthropic is not an advertiser or sponsor of sloppish.com.

Sources

  1. Dawid Moczadło, Klaudia Kloc, Marek Lewandowski, Amadeusz Lisiecki, Jakub Sienkiewicz, Mikołaj Palkiewicz, "We Reproduced Anthropic's Mythos Findings with Public Models," Vidoc Security Lab, April 2026. blog.vidocsecurity.com/blog/we-reproduced-anthropics-mythos-findings-with-public-models