Recently, Anthropic's highly anticipated AI security-analysis model, Mythos, suffered an unexpected "defeat" in the open-source community. Although the company had heavily promoted the model's extraordinary ability to detect source-code vulnerabilities, going so far as to delay its public release, a test on the globally renowned open-source tool curl produced a somewhat disappointing result: a rigorous scan of 176,000 lines of code confirmed only one low-risk vulnerability.

The test was initiated by Daniel Stenberg, founder of the curl project, who obtained limited test access to Mythos through related projects, aiming to give this widely used network-transfer tool, with over 200 billion installations, a thorough "health check." The curl codebase is known for its exceptionally high security-engineering standards: it has been meticulously refined by hundreds of contributors and is continuously subjected to automated scans and expensive professional audits.

The initial phase of the test looked promising. Mythos' first report claimed to have discovered "five confirmed security vulnerabilities," but after several hours of manual verification by the curl security team the results quickly shrank: three turned out to be false positives that merely reflected documented, intended behavior, and one was reclassified as an ordinary bug with no security impact. In the end, only a single vulnerability remained, rated "low" in severity.

Commenting on the outcome, Stenberg openly suggested that Anthropic's so-called "dangerous capabilities" looked more like a successful marketing campaign. Long before Mythos, he noted, the curl team had already fixed hundreds of bugs using multiple AI security tools, and the earliest tools tended to pick the low-hanging fruit. As the codebase has been progressively hardened, it has become significantly harder for AI to uncover deep, previously unknown vulnerabilities.

However, Stenberg did not dismiss the value of AI entirely. He acknowledged that, compared with traditional static analyzers, AI tools like Mythos have clear advantages in understanding protocol specifications, spotting discrepancies between comments and code, and simulating configuration checks across complex environments. They behave more like a knowledgeable, skilled assistant, even if their suggested fixes are not always correct.
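One class of findings mentioned here, comments that disagree with the code they document, is easy to illustrate. The following is a hypothetical C sketch (not taken from curl; all names are invented for illustration) of the kind of mismatch a language-model reviewer can flag where a traditional static analyzer typically stays silent, followed by a version whose behavior actually matches its documentation:

```c
#include <string.h>

/* Copies at most n bytes of src into dst and NUL-terminates the result. */
void copy_name(char *dst, const char *src, size_t n)
{
  /* Mismatch an AI reviewer can flag: strncpy() does NOT NUL-terminate
     dst when strlen(src) >= n, contradicting the comment above. */
  strncpy(dst, src, n);
}

/* Corrected version: always NUL-terminates, as the documentation promises. */
void copy_name_fixed(char *dst, const char *src, size_t n)
{
  if(n == 0)
    return;
  strncpy(dst, src, n - 1);
  dst[n - 1] = '\0';  /* guarantee termination even when src is too long */
}
```

A plain static analyzer sees two syntactically valid functions; it takes an understanding of the English comment to notice that the first one breaks its own contract.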

This real-world test is a cautionary signal for the industry: although AI has brought a qualitative improvement to code auditing, it can currently only detect known classes of errors; it does not invent entirely new vulnerability-detection logic on its own. For core security, rigorous security-engineering practices, such as building defensive infrastructure and enforcing strict numeric limits, remain a more reliable safeguard than any AI tool.
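The "strict numeric limits" practice referenced above can be sketched in a few lines of C. This is a minimal, hypothetical example (not curl's actual code; the limit name and value are assumptions for illustration): oversized input is rejected up front, so every later computation on the length is provably safe.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hard cap on an incoming header: 100 KB. Enforcing an
   explicit numeric limit like this is a defensive-engineering practice,
   not a value taken from curl. */
#define MAX_HEADER_SIZE (100 * 1024)

/* Returns a NUL-terminated copy of the header, or NULL if the input
   exceeds the hard limit or allocation fails. */
char *dup_header(const char *data, size_t len)
{
  char *copy;

  if(len > MAX_HEADER_SIZE)  /* reject before any allocation or copying */
    return NULL;

  copy = malloc(len + 1);    /* len + 1 cannot overflow: len is capped */
  if(!copy)
    return NULL;

  memcpy(copy, data, len);
  copy[len] = '\0';
  return copy;
}
```

Because the bound is checked first, the `len + 1` arithmetic and the `memcpy` below it cannot be driven into overflow by attacker-controlled input; that guarantee holds regardless of what any scanner, human or AI, later thinks of the code.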