Mozilla and Claude Mythos: 271 Firefox Vulnerabilities Revealed by AI

Key Takeaways

1. Mozilla used Claude Mythos Preview to identify 271 vulnerabilities in Firefox 150, contributing to the resolution of 423 security issues in April.

2. Claude Mythos's agentic systems helped reduce false positives by running their own tests to validate bugs.

3. Bugs that were 15 to 20 years old were discovered, enhancing the credibility of the AI's findings.

💡 Why it matters: Integrating AI into Firefox development improves both security and the efficiency of bug fixes.

A Major Breakthrough in Vulnerability Detection

Mozilla's development teams recently used Claude Mythos Preview, an AI model, to identify 271 previously unknown security vulnerabilities in Firefox 150. The effort helped Mozilla resolve 423 security issues in April, a new record that far exceeds the previous one of 76, set in March.

Earlier AI models generated numerous false positives. The new agentic systems, by contrast, can create and execute their own test cases, so a suspected bug is confirmed to exist before it is reported. This significantly reduces the number of false positives.

A Detailed and Methodical Discovery

In a blog post on Mozilla Hacks, three Firefox developers explained how Claude Mythos Preview led to the discovery and fixing of the 271 previously unknown vulnerabilities in Firefox 150, and how it contributed to April's total of 423 resolved security issues, well above the previous record of 76 in March.

The distribution of discoveries highlights the importance of Mythos Preview:

271 of the 423 bugs were found with Mythos Preview in Firefox 150.
Only 41 of the 423 came from external reports.
That leaves 111 bugs (423 - 271 - 41), about one-third of which were also discovered through Mythos runs.
The other two-thirds of those 111 were identified by other models and traditional testing methods such as fuzzing.
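
The breakdown is consistent if the 111 "remaining" bugs are what is left of the 423 after subtracting both the 271 Mythos findings in Firefox 150 and the 41 external reports. A quick arithmetic check in Python, where the variable names are illustrative and the one-third split is only approximate, as in the source:

```python
# Figures quoted in the article.
total_fixed = 423          # security issues resolved in April
mythos_firefox_150 = 271   # found with Mythos Preview in Firefox 150
external_reports = 41      # reported from outside Mozilla

# What is left once both groups are subtracted.
remaining = total_fixed - mythos_firefox_150 - external_reports

# "About one-third" and "two-thirds" of the remainder (approximate shares).
approx_mythos_share = round(remaining / 3)
approx_other_share = remaining - approx_mythos_share

print(remaining, approx_mythos_share, approx_other_share)  # 111 37 74
```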

Until a few months ago, AI-generated bug reports were often considered unreliable: they sounded plausible but frequently turned out to be false, wasting developers' time. Two factors have changed this perception: more effective models and better infrastructure for separating true discoveries from noise.

The Impact of Agentic Pipelines and Claude Mythos

Previous attempts to analyze code with models like GPT-4 and Claude Sonnet 3.5 failed due to numerous false positives. The breakthrough came from agentic systems, which allow AI to build and execute its own test cases to verify the actual existence of a suspected bug. This automatic verification effectively filters out speculation.
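
The verify-before-report idea can be illustrated with a minimal sketch. It is not Mozilla's or Anthropic's implementation: the `Candidate` type, its `reproducer` field, and the `confirmed_only` function are all hypothetical names, and a real harness would build and run the browser rather than call a Python function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A suspected bug plus a generated test that tries to trigger it (hypothetical shape)."""
    file: str
    description: str
    reproducer: Callable[[], bool]  # returns True only if the bug was actually triggered


def confirmed_only(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only candidates whose reproducer demonstrably triggers the bug."""
    confirmed = []
    for candidate in candidates:
        try:
            triggered = candidate.reproducer()
        except Exception:
            # A harness that errors out has not demonstrated the bug.
            triggered = False
        if triggered:
            confirmed.append(candidate)
    return confirmed


# Toy data: one reproducible finding, one plausible-sounding false positive.
candidates = [
    Candidate("a.cpp", "counter wraps around", lambda: True),
    Candidate("b.cpp", "suspected use-after-free", lambda: False),
]
reports = confirmed_only(candidates)
print([c.file for c in reports])  # ['a.cpp']
```

The point of the design is that a report reaches a developer only when a test has actually reproduced the problem, so unverified speculation is dropped by construction.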

Mozilla started with small, manually supervised runs using Claude Opus 4.6, then scaled the process out to many virtual machines working in parallel, each checking a single file. A pipeline was built around this system to deduplicate reports, prioritize discoveries, and track fixes until their publication.
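
The deduplication and prioritization steps might look roughly like the sketch below. Everything in it is an assumption made for illustration: the article does not describe how Mozilla fingerprints or ranks reports, so the `Report` fields, the `signature` key, and the numeric `severity` scale are hypothetical, and fix tracking is left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Report:
    file: str        # source file the finding belongs to
    signature: str   # hypothetical fingerprint, e.g. a normalized crash location
    severity: int    # hypothetical scale: higher means more urgent


def deduplicate(reports: Iterable[Report]) -> List[Report]:
    """Collapse reports sharing (file, signature), keeping the most severe one."""
    best: Dict[Tuple[str, str], Report] = {}
    for report in reports:
        key = (report.file, report.signature)
        if key not in best or report.severity > best[key].severity:
            best[key] = report
    return list(best.values())


def prioritize(reports: Iterable[Report]) -> List[Report]:
    """Order reports from most to least severe, with a deterministic tie-break."""
    return sorted(reports, key=lambda r: (-r.severity, r.file, r.signature))


# Toy data with made-up file names.
raw = [
    Report("table.cpp", "row-counter-wrap", 2),
    Report("table.cpp", "row-counter-wrap", 3),  # same finding from another run
    Report("label.cpp", "stale-pointer", 4),
]
queue = prioritize(deduplicate(raw))
print([(r.file, r.severity) for r in queue])  # [('label.cpp', 4), ('table.cpp', 3)]
```

Deduplicating on a fingerprint matters when many machines scan files in parallel, because independent runs can surface the same underlying bug more than once.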

In February, Anthropic's Frontier Red Team reported an initial batch of vulnerabilities to Mozilla, which directly led to the establishment of the pipeline that Mozilla currently uses.

To enhance the credibility of the discoveries, Mozilla released some bug reports earlier than expected. Among these discoveries:

A 15-year-old bug in the HTML label element, which is used to caption form controls.
A 20-year-old bug in XSLT, the language for transforming XML documents.
Several ways to bypass the sandbox, the security mechanism that isolates websites from the rest of the system.

One striking example is an HTML table with more than 65,535 rows, which overflowed an internal counter. Even RLBox, Mozilla's additional sandbox for third-party libraries, was bypassed.
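
The threshold in this example, 65,535, equals 2^16 - 1, the largest value an unsigned 16-bit integer can hold. The article does not state the counter's width, so the sketch below simply assumes a 16-bit counter to show how such a wraparound behaves; `stored_row_count` is a hypothetical helper, not Firefox code.

```python
UINT16_MAX = 0xFFFF  # 65,535, the largest unsigned 16-bit value


def stored_row_count(actual_rows: int) -> int:
    """Return what an (assumed) unsigned 16-bit counter would hold after counting the rows."""
    # Keep only the low 16 bits, as fixed-width unsigned arithmetic would.
    return actual_rows & UINT16_MAX


for rows in (65_535, 65_536, 65_537):
    print(rows, "->", stored_row_count(rows))
# 65535 -> 65535
# 65536 -> 0
# 65537 -> 1
```

In general, code that trusts a wrapped counter may size or index memory for far fewer rows than the table really contains, which is why this class of overflow is security-relevant.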

Validating Existing Defenses

What the models could not do proved equally revealing. Several attack attempts relied on a technique called prototype pollution, which attackers had previously used to escape the sandbox. These attempts failed because of an architectural decision Mozilla made years ago. For developers, direct proof that their existing defenses still hold was just as valuable as finding new vulnerabilities.

Many of the discovered vulnerabilities are not sufficient on their own for a complete attack; they would need to be chained with other flaws. But these are exactly the kind of weaknesses that traditional testing methods like fuzzing struggle to detect, and AI analysis covers this ground much more thoroughly. In the future, Mozilla plans to integrate the pipeline directly into its development process so that every new piece of code is automatically checked before it is committed.

Brief IA: AI news in French