Brief IA

Microsoft: 16 Windows Vulnerabilities Discovered Using MDASH AI

Models & LLM · Tom Levy

Key Takeaways
1. Microsoft has detected 16 Windows vulnerabilities, including 4 critical ones, using its MDASH AI system.
2. MDASH uses over 100 specialized AI agents and outscores single-model systems such as GPT-5.5 and Claude Mythos on the CyberGym benchmark.
3. The system reached 96% to 100% recall on historical, MSRC-confirmed vulnerabilities in two Windows components.
Why it matters: MDASH shows that a multi-agent architecture can be a credible alternative to relying on a single closed, expensive model.

Full Analysis

Microsoft recently unveiled an AI-based cybersecurity system named MDASH (Multi-Model Agentic Scanning Harness). The system has uncovered 16 vulnerabilities in Windows, four of which are classified as critical. Unlike approaches that rely on a single model, MDASH orchestrates over 100 specialized AI agents, combining models from different generations.

In April 2026, Anthropic launched Project Glasswing and its Claude Mythos model, designed for security flaw detection. Shortly after, OpenAI introduced Daybreak, a cyber initiative based on GPT-5.5. Both of these approaches rely on a single model, while Microsoft has opted for a different strategy. On May 12, Microsoft's agentic cybersecurity team announced the results of MDASH. This system utilizes over 100 specialized AI agents across a set of "frontier" models and distilled models, without depending on a single provider. As Taesoo Kim, Vice President of Agentic Security at Microsoft, explains: "The model is one input among others. The system is the product."

MDASH has been applied to the Windows network and authentication stack, one of the most monitored attack surfaces in the world. The result: 16 previously unknown vulnerabilities included in the May 2026 Patch Tuesday, with 10 in kernel mode and 6 in user mode. The majority of these flaws are exploitable from the network, without prior authentication. Four of them are classified as critical:

  • A double-free in the IKEv2 service (CVE-2026-33824, CVSS 9.8)
  • A use-after-free in the IPv4 TCP/IP stack (CVE-2026-33827, CVSS 8.1)
  • Two flaws, one in Netlogon and one in the Windows DNS client, both rated CVSS 9.8

The first two flaws are deemed by Microsoft to be "more likely to be exploited." These vulnerabilities had withstood years of human audits and millions of fuzzing passes. For instance, the double-free in IKEv2 only becomes visible when comparing a poorly managed code site with a correctly implemented site elsewhere in the same source file. This type of contrast reasoning is precisely what traditional scanners cannot do, and what MDASH's "debating" agents are designed to automate.
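To make the idea of contrast reasoning concrete, here is a minimal Python sketch. It is not MDASH's method, which the article does not detail. It assumes a hypothetical abstraction in which each call site is reduced to the list of cleanup operations it performs; the site names and operations are invented for illustration. A site is flagged when it omits a step that most of its siblings in the same file carry out.

```python
from collections import Counter

# Hypothetical abstraction: each call site is reduced to the cleanup
# operations it performs on a buffer. All names are illustrative only.
SITES = {
    "handler_a": ["free(buf)", "buf = NULL"],
    "handler_b": ["free(buf)", "buf = NULL"],
    "handler_c": ["free(buf)"],  # never clears the pointer after freeing
}


def contrast_outliers(sites):
    """Return sites that omit an operation most sibling sites perform."""
    # Count in how many sites each operation appears.
    counts = Counter(op for ops in sites.values() for op in set(ops))
    # An operation is "expected" when more than half of the sites use it.
    majority = {op for op, n in counts.items() if n > len(sites) / 2}
    return {
        name: sorted(majority - set(ops))
        for name, ops in sites.items()
        if majority - set(ops)
    }


outliers = contrast_outliers(SITES)
print(outliers)
```

In this toy case only `handler_c` stands out, because it frees the buffer without clearing the pointer as its two siblings do. A pattern-based scanner inspecting that site in isolation would have nothing to compare it against.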

The system operates more like an auditing firm than a traditional scanner. "Auditor" agents sift through the source code and flag suspicious areas. "Debating" agents challenge each report (a lightweight model contradicts a heavyweight model, and vice versa). A final group of "proving" agents attempts to construct a functional exploit before a human engineer takes over. On a private test driver containing 21 planted vulnerabilities, MDASH found all 21 without any false positives. Against five years of MSRC-confirmed vulnerabilities in the clfs.sys component, the system achieves 96% recall. For tcpip.sys, recall is 100%.
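The three-stage flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Microsoft's implementation: the `auditor`, `debater`, and `prover` functions are hypothetical stand-ins that use trivial string checks where the real system would call models, and the "source code" is an invented dictionary of function bodies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    location: str
    claim: str
    confirmed: bool = False


def auditor(source: Dict[str, str]) -> List[Finding]:
    """Stage 1 (stand-in): over-eagerly flag every function that frees memory."""
    return [
        Finding(name, "possible double-free")
        for name, body in source.items()
        if "free(" in body
    ]


def debater(finding: Finding, source: Dict[str, str]) -> bool:
    """Stage 2 (stand-in): challenge the report; keep it only if two frees exist."""
    return source[finding.location].count("free(") > 1


def prover(finding: Finding, source: Dict[str, str]) -> bool:
    """Stage 3 (stand-in for exploit construction): the pointer is never reset."""
    return "= NULL" not in source[finding.location]


def run_pipeline(source: Dict[str, str]) -> List[Finding]:
    """Run auditor, debater, and prover in order; return findings for human review."""
    flagged = auditor(source)
    survived = [f for f in flagged if debater(f, source)]
    proven = []
    for f in survived:
        if prover(f, source):
            f.confirmed = True
            proven.append(f)
    return proven


# Invented function bodies, for illustration only.
SOURCE = {
    "parse_header": "p = alloc(); free(p);",
    "close_session": "free(s); s = NULL; free(s);",
    "drop_packet": "free(pkt); free(pkt);",
}

report = run_pipeline(SOURCE)
print([f.location for f in report])
```

Each stage narrows the previous one: the auditor flags all three functions, the debater discards `parse_header`, and the prover discards `close_session`, leaving a single confirmed finding for the engineer. The design point is that a cheap, noisy first pass is acceptable when later stages are built to disagree with it.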

On the public benchmark CyberGym (UC Berkeley, 1,507 vulnerability reproduction tasks across 188 open-source projects), MDASH scores 88.45%, about 5 points ahead of Claude Mythos Preview (83.1%) and nearly 7 points ahead of GPT-5.5 (81.8%). The gap is significant, but the most interesting aspect lies elsewhere. Microsoft is simultaneously a partner of OpenAI, a founding member of Project Glasswing (the Anthropic program deploying Mythos to 40 organizations), and now an operator of its own competing system.
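For readers who want to check the benchmark gaps, the arithmetic on the three CyberGym scores quoted above is short. The snippet uses only the figures reported in the article; the variable names are arbitrary.

```python
# CyberGym scores as quoted in the article (percent of tasks solved).
scores = {
    "MDASH": 88.45,
    "Claude Mythos Preview": 83.1,
    "GPT-5.5": 81.8,
}

# Lead of MDASH over each single-model system, in percentage points.
gaps = {
    name: round(scores["MDASH"] - score, 2)
    for name, score in scores.items()
    if name != "MDASH"
}

for name, gap in gaps.items():
    print(f"MDASH leads {name} by {gap} points")
```

The lead comes to 5.35 points over Claude Mythos Preview and 6.65 points over GPT-5.5.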

The ACS team that designed MDASH does not hail from a Microsoft lab. Several of its members come from Team Atlanta, winners of the $29.5 million DARPA AI Cyber Challenge in 2024. This team had developed an autonomous system capable of finding and patching real bugs in open-source projects. The transition from academic research to industrial production is the true challenge of MDASH: the system will be available in private beta for companies starting in June 2026.

Nor is Microsoft the first mover in multi-model cybersecurity. In April, the startup AISLE had already replicated Mythos's flagship results with open-source models of 3 to 5 billion parameters. The cost of the operation: $0.10 per million tokens, where Anthropic charges 240 times more. MDASH reinforces the point that what makes the difference is the architecture around the model, not its raw power.

For Europe, this is paradoxically good news. Anthropic reserves Mythos for a handful of American partners, and OpenAI locks Daybreak behind a verified access program. However, the obligations of the Cyber Resilience Act and NIS2 will not wait. With open models (Mistral, Llama, Qwen) and solid orchestration engineering, nothing prevents a European player from building a competitive pipeline without relying on closed models to which they do not have access anyway.
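The cost comparison quoted above can be worked through as follows. The $0.10 price and the 240x multiple come from the article; the implied closed-model price is derived from them rather than quoted, and the 50-million-token scan is a purely hypothetical workload chosen to show the scale of the difference.

```python
# Figures quoted in the article.
OPEN_MODEL_PRICE = 0.10  # USD per million tokens
PRICE_RATIO = 240        # stated multiple for the closed model

# Derived from the two figures above, not quoted in the article.
implied_closed_price = round(OPEN_MODEL_PRICE * PRICE_RATIO, 2)

# Assumption: a hypothetical scan consuming 50 million tokens.
HYPOTHETICAL_SCAN_MTOK = 50


def scan_cost(tokens_millions: float, price_per_million: float) -> float:
    """Cost in USD of processing the given number of millions of tokens."""
    return round(tokens_millions * price_per_million, 2)


open_cost = scan_cost(HYPOTHETICAL_SCAN_MTOK, OPEN_MODEL_PRICE)
closed_cost = scan_cost(HYPOTHETICAL_SCAN_MTOK, implied_closed_price)

print(f"Implied closed-model price: ${implied_closed_price} per million tokens")
print(f"Hypothetical scan: ${open_cost} (open) vs ${closed_cost} (closed)")
```

Under these assumptions the same scan costs $5 with the small open models and $1,200 at the implied closed-model price.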
