OpenAI and Harvard: AI Redefines Medical Diagnosis

AI in the Service of Diagnosing Rare Genetic Diseases
A study published on June 18, 2026, in NEJM AI describes the use of artificial intelligence to help doctors diagnose rare genetic diseases in children. Researchers used a reasoning model developed by OpenAI to re-examine 376 cases that had not previously yielded a conclusive diagnosis. The reanalysis led to 18 diagnoses.
Despite advancements in genomic sequencing, a significant number of patients with rare diseases do not receive an accurate genetic diagnosis. Approximately 50% of cases remain unanswered, even after extensive testing and consultations with specialists. Medical records, often fragmented and complex, contain valuable clues, but their analysis requires sifting through a large number of genetic variants and clinical data.
The Evolution of Scientific Knowledge
Knowledge about the relationships between genes and diseases is constantly evolving. New discoveries and classifications can make previously unresolved cases interpretable. Researchers from Boston Children's Hospital, the Manton Center for Orphan Disease Research, Harvard University, and OpenAI collaborated on a workflow built around the OpenAI o3 Deep Research model. The model analyzed anonymized clinical and genomic data and surfaced candidate explanations for the researchers to assess. After expert review and additional testing, 18 diagnoses were confirmed, a diagnostic yield of 4.8% (18 of 376 cases).
The Importance of Reanalysis
An inconclusive genetic test does not necessarily mean a definitive dead end. Phenotypic descriptions, test results, and family histories may be scattered across different databases, making their linkage complex. Even specialists may overlook a diagnosis if a relevant gene has not yet been linked to a disease. As science progresses, the same data can reveal answers that were previously inaccessible.
Reanalyzing rare disease cases presents both scientific and logistical challenges. While a patient's genome remains unchanged, the evidence surrounding it keeps evolving: researchers discover new links between genes and diseases, variants are reclassified, and databases are enriched with new observations. Each update can turn an unresolved case into a diagnostic opportunity.
The Reanalysis Process
For each case, the team compiled an anonymized file containing standardized Human Phenotype Ontology terms describing the patient's clinical presentation, clinician notes, and metadata such as age and sex. A filtered variant table was also included, capturing the rarity of each variant and its potential impact on the encoded protein.
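To make the shape of such a case file concrete, here is a minimal Python sketch. The article does not describe the study's actual file format, so every class name, field name, and the rarity threshold below is a hypothetical choice for illustration, and the sample values are placeholders rather than study data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variant:
    """One row of the filtered variant table (field names are hypothetical)."""
    gene: str
    description: str             # placeholder variant description
    population_frequency: float  # rarity: allele frequency in a reference population
    predicted_impact: str        # predicted effect on the encoded protein


@dataclass
class CaseFile:
    """Anonymized case file handed to the model (layout is an assumption)."""
    case_id: str                 # anonymized identifier, not a patient ID
    hpo_terms: List[str]         # Human Phenotype Ontology term IDs
    clinician_notes: str
    age_years: Optional[float]
    sex: Optional[str]
    variants: List[Variant] = field(default_factory=list)

    def rare_variants(self, max_frequency: float = 0.001) -> List[Variant]:
        """Return variants at or below an illustrative rarity threshold."""
        return [v for v in self.variants if v.population_frequency <= max_frequency]


# Placeholder example: the genes and variants are invented labels, not real findings.
case = CaseFile(
    case_id="CASE-0001",
    hpo_terms=["HP:0001250"],    # HPO term for "Seizure"
    clinician_notes="Free-text notes with identifying details removed.",
    age_years=7,
    sex="F",
    variants=[
        Variant("GENE_A", "placeholder variant 1", 0.00001, "missense"),
        Variant("GENE_B", "placeholder variant 2", 0.02, "synonymous"),
    ],
)

print([v.gene for v in case.rare_variants()])
```

Keeping phenotype terms, notes, metadata, and variants in one record mirrors the article's point that clues are otherwise scattered across sources.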
The OpenAI model was tasked with proposing the most plausible molecular explanation and justifying its reasoning. Researchers reviewed the results using the ACMG/AMP framework, which clinical laboratories commonly use to classify genetic variants. Each candidate was evaluated by at least two team members, and disagreements were resolved by consensus. A diagnosis was counted as validated only after confirmation by a CLIA-certified laboratory and the return of results to the patient's family.
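The validation rule described above can be read as a simple gate. The sketch below is one possible encoding under stated assumptions: the `Review` and `Candidate` classes, their field names, and the `is_validated` function are hypothetical, and the study's real tooling is not described in the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    reviewer: str
    acmg_class: str  # ACMG/AMP tier assigned by this reviewer, e.g. "likely pathogenic"


@dataclass
class Candidate:
    gene: str                 # placeholder label for the proposed gene
    reviews: List[Review]
    consensus_reached: bool   # True once any disagreement was settled by discussion
    clia_confirmed: bool      # confirmed by a CLIA-certified laboratory
    returned_to_family: bool  # results returned to the family


def is_validated(candidate: Candidate) -> bool:
    """Apply the article's criteria: two or more reviewers, agreement or
    consensus, CLIA confirmation, and return of results."""
    distinct_reviewers = {r.reviewer for r in candidate.reviews}
    distinct_classes = {r.acmg_class for r in candidate.reviews}
    enough_reviewers = len(distinct_reviewers) >= 2
    resolved = len(distinct_classes) == 1 or candidate.consensus_reached
    return (
        enough_reviewers
        and resolved
        and candidate.clia_confirmed
        and candidate.returned_to_family
    )


agreed = Candidate(
    gene="GENE_A",
    reviews=[Review("r1", "likely pathogenic"), Review("r2", "likely pathogenic")],
    consensus_reached=False,
    clia_confirmed=True,
    returned_to_family=True,
)
unconfirmed = Candidate(
    gene="GENE_B",
    reviews=[Review("r1", "pathogenic"), Review("r2", "uncertain significance")],
    consensus_reached=True,
    clia_confirmed=False,
    returned_to_family=False,
)

print(is_validated(agreed), is_validated(unconfirmed))
```

The point of the gate is that a model suggestion alone never counts: a candidate without laboratory confirmation stays unvalidated even when reviewers agree.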
Before tackling unresolved cases, the team refined its approach by testing the workflow on cases with established diagnoses. It identified the correct gene and variant in 48 of 51 cases covering a variety of rare conditions. In a set of 57 neuromuscular cases, the workflow identified the correct diagnosis in 45. For a group of 15 long-read genome cases, the model identified the correct gene in every case and both disease-causing alleles in 12. These tests were crucial for refining the process and demonstrating the importance of expert review.
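For readers who want the benchmark counts above as percentages, the short Python sketch below does the arithmetic. The counts come from the article; the `rate` helper and the dictionary labels are illustrative names introduced here.

```python
def rate(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * hits / total, 1)


# (correct, total) pairs taken from the validation runs described above.
benchmarks = {
    "mixed rare conditions (gene and variant)": (48, 51),
    "neuromuscular cases (diagnosis)": (45, 57),
    "long-read genomes (gene)": (15, 15),
    "long-read genomes (both alleles)": (12, 15),
}

for name, (hits, total) in benchmarks.items():
    print(f"{name}: {hits}/{total} = {rate(hits, total)}%")
```

This works out to roughly 94.1%, 78.9%, 100.0%, and 80.0% respectively, which shows why the team still treated expert review as essential.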
Results and Implications
The team applied this workflow to four groups of unresolved cases: children with neurodevelopmental disorders, individuals with rare neuromuscular diseases, children and adolescents with early psychosis, and cases of sudden death in pediatrics. These cases had already been examined by various commercial and institutional pipelines.
Details by Cohort
- Neurodevelopmental: 100 cases, 10 diagnoses, yield of 10.0%
- Neuromuscular Disease: 61 cases, 4 diagnoses, yield of 6.6%
- Sudden Death in Pediatrics: 20 cases, 0 diagnoses, yield of 1.0%
- Early Psychosis: 15 cases, 2 diagnoses, yield of 13.3%
The overall diagnostic yield of 4.8% is modest but meaningful, especially in a population where previous examinations had not yielded results. Similar studies also report modest gains in cases that have already been extensively examined.
Among the 18 diagnoses, 7 were rediscoveries: diagnoses established outside the local research workflow but absent from the records reviewed by the team. In several cases, the variants were already listed as pathogenic or likely pathogenic in public databases, highlighting the operational challenge of synthesizing information across data sources.
A Concrete Example: Kyra's Case
The journey of Kyra, a 9-year-old girl, illustrates the impact of this technology. Her mother noticed changes in her physical performance during sports activities. After numerous consultations and tests over nearly 20 years, a diagnosis was finally established thanks to AI. Her condition was linked to a variant in the HSPB8 gene, responsible for a form of myofibrillar myopathy.
This case demonstrates how AI can offer new perspectives for complex diagnoses, potentially transforming the medical landscape for rare genetic diseases.