Harvard: AI Outperforms Doctors in Emergency Diagnosis
A Harvard Study on AI and Medical Diagnostics
A recent study by researchers at Harvard Medical School and Beth Israel Deaconess Medical Center examined how well large language models perform in medicine. Published in the journal Science, the research evaluated the models' ability to produce diagnoses in real-world settings, particularly emergency departments.
Experiments in the Emergency Department
The study focused on 76 patients admitted to the emergency department at Beth Israel. Diagnoses made by two physicians were compared with those generated by OpenAI's o1 and 4o models. Two other physicians then assessed all of the diagnoses without knowing whether each one came from an AI or a human. This blinding was intended to keep the evaluation impartial.
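The blinded comparison described above can be illustrated with a minimal Python sketch. It is not the study's code: the record fields (`case`, `source`, `text`), the helper names `blind` and `accuracy_by_source`, and the idea of an opaque item ID are all assumptions made for illustration. The sketch shuffles the diagnoses, hides their origin from graders, and restores the origin only when scores are tallied.

```python
import random


def blind(diagnoses, seed=0):
    """Shuffle diagnoses and hide their source behind opaque IDs.

    `diagnoses` is a list of dicts with hypothetical keys
    "case", "source", and "text". Returns (blinded, key):
    `blinded` is what graders see, `key` maps ID -> source.
    """
    rng = random.Random(seed)
    shuffled = list(diagnoses)
    rng.shuffle(shuffled)  # order must not reveal the source
    key = {}
    blinded = []
    for i, d in enumerate(shuffled):
        item_id = f"item-{i}"
        key[item_id] = d["source"]  # kept away from graders
        blinded.append({"id": item_id, "case": d["case"], "text": d["text"]})
    return blinded, key


def accuracy_by_source(grades, key):
    """Unblind graded items and compute the fraction judged correct.

    `grades` maps item ID -> bool (True if the grader judged the
    diagnosis accurate or very close).
    """
    totals, hits = {}, {}
    for item_id, ok in grades.items():
        src = key[item_id]
        totals[src] = totals.get(src, 0) + 1
        hits[src] = hits.get(src, 0) + int(ok)
    return {src: hits[src] / totals[src] for src in totals}
```

In use, graders would receive only the `blinded` list and return a verdict per ID; `accuracy_by_source` then produces per-source rates comparable to the percentages the article reports.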
Results for the o1 Model
The o1 model often matched or outperformed the physicians, especially during initial triage, a critical moment when patient information is limited and decisions must be made quickly. It provided an accurate or very close diagnosis in 67% of cases, compared with 55% and 50% for the two physicians.
No Data Preprocessing
The researchers emphasized that they did not preprocess the data before giving it to the AI models. The models received the information available in the electronic medical records at the time of each diagnosis. Arjun Manrai, who leads an AI lab at Harvard Medical School and is one of the study's lead authors, said the AI was tested against various benchmarks and surpassed both earlier models and the physicians.
Need for Prospective Trials
Although the results are promising, the study does not claim that AI is ready to make critical decisions in emergency settings. The researchers stress the need for prospective trials to assess how effective these technologies are in real patient care. They note that current models have been evaluated only on text, and that existing studies suggest the models reason less well about non-textual inputs.
Perspectives and Limitations
Adam Rodman, a physician at Beth Israel and also one of the study's lead authors, said there is currently no formal framework for accountability when AI makes a diagnosis. He added that patients still want humans to guide them through life-or-death decisions and difficult treatment choices. The study invites deeper reflection on integrating AI into medicine, while highlighting the challenges that must be addressed before these technologies can be used safely and effectively.