Podcast Episode
Two AI Systems Match or Beat Doctors in Nature Studies
June 18, 2026
Two clinical AI systems, MIRA and Google's AMIE, performed at or above the level of physicians in simulated medical settings, according to a pair of studies published in Nature. Both showed striking diagnostic and management results, but experts stress neither has been tested on real patients in live clinical environments.
Two Medical AIs Hit a New Benchmark
Two artificial intelligence systems built for clinical medicine have performed at or above the level of human physicians in simulated settings, according to a pair of studies published Wednesday in Nature. The results mark a new benchmark for medical AI, but they arrive with heavy caveats: neither system has yet been tested with real patients in live clinical environments.
MIRA Works Inside the Health Record
The first system, MIRA, was developed by a team led by researchers at the Else Kröner Fresenius Centre for Digital Health. Unlike earlier medical AI that simply answers questions, MIRA is an autonomous agent that operates within a simulated electronic health record. It can conduct patient interviews, order diagnostic tests, prescribe medications, and recommend hospital admissions.
Evaluated on hundreds of real emergency department cases, MIRA outperformed a panel of physicians in diagnostic accuracy. But it ordered roughly twice as many blood tests as the doctors did. Dr Wei Xing, an assistant professor at the University of Sheffield, cautioned that MIRA's advantage "is mostly driven by conditions with clear test results, like appendicitis and pancreatitis," while for common conditions like pneumonia and urinary tract infections "the gap between them was smallest."
Google's AMIE Tackles Long-Term Care
The second study evaluated AMIE, Google's Articulate Medical Intelligence Explorer, on the harder task of managing chronic conditions across multiple visits. AMIE was compared with 21 primary care physicians across 100 multi-visit scenarios based on UK clinical guidelines. It matched the clinicians in overall management reasoning and scored higher in plan preciseness and guideline alignment.
Built on the long-context capabilities of Google's Gemini models, AMIE pairs an empathetic dialogue agent with a management reasoning system that cross-references hundreds of pages of clinical knowledge. However, Alfonso Valencia, director of Life Sciences at the Barcelona Supercomputing Centre, noted that unlike MIRA, AMIE is not open-source, making independent evaluation impossible.
Experts Urge Caution
Researchers and independent commentators uniformly stressed that both systems remain far from clinical deployment. "These technologies are not yet ready for autonomous use in clinical practice," said Ignacio Miranda Gómez of the International Breast Cancer Centre in Barcelona, noting that the studies used simulated patients in controlled environments, so safety and efficacy still need to be proven in real hospitals.
Both systems are text-based, meaning they cannot perform physical examinations or read tone of voice and body language. Prof. Catherine Pope of the University of Oxford said the systems "can mimic some aspects of experienced physician performance," but much more research is needed, and "these technologies are unlikely to replace doctors." Google said it has launched a nationwide study to assess AMIE in real-world virtual care settings.
Published June 18, 2026 at 6:10pm