Over the past few years, many researchers have tried to test large language models’ medical capabilities using questions from the U.S. medical licensure exam, or by feeding them clinical vignettes containing a set amount of information about a given case. Other researchers have called for an end to those exercises, saying that these examples are overly simplified and not characteristic of actual medical practice.
At the end of June, Microsoft’s AI team released a study on a new way to structure AI agents for diagnosing disease, one that diagnosed difficult cases at four times the rate of clinicians. Those results led the company to claim it is on the path to “medical superintelligence.”
The claims and hyperbolic headlines have stirred controversy among physicians online. The “superintelligence” buzzword misses the mark, experts told STAT, and overlooks the actual innovations Microsoft made in the process.