ChatGPT Bombs

A new study published in JAMA Pediatrics has revealed significant limitations in the capabilities of ChatGPT, a popular AI chatbot, particularly in pediatric diagnosis.

Researchers tested the AI using 100 pediatric case challenges from JAMA and the New England Journal of Medicine, feeding these into ChatGPT version 3.5 with a prompt to provide differential and final diagnoses.

The cases, all from the last decade, were used to assess ChatGPT’s diagnostic accuracy against physicians’ diagnoses. The results were sobering: the AI had an error rate of 83 percent. Of its diagnoses, 72 percent were outright incorrect and 11 percent were too broad to be considered correct.

While the high error rate is a concern, the study suggested that large language models like ChatGPT could have potential in administrative roles in healthcare. The researchers pointed out that more targeted training might be necessary to improve ChatGPT’s diagnostic accuracy. They noted the AI’s inability to identify certain relationships, such as the one between autism and vitamin deficiencies, as a significant limitation.

One critical factor highlighted in the study was the AI’s lack of access to current data. ChatGPT’s knowledge base isn’t regularly updated, meaning it doesn’t have the latest information on health trends, diagnostic criteria, or emerging diseases.

The exploration of AI in medicine is not new. Another study published last year found that OpenAI’s GPT-4 outperformed clinicians in diagnosing patients over 65, although it had a small sample size of just six patients. The researchers in that study saw potential for AI chatbots to boost confidence in diagnoses.

The broader field of AI diagnostics is advancing, with the FDA having approved numerous AI-enabled medical devices. However, as of now, none of these approved devices use generative AI or are powered by large language models like ChatGPT. The JAMA Pediatrics study underscores the ongoing need for careful evaluation and development of AI tools in healthcare, balancing the promise of new technology with the rigor of medical standards.