Ophthalmologists were able to discern human from AI-generated responses with 61 percent accuracy
Sarah Healey | 2 min read | News
Large language models (LLMs), such as BERT and GPT-3, have transformed how computers process and understand language. Trained on vast amounts of text, these models have found diverse applications in healthcare, particularly in specialties such as radiology that are data-heavy and image-focused (1).
One standout among these models is ChatGPT, an AI-powered LLM chatbot developed by OpenAI and built on GPT-3.5. Since its release in November 2022, the chatbot has been used to simplify radiology reports, write discharge summaries, and transcribe patient notes (2).
Although LLM chatbots are changing the medical landscape, a rise in medical misinformation has raised major concerns about their safety and reliability, especially in niche specialties such as ophthalmology.
To better explore the clinical effectiveness of AI-powered technologies, US-based researchers compared the quality of ophthalmic advice generated by an LLM chatbot with that of advice from board-certified ophthalmologists (2).
Patient questions were selected from over 700 pages of The Eye Care Forum, which provides users with answers from physicians affiliated with the American Academy of Ophthalmology (AAO). These questions were subsequently entered into ChatGPT, and the responses were added to a data set so that each question had an answer from both a human ophthalmologist and the chatbot.
In total, 200 pairs of questions and answers were evaluated. A board of eight ophthalmologists was able to discern human from AI-generated responses with 61 percent accuracy. However, chatbot answers were not perceived to be significantly more harmful than human responses in terms of incorrect information, likelihood and extent of causing harm, and agreement with perceived consensus in the medical community.
The results seem to suggest that LLMs can, in fact, provide astute and appropriate ophthalmic advice in response to patient queries of varying complexity. So, will there come a time when AI technologies fully replace physicians for some tasks?
1. H Grewal et al., “Radiology Gets Chatty: The ChatGPT Saga Unfolds,” Cureus, [Online ahead of print] (2023). PMID: 37425598.
2. I A Bernstein et al., “Comparison of Ophthalmologist and Large Language Model Chatbot Responses to Online Patient Eye Care Questions,” JAMA Netw Open, [Online ahead of print] (2023). PMID: 37606922.