Nashville News Post


ChatGPT, Gemini, and other AI bots give bad medical tips half the time

Apr 15, 2026 · Twila Rosenbaum

As people increasingly turn to AI chatbots for everyday health inquiries, a new study raises alarms about the reliability of these digital advisors. The research indicates that nearly half of the responses from five prominent AI chatbots were deemed problematic, despite their confident and polished presentation.

The study evaluated ChatGPT, Gemini, Grok, Meta AI, and DeepSeek against 250 prompts covering topics such as cancer, vaccines, stem cell research, nutrition, and athletic performance. These prompts were designed to reflect common health questions and prevalent misinformation, allowing researchers to assess whether the chatbots adhered to scientific evidence or veered into misleading and potentially harmful territory.

Open-ended Questions Reveal Significant Flaws

The analysis showed that broad, open-ended prompts produced the most troubling answers. The chatbots struggled most with these inquiries, often supplying misleading information. In contrast, closed prompts (specific, well-defined questions) tended to yield safer, more reliable responses.

This distinction matters because people rarely pose medical questions in a structured format; they want to know whether a treatment works, whether a vaccine is safe, or what can boost athletic performance. The study found that such open-ended prompts frequently produced responses that mixed well-founded evidence with weaker or misleading claims.

Confidence in Answers, Weakness in References

The shortcomings of the chatbots extended beyond the quality of their answers. The references provided by these systems were often inadequate, with an average completeness score of only 40%. Notably, none of the chatbots produced a fully accurate reference list, which undermines one of the primary reasons users might trust these AI-generated responses. A seemingly authoritative reply could quickly fall apart upon scrutiny of its citations.

Additionally, the researchers identified instances of fabricated references, highlighting a concerning trend where AI chatbots present information with a high degree of certainty while offering few disclaimers about their reliability.

Implications of the Findings

The study has its limitations: it covered only five chatbots, the technology is evolving rapidly, and the prompts were designed to stress-test the systems. Even so, the findings are significant. The chatbots were assessed on evidence-based medical topics, and still nearly half of their answers were flawed or incomplete.

The overarching conclusion is striking: while AI chatbots may assist users in summarizing information or formulating follow-up questions, they currently lack the dependability required for critical medical decision-making. Until improvements are made, individuals should exercise caution when relying on these digital tools for health-related inquiries.


Source: Digital Trends News

