Can AI Say “I Don’t Know”?

by Grace Chen

In a clinical setting, the most dangerous answer is not a wrong one—it is a wrong answer delivered with absolute certainty. For a physician, saying “I don’t know” is not a sign of failure; it is a critical safety mechanism. It triggers a search for more data, a consultation with a colleague, or a deeper dive into the literature. It is the beginning of the actual diagnostic process.

However, the generative AI tools currently migrating into healthcare operate on a fundamentally different logic. Large Language Models (LLMs) are designed to be helpful and fluent, and their architecture is built on probability, not a grounded understanding of truth. As highlighted in a recent analysis in the New England Journal of Medicine, this creates a “confidence gap” where AI may provide a plausible but entirely fabricated medical recommendation without a hint of hesitation.

The challenge facing modern medicine is not just improving the accuracy of these models, but teaching them the art of epistemic humility. For AI to be a safe partner in the exam room, it must develop the capacity to recognize the boundaries of its own knowledge and explicitly signal when it has reached them.

The Architecture of the Confident Lie

To understand why AI struggles to admit ignorance, one must understand how it “thinks.” LLMs do not retrieve facts from a database in the way a traditional search engine does. Instead, they predict the next most likely token (a word or part of a word) in a sequence based on patterns learned from massive datasets.

When an AI “hallucinates”—the industry term for generating false information—it is not experiencing a glitch. It is performing exactly as designed: it is calculating the most statistically probable sequence of words. If the training data is sparse or contradictory regarding a specific rare disease, the model may still produce a fluent, authoritative-sounding response because that is what its objective function demands. It prioritizes the likelihood of the phrasing over the veracity of the claim.
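
To make that token-by-token mechanism concrete, here is a minimal illustrative sketch of greedy next-token selection. The vocabulary, scores, and prompt are invented for the example and do not reflect any particular model.

```python
import math

# Toy "logit" scores a model might assign to candidate next tokens after the
# prompt "The recommended dose is ...". The values are invented purely to
# illustrate the mechanism, not taken from any real system.
logits = {
    "10": 2.1,       # plausible-sounding number
    "unknown": 0.3,  # honest admission, but statistically "unlikely" phrasing
    "500": 1.4,
}

def softmax(scores):
    """Convert raw scores into a probability distribution over tokens."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy decoding: pick the most probable token

print(probs)
print("Model continues with:", next_token)
# The model emits "10" because it is the most probable continuation,
# not because it has verified that 10 is a correct dose.
```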

In medicine, this probabilistic approach is perilous. A model might correctly describe the symptoms of a common condition but then fabricate a dosage for a medication or invent a citation from a medical journal to support its claim. Because the prose is polished and the tone is confident, clinicians may fall victim to “automation bias,” the tendency to over-rely on automated systems even when they contradict human judgment.

The Clinical Stakes of Uncertainty

The integration of AI into healthcare introduces a complex set of stakeholders, each affected differently by the AI’s inability to say “I don’t know.”

  • Clinicians: Face the risk of cognitive offloading, where the mental effort of critical verification is reduced, potentially leading to diagnostic errors.
  • Patients: May receive “hallucinated” medical advice via patient portals or AI chatbots, leading to delayed care or dangerous self-treatment.
  • Developers: Struggle to balance “helpfulness” (the AI’s tendency to answer) with “honesty” (the AI’s tendency to admit ignorance).
  • Regulators: The FDA and other bodies must determine how to certify a tool that is non-deterministic—meaning it might give two different answers to the same question.

The core of the issue is that in the current paradigm, “I don’t know” is often viewed by developers as a failure of the model. Yet, in a medical context, a refusal to answer an ambiguous or unsupported question is a feature, not a bug. The goal is to move from a model that is “confident but occasionally wrong” to one that is “calibrated,” meaning its confidence level accurately reflects the probability that its answer is correct.
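
One common way to make “calibrated” measurable is expected calibration error (ECE): group a model’s answers by stated confidence and compare each group’s average confidence with its actual accuracy. Below is a minimal sketch of that calculation, using invented predictions rather than real model output.

```python
# Each tuple is (model's stated confidence, whether the answer was correct).
# The data is invented for illustration.
predictions = [
    (0.95, True), (0.90, False), (0.85, True), (0.80, True),
    (0.70, False), (0.65, True), (0.55, False), (0.50, False),
]

def expected_calibration_error(preds, n_bins=5):
    """Average gap between stated confidence and observed accuracy, by bucket."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bucket's confidence/accuracy gap by its share of answers.
        ece += (len(bucket) / len(preds)) * abs(avg_conf - accuracy)
    return ece

print(f"ECE: {expected_calibration_error(predictions):.3f}")
# A perfectly calibrated model (confidence equals accuracy in every bucket) scores 0.
```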

Comparison of AI Response Behaviors in Clinical Contexts
Behavior Type         | Current LLM Tendency              | Ideal Clinical AI Behavior
----------------------|-----------------------------------|-------------------------------------
Ambiguous Query       | Guesses the most likely intent.   | Asks clarifying questions.
Data Gap              | Confabulates a plausible answer.  | States “Information not available.”
Conflicting Evidence  | Presents one side as fact.        | Highlights the medical controversy.
Confidence Tone       | Consistently authoritative.       | Proportional to evidence strength.
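
Read as a policy, the “ideal” column of this table amounts to simple dispatch logic. The toy sketch below (hypothetical function and inputs, not any deployed system) shows what that routing could look like.

```python
def clinical_response_policy(query_type, evidence_strength):
    """Toy dispatcher mirroring the 'Ideal Clinical AI Behavior' column.
    query_type and evidence_strength are hypothetical inputs that a wrapper
    system would have to derive before letting the model answer."""
    if query_type == "ambiguous":
        return "Ask a clarifying question before answering."
    if query_type == "data_gap":
        return "Information not available in the trusted sources."
    if query_type == "conflicting_evidence":
        return "Summarize both positions and flag the controversy."
    # Otherwise answer, with a tone proportional to evidence strength.
    hedge = "likely" if evidence_strength < 0.8 else "well-supported"
    return f"Provide the answer, framed as {hedge}."

print(clinical_response_policy("ambiguous", 0.9))
print(clinical_response_policy("data_gap", 0.2))
```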

Pathways Toward Epistemic Humility

Researchers are currently exploring several methods to force AI to recognize its own limits. One prominent approach is Retrieval-Augmented Generation (RAG). Instead of relying solely on its internal weights, the AI is instructed to search a trusted, closed-loop database—such as a curated set of peer-reviewed journals—before answering. If the information is not found in the source text, the model is programmed to report a lack of information.
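
A minimal sketch of this retrieval-first pattern follows, assuming a tiny in-memory corpus and a naive keyword-overlap retriever standing in for a real search index; the documents and questions are invented for illustration.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# TRUSTED_CORPUS stands in for a curated set of peer-reviewed sources.
TRUSTED_CORPUS = {
    "doc-001": "Hypothetical guideline text about managing condition X in adults.",
    "doc-002": "Hypothetical review article summarizing evidence for therapy Y.",
}

def retrieve(question, corpus, min_overlap=3):
    """Return documents sharing at least `min_overlap` words with the question.
    A real system would use embeddings or a search index instead."""
    q_words = set(question.lower().split())
    hits = []
    for doc_id, text in corpus.items():
        overlap = len(q_words & set(text.lower().split()))
        if overlap >= min_overlap:
            hits.append((doc_id, text))
    return hits

def answer(question):
    sources = retrieve(question, TRUSTED_CORPUS)
    if not sources:
        # The key behavior: refuse rather than confabulate.
        return "Information not available in the trusted sources."
    cited = ", ".join(doc_id for doc_id, _ in sources)
    return f"Draft answer grounded in: {cited} (to be reviewed by a clinician)."

print(answer("What is the dose of therapy Z for children?"))  # -> refusal
print(answer("What is the evidence for therapy Y?"))          # -> grounded draft
```

The design choice that matters here is the explicit refusal branch: the model is only allowed to draft an answer when retrieval returns something it can cite.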

Another strategy involves Reinforcement Learning from Human Feedback (RLHF), where medical experts specifically reward the model for admitting uncertainty. By penalizing “confident hallucinations” and rewarding “honest ignorance,” developers can shift the model’s probabilistic preference toward caution.
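
A toy reward function can illustrate that incentive shift; in practice RLHF learns preferences from expert comparisons rather than hard-coded scores, so the labels and numbers below are purely illustrative.

```python
def reward(answer_is_correct, model_abstained):
    """Toy reward shaping: penalize confident hallucinations,
    give partial credit for honest abstention. Real RLHF learns these
    preferences from human rankings; the values here are invented."""
    if model_abstained:
        return 0.3    # honest ignorance: modest positive reward
    if answer_is_correct:
        return 1.0    # correct, confident answer: full reward
    return -1.0       # confident hallucination: strong penalty

# A policy optimized against this signal learns that abstaining beats guessing
# whenever its expected accuracy drops below 65%:
# 0.3 = p * 1.0 + (1 - p) * (-1.0)  ->  p = 0.65
for case in [(True, False), (False, False), (False, True)]:
    print(case, "->", reward(*case))
```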

However, these technical fixes do not eliminate the need for the “human-in-the-loop.” The medical community is increasingly emphasizing that AI should function as a copilot rather than an autopilot. In other words, the AI provides a suggestion and the supporting evidence, but the final clinical decision—and the responsibility for it—remains with the licensed professional.

As the New England Journal of Medicine analysis puts it: “The objective is not to create a machine that knows everything, but a machine that knows exactly what it does not know.”

Disclaimer: This article is provided for informational purposes only and does not constitute medical advice, diagnosis, or treatment. Always seek the advice of your physician or other qualified health provider with any questions you may have regarding a medical condition.

The next critical milestone in this evolution will be the establishment of standardized “uncertainty benchmarks” for clinical AI. While general benchmarks like MMLU (Massive Multitask Language Understanding) exist, the medical community requires a specific framework to measure how often a model correctly identifies its own inability to answer a question. The development of these safety-first metrics will likely be the focus of upcoming regulatory discussions and peer-reviewed validation studies throughout the next year.
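
A sketch of what such an uncertainty benchmark might score follows, assuming a labeled set of answerable and unanswerable questions; the field names and data are hypothetical.

```python
# Hypothetical benchmark records: each item notes whether the question was
# actually answerable from the reference material and whether the model abstained.
results = [
    {"answerable": False, "model_abstained": True},   # correct refusal
    {"answerable": False, "model_abstained": False},  # confident hallucination
    {"answerable": True,  "model_abstained": False},  # attempted answer (accuracy scored separately)
    {"answerable": True,  "model_abstained": True},   # over-cautious refusal
]

def abstention_recall(items):
    """Of the unanswerable questions, how often did the model correctly abstain?"""
    unanswerable = [r for r in items if not r["answerable"]]
    if not unanswerable:
        return None
    return sum(r["model_abstained"] for r in unanswerable) / len(unanswerable)

def over_refusal_rate(items):
    """Of the answerable questions, how often did the model refuse unnecessarily?"""
    answerable = [r for r in items if r["answerable"]]
    if not answerable:
        return None
    return sum(r["model_abstained"] for r in answerable) / len(answerable)

print("Abstention recall:", abstention_recall(results))   # 0.5
print("Over-refusal rate:", over_refusal_rate(results))   # 0.5
```

Reporting the two rates together matters: a model can trivially maximize abstention recall by refusing everything, so the over-refusal rate keeps the benchmark honest.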

We want to hear from you. Do you trust AI to tell you when it is unsure, or do you find its confidence misleading? Share your thoughts in the comments below or share this piece with your colleagues.
