AI chatbots fail to separate truth from belief, Stanford researchers warn

New Delhi: A new peer-reviewed study has raised questions about how much artificial intelligence models really understand the difference between what is true and what someone merely believes. Researchers at Stanford University found that large language models (LLMs) often fail to separate factual knowledge from personal belief and, more worryingly, struggle to recognize when a belief is false.

The findings, published in Nature Machine Intelligence, show that despite all the progress in AI, models like GPT-4o and DeepSeek R1 are still missing a key ingredient of human reasoning — the ability to know when something is factually wrong.

What the study found

The research team, led by James Zou, associate professor at Stanford University, tested 24 popular large language models across 13,000 questions. They wanted to see how well these models could tell the difference between facts, beliefs, and false statements.

The results were revealing. “Those models, which included GPT-4o, which was released in May 2024, were 34.3 percent less likely to identify a false first-person belief compared to a true first-person belief,” the study said. Models released before May 2024 did even worse: they were 38.6 percent less likely to recognize false beliefs.

The newer models performed much better at identifying true or false facts, scoring above 91 percent accuracy, while older ones averaged between 71 and 85 percent. But they stumbled when asked to reason about beliefs. GPT-4o dropped from 98.2 percent accuracy to 64.4 percent when dealing with false first-person beliefs, and DeepSeek R1 fell from over 90 percent to just 14.4 percent.

Why this matters

The researchers warned that this gap could have serious real-world consequences as AI systems become more common in medicine, law, journalism, and science. The paper states, “Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation.”

The team created a benchmark called KaBLE, covering 13 epistemic tasks, to measure how well models handle the concepts of knowledge and belief. They found that models dealt with third-person beliefs (what someone else believes) better than first-person ones (beliefs the user states as their own), meaning chatbots find it easier to reason about other people's stated beliefs than about those of the person speaking to them.
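To make that kind of test concrete, the sketch below shows the general shape of a first-person versus third-person belief probe. The statements, prompt wording, and the query_model stub are all illustrative assumptions for this article, not the researchers' actual KaBLE items or evaluation code.

```python
# Illustrative sketch only: the statements, prompt templates, and query_model()
# stub are hypothetical stand-ins, not the actual KaBLE benchmark materials.

def first_person_prompt(statement: str) -> str:
    # First-person framing: the speaker attributes the belief to themselves.
    return f"I believe that {statement}. Do I believe that {statement}?"

def third_person_prompt(statement: str) -> str:
    # Third-person framing: the belief is attributed to someone else.
    return f"My colleague believes that {statement}. Does my colleague believe that {statement}?"

def query_model(prompt: str) -> str:
    # Placeholder for a call to an LLM; a real harness would send the prompt
    # to a model API and parse a yes/no answer out of the response.
    return "yes"

if __name__ == "__main__":
    # One true and one false statement, to compare how a model handles
    # beliefs whose content is correct versus beliefs whose content is wrong.
    statements = {
        "the Earth orbits the Sun": True,   # factually true
        "the Sun orbits the Earth": False,  # factually false
    }
    for statement, is_true in statements.items():
        for label, build in [("first-person", first_person_prompt),
                             ("third-person", third_person_prompt)]:
            answer = query_model(build(statement))
            # The expected answer is "yes" in every case: the question asks only
            # whether the belief is held, not whether its content is true.
            print(f"{label:13s} | content true: {is_true} | model said: {answer}")
```

The pattern the study reports is that models answer such questions reliably when the believed statement is true or attributed to a third party, but falter when a user attributes a false belief to themselves.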

The study explains that this shows an “attribution bias,” where models fail to correctly interpret self-referential statements. The authors also noted that even though newer systems are better at handling complex reasoning, “they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding.”

The missing human touch

One of the paper’s most striking lines reads, “The ability to discern between fact, belief and knowledge serves as a cornerstone of human cognition.” It explains how humans intuitively understand the difference between “I believe it will rain tomorrow” and “I know the Earth orbits the Sun.” AI systems, on the other hand, do not yet have this intuition.

This difference might seem small, but it defines whether an AI chatbot can safely assist a doctor diagnosing a patient, a journalist verifying information, or a lawyer preparing a case. The authors wrote that “most models lack a robust understanding of the factive nature of knowledge, that knowledge inherently requires truth.”

What it means for users

The study concludes that LLMs must improve before being trusted in “high-stakes domains.” Until then, users need to be cautious about treating chatbot answers as absolute truth. As more AI systems are integrated into daily life, these gaps in reasoning could become harder to ignore.

In short, chatbots are getting better at speaking like humans, but not necessarily at thinking like them.