Used ChatGPT, Claude, or Gemini for anything that mattered? You have hit a hallucination, whether or not you caught it. The model tells you a book exists when it doesn't. It cites a court case that never happened. It puts a quote in someone's mouth that they never said. In 2023 a New York lawyer was sanctioned for filing a brief with six fabricated case citations that ChatGPT produced and then doubled down on when asked if they were real (Mata v. Avianca, S.D.N.Y.). The pattern has repeated in courts on three continents since.
The argument of this page is narrow and load-bearing: hallucination is not a defect on top of a working system, it is the system working as designed. Once you see why, you stop treating "is this true?" as the model's job and start treating it as yours.
Newer models hallucinate less, and that is the trap
Frontier models in 2025 and 2026 fabricate noticeably less than the 2023 generation, mostly because they were trained to say "I'm not sure" more often and because search-grounded modes retrieve real text instead of recalling it. None are at zero, and the improvement is itself a hazard: a model that is right almost all of the time trains you to stop checking, which is exactly when the occasional miss costs you. The fabrications also got more fluent. Early models produced obviously broken citations; current ones produce citations with plausible volume numbers, real-sounding judges, and correct formatting that happen to point at nothing.
Why it happens: the mechanism under the chat box
One fact explains all of it. A language model has no separate step where it asks "is this true?" Truth is not a variable it tracks.
Ask "what was the title of Stephen King's first novel?" and nothing gets looked up. No database query fires. The model converts the question into numbers, runs them through billions of weights, and emits a high-probability next token, then the next, one token at a time. Usually the most probable answer is also the correct one, because the phrase "Stephen King's first novel" sits next to "Carrie" across millions of training documents. The statistics and the facts happen to agree. The model did not know that. It got lucky in a way that is reliable on common questions and unreliable on rare ones.
Now ask something thin in the training data: a minor person's birthday, a specific clause in an obscure contract, the citation for a niche ruling. The probability mass spreads out. No single continuation dominates, so the model picks the most plausible-sounding one and commits to it with the identical fluency it uses for "Carrie." There is no internal signal that says "this region of the answer was a guess." The confidence is uniform because the generation process is uniform. That uniformity is the whole problem: the model cannot flag the parts it made up, because it does not represent the difference.
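The contrast between the two questions can be sketched with a toy next-token step. This is an illustration under loud assumptions, not how any particular model is implemented: the candidate answers and their scores are invented numbers, the helper names (`softmax`, `entropy_bits`, `pick`) are hypothetical, and real systems usually sample from the distribution rather than always taking the top entry.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Spread of the distribution: 0 bits means one option dominates."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pick(candidates, logits):
    """Return (chosen text, its probability, entropy of the distribution)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best], probs[best], entropy_bits(probs)

# Hypothetical scores for a well-covered question: one continuation dominates.
common = pick(["Carrie", "It", "Misery", "Cujo"], [9.0, 2.0, 1.5, 1.0])

# Hypothetical scores for a thinly covered question: the mass is spread out.
rare = pick(["March 3", "June 14", "May 9", "Oct 21"], [1.2, 1.1, 1.0, 0.9])

for label, (text, prob, spread) in (("common", common), ("rare", rare)):
    print(f"{label}: answer={text!r} p={prob:.2f} entropy={spread:.2f} bits")
```

The same `pick` call runs in both cases and hands back a clean string either way. The spread lives in the distribution, not in the words that come out, so the reader of the answer sees "March 3" delivered exactly as crisply as "Carrie".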
The mental model that sticks: a colleague who has read every book in a library and can never walk back to a shelf to check. On famous topics they are flawless. On an obscure footnote they produce a fluent, well-formed, completely invented answer in the same calm voice, in good faith, with no tell. Retrieval-based tools change this by making the model read a real document before answering, which is why they hallucinate less. They do not fix the predictor; they just hand it the shelf.
Four ways an answer goes wrong, four different fixes
Calling everything a "hallucination" hides the fix. Four failure types look identical from the outside and each one has a different countermeasure. Misdiagnose the type and you apply the wrong fix.
- Hallucination. The model invented a fact because the real one was never in its training data in usable form. Fix: ground it. Provide the source text, or use a tool that retrieves real documents before answering.
- Stale knowledge. The fact is real but postdates the training cutoff. The model is not guessing; it is correctly reporting a world that has since changed. Fix: turn on web search or paste the current information in.
- Reasoning error. The model had the right facts and the right task but slipped a step in the chain (an arithmetic carry, a logic inversion). Fix: ask it to work step by step, or hand the step to a calculator or code tool.
- Bias or unsafe answer. The output reflects a skew in the training data rather than a missing fact. Fix: the real repair belongs to the vendor. On your side, you can rephrase, push back, or pick a model with stronger alignment work behind it.
The tell that separates the first and third: a hallucination survives rephrasing the question (the gap is in the data); a reasoning error usually does not (the gap is in the steps). Spot the type and the fix picks itself.
Who this actually bites, and how hard
The cost of a hallucination scales with how irreversible the action it feeds is.
- Lawyers, doctors, financial advisors. A fabricated citation or dosage is an unrecoverable error with a name attached. Cite from the primary source you opened yourself, never from the model's rendering of it. This is the Mata v. Avianca lesson, and there the price was court sanctions.
- Journalists and writers. Quotes, dates, titles, and attributions are the exact category models fake most cleanly, because well-formed fakes are statistically easy. Treat every model-supplied quote as unverified until you find it at the source.
- Researchers and analysts. A confident wrong number that flows into a deck is worse than no number, because it looks done. Verify before it propagates.
- Teachers and students. A model is a strong tutor and a poor authority. Use it to explain, not to settle.
How to catch one before it costs you
Three habits, strongest first.
- Demand a specific, checkable source, then check it. A vague gesture at "the literature" or "studies" is not a source. A paper title, a page, a URL, a case number is. Then open the ones it gives you, because the citation itself is a favorite hallucination target. A real-looking link to a page that does not say what the model claimed is one of the most common failures in practice.
- Search the exact string. Paste the suspicious quote or fact verbatim into a search engine. A real quote has a trail. A fabricated one returns nothing, or returns only the model's own output echoed back. Silence is the signal.
- For anything consequential, make the model read before it answers. Retrieval-grounded chat, RAG (retrieval-augmented generation), or an MCP (Model Context Protocol) server pointed at your real documents forces the answer to come from text the model just looked at, not from memory it is reconstructing. This does not eliminate hallucination; it shrinks the surface and makes the remaining errors auditable, because you can check the cited passage.
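The last two habits can be sketched in a few lines: hand the model the source text, then check that any passage it quotes back actually appears there. This is a minimal sketch under assumptions, not a real retrieval pipeline. The function names, the prompt wording, and the sample lease text are all hypothetical, and the model call itself is left out; the point is only the shape of the audit step.

```python
import re

def normalize(text):
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()

def quote_is_in_source(quote, source_text):
    """True only if the quoted passage appears verbatim in the source."""
    return normalize(quote) in normalize(source_text)

def build_grounded_prompt(question, source_text):
    """Hypothetical prompt wording that asks for an answer tied to the source."""
    return (
        "Answer using only the source below. "
        "Quote the passage you relied on. "
        "If the source does not contain the answer, say so.\n\n"
        f"SOURCE:\n{source_text}\n\nQUESTION: {question}"
    )

# Invented sample document standing in for a real contract.
source = (
    "The lease may be terminated by either party\n"
    "with ninety days' written notice."
)

prompt = build_grounded_prompt("How much notice is required?", source)

# Two passages a model might claim to be quoting (both invented here).
real_quote = "terminated by either party with ninety days\u2019 written notice"
fake_quote = "terminated by the tenant with thirty days' notice"

print(quote_is_in_source(real_quote, source))
print(quote_is_in_source(fake_quote, source))
```

An exact-match check is deliberately strict: a paraphrase fails it, which is the right default when the claim is "the document says this." It only confirms that the words exist in the source, not that the model's conclusion follows from them, so the reading step still belongs to you.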
Where this leaves you
A model that hallucinates is not broken, and the fix is not a better model. The fix is a workflow that assumes the predictor will sometimes guess and puts a verification step on the outputs that carry consequences. Treat AI as a fast first draft and a sharp thinking partner. Do not treat it as the last step before an irreversible action. That single boundary is the entire skill.
For the layer underneath this, our explainer on how AI actually works walks through the next-word mechanism in more depth. For the tools that shrink the hallucination surface by grounding answers in real data, see our MCP roundup.
One low-friction way to try grounded answers is an AI agent that reads your own documents before it responds. Lindy is among the easier ones to stand up for that, and pulling from real data is exactly the move that cuts the confident guessing.