AI Hallucinated Facts: Detect and Stop Confident Wrong Answers

The model invented a citation, API method, or number with full confidence. Here is how to spot it and force grounded, verifiable answers on the first try.

Published: May 17, 2026 Updated: Jun 17, 2026 Author: AI Productivity Guide Team

You asked for a specific paper, a function signature, or last quarter’s revenue number, and the model returned a confident answer that was wrong. The citation does not resolve, the function does not exist in the SDK, or the number is off by an order of magnitude. When you pushed back, it doubled down with a more elaborate fake citation. Hallucination is not random noise. It concentrates in the gap between “needs a fact” and “has no source.”

Fastest fix: paste the source into the prompt and add the rule: If the answer is not in the text above, write "NOT IN SOURCE". That closes the gap the model would otherwise fill by inventing. If you cannot paste a source, turn on web search (ChatGPT, Claude, and Gemini can all ground answers in live results, which cuts hallucination by roughly 40-86% but does not eliminate it) and verify every named entity yourself.

Which bucket are you in?

Symptom	Likely cause	Go to
Niche/specific fact, no tool or pasted source	Guessing from training memory	Step 1, Step 2
Suspiciously round “complete” list (exactly 10, exactly 7)	Completion padding	Cause 2, Step 5
Version/price/model name is out of date	Training cutoff vs recent change	Step 2, Cause 3
Answer never admits uncertainty	No permission to say “I don’t know”	Step 3
Source pasted, but answer contradicts it	Long-context dilution	Step 1, Cause 5
Fake citation gets worse when you say “are you sure?”	Pushback confabulation	Step 6

Common causes

Ordered by frequency.

1. Asking for a fact without supplying or enabling retrieval

If the prompt is “What is the time complexity of Postgres GIN index lookups?” and the model has no web tool, no pasted doc, and no other grounding, its only option is to guess from training memory. The guess is often close but not exactly right, and the model will not say “I do not have the doc.” It produces an answer that sounds plausible.

How to spot it: your prompt names a specific entity (paper, function, number, date) and you have given the model no way to look it up.

2. Asking for a “complete” or “comprehensive” list

Prompts like “list all the React hooks” or “give me every Cloudflare Workers binding type” trigger completion behavior. The model writes 10 real items and 2 invented ones so the list looks exhaustive. Few real lists in training are exactly N items, so it pads.

How to spot it: output has suspiciously round counts (exactly 10, exactly 7), or the last items look generic.

3. Training cutoff vs. recent change

You ask about a library version, API change, or pricing tier that shipped after the model’s cutoff. It answers using the pre-cutoff state and presents it as current. As of June 2026 this bites hard on fast-moving facts: the OpenAI Python library is on openai 2.43.0 (released June 17, 2026) and has moved to the Responses API, yet many models still emit the old openai==1.x chat-completions pattern.

How to spot it: the topic changed in the last 6-12 months (versions, prices, model names, deprecations).

4. No instruction to flag uncertainty

Default behavior in chat models is to produce a fluent answer, not to admit ignorance. Without explicit permission such as “If you are not sure, say "I don't have enough information"”, the model fills gaps with plausible strings. Anthropic lists “allow Claude to say I don’t know” as its first recommended hallucination guardrail.

How to spot it: the answer never contains “I am not sure,” “I cannot verify,” or “based on training data which may be outdated.”
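The check above can be automated as a rough first filter. The sketch below is illustrative only: the phrase list is a hypothetical starting point, not an exhaustive catalogue, and a match says nothing about whether the answer is actually correct.

```python
import re

# Hypothetical phrase list; extend it with the wording your own prompts request.
UNCERTAINTY_PATTERNS = [
    r"\bI am not sure\b",
    r"\bI'm not sure\b",
    r"\bI cannot verify\b",
    r"\bI can't verify\b",
    r"\bI don't have enough information\b",
    r"\bmay be outdated\b",
    r"\bNOT IN SOURCE\b",
]


def admits_uncertainty(answer: str) -> bool:
    """Return True if the answer contains at least one uncertainty marker."""
    # Normalize curly apostrophes so "don’t" and "don't" both match.
    normalized = answer.replace("’", "'")
    return any(
        re.search(pattern, normalized, flags=re.IGNORECASE)
        for pattern in UNCERTAINTY_PATTERNS
    )
```

If a batch of fact-heavy answers never trips this check, that is a hint the prompt gives the model no room to say it does not know.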

5. Long context dilutes grounding

Even when you paste the source, if the prompt is long and the source is near the top, the model can invent claims that contradict the source by the time it generates the later sections. Attention across a long context is not uniform.

How to spot it: the source is in the prompt, but the answer contains claims not in the source.
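One cheap way to catch this is to compare the numbers in the answer against the numbers in the source. The following is a minimal sketch under that assumption; it only looks at numeric tokens, so it will miss invented names or reworded claims, and the function name is made up for illustration.

```python
import re

# Matches integers, decimals, and percentages such as 5, 2.9, 1,000, 40%.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def unsupported_numbers(source: str, answer: str) -> list[str]:
    """Return numeric tokens in the answer that never appear in the source."""
    source_numbers = set(NUMBER.findall(source))
    flagged: list[str] = []
    for token in NUMBER.findall(answer):
        if token not in source_numbers and token not in flagged:
            flagged.append(token)
    return flagged


# Placeholder texts, not taken from any real document.
source_text = "Refunds are processed within 5 business days. The fee is 2.9%."
answer_text = "Refunds take 5 days and the fee is 3.5%, capped at 30 days."
print(unsupported_numbers(source_text, answer_text))
```

Any token it prints is a claim to check by hand before trusting the answer.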

6. Pushback triggers confabulation

When you say “are you sure?”, the model often invents a stronger-sounding citation instead of backing down. Post-training tends to reward confident-sounding answers.

How to spot it: the second answer has a more specific (but still fake) citation than the first.

Before you change anything

Confirm whether the hallucination is reproducible or one-off; run the same prompt twice (Anthropic calls this best-of-N verification).
Write down the exact prompt, model, and any system prompt or grounding source.
Save the wrong output verbatim so you can compare against a corrected pass.
Note whether tool use (web search, code execution, file retrieval) was enabled.
Check whether the topic is time-sensitive (versions, pricing, news).
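If you handle these cases often, it can help to capture the checklist as a small record you save alongside the bad output. This is a sketch only; the class and field names are hypothetical and simply mirror the items above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HallucinationReport:
    """One reproducible case, captured before any prompt changes."""

    prompt: str
    model: str
    wrong_output: str  # saved verbatim for later comparison
    system_prompt: str = ""
    grounding_source: str = ""
    tools_enabled: list[str] = field(default_factory=list)  # e.g. ["web_search"]
    time_sensitive: bool = False
    reproduced_on_rerun: bool = False


def to_json(report: HallucinationReport) -> str:
    """Serialize the report so it can be stored next to the corrected pass."""
    return json.dumps(asdict(report), indent=2, ensure_ascii=False)
```

Storing the record as JSON makes it easy to diff the wrong output against the answer you get after applying the steps in the next section.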

Shortest fix path

Ordered by ROI.

Step 1: Paste the source and lock the model to it

Replace “What does the Stripe API return on a failed charge?” with the doc text plus a constrained instruction:

Source (Stripe docs, copied below):
<paste relevant section>

Question: What does Stripe return on a failed charge?
Rules:
- Use ONLY the text above. Do not use general knowledge.
- If the answer is not in the text, write "NOT IN SOURCE".
- Quote the exact line you used.

The “use only the text above” line is Anthropic’s “external knowledge restriction” technique, and the quote rule is its “verify with citations” technique. For long documents (over ~20k tokens), first ask the model to extract verbatim quotes, then answer using only those quotes. That two-step shape grounds the answer and fixes Cause 5.
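To apply the same template consistently, you could generate it from code and then check that the line the model quotes really occurs in the source. The sketch below assumes nothing about any model API; both function names are hypothetical, and the quote check is a plain substring match after collapsing whitespace.

```python
def build_grounded_prompt(source: str, question: str) -> str:
    """Wrap a pasted source and a question in the source-only rules."""
    return (
        "Source:\n"
        f"{source.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Rules:\n"
        "- Use ONLY the text above. Do not use general knowledge.\n"
        '- If the answer is not in the text, write "NOT IN SOURCE".\n'
        "- Quote the exact line you used.\n"
    )


def _squash(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(text.split())


def quote_is_in_source(quote: str, source: str) -> bool:
    """Return True if the quoted line appears verbatim in the source."""
    squashed = _squash(quote)
    return bool(squashed) and squashed in _squash(source)
```

A quote that fails `quote_is_in_source` is a sign the model paraphrased or invented its evidence, so treat the answer built on it as unverified.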

Step 2: Enable retrieval / tool use when you cannot paste a source

ChatGPT: if the Search tool is missing, go to Settings > Personalization > Advanced and toggle Web search on. In a chat, open the tools menu (the + / “View all tools”) and pick Search, or type / and choose Search. ChatGPT also auto-searches when a question looks time-sensitive.
Claude: turn on web search (it runs inside the reasoning loop and returns inline citations), or attach the file / use a connector so the answer is grounded in your document.
Gemini / API: use Grounding with Google Search; the response includes groundingMetadata with the queries, web results, and citations.
API (OpenAI): enable a search/web tool in the Responses API; for plain factual recall without a tool, treat the answer as a guess.

Grounding helps a lot but is not a cure. Vendor and benchmark figures put the reduction around 40% (Gemini grounding) up to 73-86% (web search) versus static inference, so you still verify the synthesized claims.
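When you call a grounded model through an API, the metadata tells you which sentences are actually backed by a search result. The sketch below walks a Gemini-style response that has already been parsed into a dict. The field names (`groundingChunks`, `groundingSupports`, `groundingChunkIndices`, `segment.text`, `web.uri`) are assumptions about the JSON shape and should be checked against a real response; the sample data is invented.

```python
def grounded_segments(response: dict) -> list[tuple[str, list[str]]]:
    """Pair each supported text segment with the URIs of the chunks backing it."""
    candidates = response.get("candidates") or [{}]
    meta = candidates[0].get("groundingMetadata", {})
    chunks = meta.get("groundingChunks", [])
    pairs = []
    for support in meta.get("groundingSupports", []):
        text = support.get("segment", {}).get("text", "")
        uris = [
            chunks[i].get("web", {}).get("uri", "")
            for i in support.get("groundingChunkIndices", [])
            if 0 <= i < len(chunks)  # skip indices that point outside the chunk list
        ]
        pairs.append((text, uris))
    return pairs


# Invented sample in the assumed shape, for illustration only.
sample = {
    "candidates": [{
        "groundingMetadata": {
            "groundingChunks": [{"web": {"uri": "https://example.com/a", "title": "A"}}],
            "groundingSupports": [
                {"segment": {"text": "Example claim."}, "groundingChunkIndices": [0]}
            ],
        }
    }]
}
print(grounded_segments(sample))
```

Sentences in the answer that never show up in this list have no attached source and deserve the same scrutiny as an ungrounded answer.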

Step 3: Add an explicit uncertainty token

Append to every fact-heavy prompt:

For each non-trivial claim, append a confidence tag:
- [VERIFIED] if you can quote a source in this conversation.
- [LIKELY] if it matches your training but you cannot cite.
- [UNCERTAIN] if you are guessing.
Refuse to produce [VERIFIED] without a quoted source.

This converts silent hallucination into visible uncertainty.
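The tags are only useful if you check them. A minimal sketch of that check follows, assuming the model puts one claim per line and that a quote is wrapped in straight or curly double quotes; the function name and bucket labels are hypothetical.

```python
import re

TAG = re.compile(r"\[(VERIFIED|LIKELY|UNCERTAIN)\]")
QUOTE = re.compile(r'"[^"]+"|“[^”]+”')


def audit_tags(answer: str) -> dict[str, list[str]]:
    """Sort answer lines by confidence tag and flag rule violations."""
    result: dict[str, list[str]] = {
        "VERIFIED": [],
        "LIKELY": [],
        "UNCERTAIN": [],
        "UNTAGGED": [],
        "VERIFIED_WITHOUT_QUOTE": [],
    }
    for raw_line in answer.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = TAG.search(line)
        if match is None:
            result["UNTAGGED"].append(line)
        elif match.group(1) == "VERIFIED" and not QUOTE.search(line):
            # The prompt forbids [VERIFIED] without a quoted source.
            result["VERIFIED_WITHOUT_QUOTE"].append(line)
        else:
            result[match.group(1)].append(line)
    return result
```

Lines that land in `UNTAGGED` or `VERIFIED_WITHOUT_QUOTE` broke the prompt's rules, so they are the first ones to verify or send back.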

Step 4: For code, demand runnable artifacts and pin the version

Replace “how do I use the OpenAI Python SDK to stream a response” with:

Write a runnable Python script using the current OpenAI SDK (openai 2.x, Responses API).
Use client.responses.create(..., stream=True) and iterate the streamed events.
Include the exact import line.
At the end, list every method, class, and parameter you used, and mark each as
VERIFIED-FROM-DOCS or UNCERTAIN.

Then run it. A crash on AttributeError or ImportError usually means the model invented a method or module. Pinning the version (e.g. openai 2.43.0) makes it less likely to regenerate a deprecated 1.x pattern.
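Before running the whole script, you can pre-check the names the model listed at the end of its answer. The sketch below uses only the standard library and a hypothetical helper name. It has limits: importing a package executes its top-level code, and attributes created only inside `__init__` will not be found on the class.

```python
import importlib


def name_exists(dotted_path: str) -> bool:
    """Return True if 'package.module.attr' resolves to a real object."""
    if not dotted_path:
        return False
    parts = dotted_path.split(".")
    # Find the longest importable module prefix, then walk the rest as attributes.
    for cut in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:cut]))
        except ImportError:
            continue
        for attr in parts[cut:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False


# Standard-library names stand in for whatever the model claimed to use.
for claimed in ["json.dumps", "json.dumps_fast", "os.path.join"]:
    print(claimed, name_exists(claimed))
```

A `False` here is the same signal as the AttributeError or ImportError, caught before the script runs.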

Step 5: Cross-check every named entity

For every paper title, function name, URL, or number in the output, verify against a primary source: official docs, the paper page, the GitHub repo, the company changelog. Treat the model’s confidence as zero signal. For “complete list” answers (Cause 2), check the count against the real source rather than trusting the round number.
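For list answers, the cross-check reduces to a set comparison once you have copied the real list from the primary source. A minimal sketch with a hypothetical function name; the "official" list in the usage lines is a stand-in you would replace with the list pasted from the docs.

```python
def cross_check(claimed: list[str], official: list[str]) -> dict[str, list[str]]:
    """Compare the model's list against a list copied from the primary source."""
    claimed_set = {name.strip() for name in claimed}
    official_set = {name.strip() for name in official}
    return {
        "invented": sorted(claimed_set - official_set),  # in the answer, not in the docs
        "missing": sorted(official_set - claimed_set),   # in the docs, not in the answer
    }


# Stand-in data: paste the real list from the official docs instead.
model_list = ["useState", "useEffect", "useTelepathy"]
docs_list = ["useState", "useEffect", "useMemo"]
print(cross_check(model_list, docs_list))
```

Matching is exact and case-sensitive on purpose, since identifiers such as function names differ by case.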

Step 6: Do not argue. Restart with grounding.

If the model gives a fake citation and you say “are you sure?”, you will likely get a more elaborate fake. Instead, start a fresh turn:

I cannot find X. Paste the exact URL or section number where you read this.
If you cannot, mark it UNCERTAIN and stop.

How to confirm it’s fixed

Every non-trivial claim in the output has a quotable source or an UNCERTAIN tag.
Running the code raises no AttributeError or ImportError, and every URL it calls or cites resolves (no 404s).
A second model or a search engine corroborates the facts.
Re-running the same prompt twice produces the same factual claims (best-of-N agreement).
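The agreement check in the last item can be approximated by comparing the hard facts across runs. The sketch below treats numbers, version strings, and URLs as fact tokens, which is an assumption; it will not notice disagreement in names or prose, and the function names are hypothetical.

```python
import re

# URLs first, so digits inside a URL are not counted as separate numbers.
FACT_TOKEN = re.compile(r"https?://[^\s)\"'>]+|\d+(?:\.\d+)*%?")


def fact_tokens(text: str) -> set[str]:
    """Extract numbers, version strings, and URLs from one answer."""
    return {token.rstrip(".,;:") for token in FACT_TOKEN.findall(text)}


def disputed_tokens(runs: list[str]) -> set[str]:
    """Return tokens that appear in some runs but not in all of them."""
    token_sets = [fact_tokens(run) for run in runs]
    if not token_sets:
        return set()
    return set.union(*token_sets) - set.intersection(*token_sets)
```

An empty result means the runs agree on those tokens, not that the tokens are true, so agreement still needs the primary-source check from Step 5.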

If it’s still broken

Switch to a retrieval/grounded model: ChatGPT with web search on, Claude with web search, Gemini with Grounding, or Perplexity.
Reduce scope: break “tell me everything about X” into 5 small factual questions, each independently verifiable.
For domain-specific facts (legal, medical, financial), do not use a general chat model; use a domain tool that cites primary sources.
If the topic post-dates the cutoff, accept that any unverified answer is a guess and require an external lookup.

FAQ

Why does the AI invent a worse citation when I ask “are you sure?”

Pushback reads as a request for a more confident answer, not a request to recheck. The model produces a more specific (still fake) citation. Start a new turn and demand the exact URL or section number instead of arguing (Step 6).

Does turning on web search stop hallucination completely?

No. Grounding cuts it substantially (roughly 40% for Gemini grounding, up to ~73-86% with web search per 2026 benchmarks), but the model still synthesizes across sources and can misread or over-generalize. Always open the cited links for high-stakes facts.

Why did the model give me old library code that no longer runs?

Training cutoff (Cause 3). For example, it may emit the deprecated openai==1.x chat-completions pattern when the current library is openai 2.43.0 on the Responses API. Pin the exact version in the prompt and run the code (Step 4).

I pasted my document, but the model still made up a fact in the answer. Why?

Long-context dilution (Cause 5). For documents over ~20k tokens, first ask it to extract verbatim quotes, then have it answer using only those quotes, under the rule “Use ONLY the text above” (Step 1).

Which models hallucinate least in 2026?

2026 factual-accuracy benchmarks put the Claude family at the low end of hallucination rates, but no model is reliable on niche or post-cutoff facts without retrieval. Model choice matters less than grounding and verification.

How do I make answers auditable for a team?

Require a quoted source per claim, then a self-check pass: the model finds a supporting quote for each claim and retracts any it cannot support (Anthropic’s “verify with citations” pattern). Keep the confidence tags from Step 3.

Prevention

Keep a default suffix: “Tag each non-trivial claim [VERIFIED] / [LIKELY] / [UNCERTAIN].”
For code prompts, always pin the library and version.
Use grounded models for fact-heavy work; use chat models for reasoning and writing.
Never accept a model’s pushback citation without independent verification.
Build the habit of running the code, resolving the URL, and opening the cited page before trusting any output.

External references: Anthropic — Reduce hallucinations and Gemini API — Grounding with Google Search.
