You asked for a research summary with citations. The model gave a fluent answer ending with “Smith et al. (2019). Effects of microbreaks on cognitive performance. Journal of Occupational Psychology, 42(3), 287-301.” Looks real. You search the journal: no such paper. The DOI is fake. Or worse, the URL the model gave you points to a real domain but a path that returns 404. This is one of the most damaging hallucination failure modes because the wrong-but-plausible output passes human review until someone tries to verify it.
Citation hallucination is structural, not random. Models trained on text that often includes citations learn that “claims should be followed by citation-shaped strings.” When the model doesn’t know a real citation, it generates a citation-shaped string anyway. You can’t prompt it away — you have to remove the situation where the model needs to invent.
Common causes
1. Prompt asks for citations from base model with no retrieval
“Summarize the research on X and cite sources.” The model has no access to a database. It generates plausible-sounding citations from training-distribution patterns. Most will be partially or fully fake.
How to spot it: Run any 5 of the model’s citations through Google Scholar. If 0-1 are real, this is the bug.
2. RAG retrieved nothing but model proceeded anyway
Your retrieval step returned no documents (empty result, or all below similarity threshold). The model received no context but still answered with confidence — and invented citations to back the confidence.
How to spot it: Log the retrieved chunks. If the RAG layer returned [] and the model still cited sources, the model fabricated them.
3. Retrieved chunks don’t contain the citation but model emits one anyway
RAG fetched 3 paragraphs about microbreaks. None contains a citation. Model still ends the answer with “(Smith, 2019)” to seem rigorous.
How to spot it: Search the retrieved chunks for the cited string. Not found means model fabricated.
4. Model conflates real authors with wrong papers
Author John Smith is real. Wrote on a different topic in 2017. Model attributes a 2019 paper on a similar-sounding topic to him. Author is real, paper isn’t.
How to spot it: Author name returns Google Scholar hits but not for the cited paper title.
5. URL hallucinations from confident pattern-matching
“Source: https://stackoverflow.com/questions/1234567/how-to-foo” — the domain is real, slug pattern is right, but the question doesn’t exist. Model generated URLs that match the format.
How to spot it: Click URLs. 404 or “no results” = fabricated.
6. DOI fabrication
Model emits https://doi.org/10.1234/jop.2019.42.3.287. The format is valid and the registrant prefix may even be real, but the DOI doesn’t resolve.
How to spot it: Paste DOI in doi.org. “DOI not found” = fake.
7. Cited author names follow generic patterns
The model defaults to common English surnames: Smith, Jones, Brown, Williams. Real research in a specialized field often has authors with very different surname distributions.
How to spot it: A bibliography heavy on generic English surnames in a non-English-language field is suspicious.
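Causes 2 and 3 can be checked offline from your logs. The following is a minimal sketch, assuming the retrieved chunks are logged as plain strings and the model writes author-year citations such as "(Smith, 2019)". The function name `diagnose_citations` and the regex are illustrative assumptions, and the substring match is deliberately crude: it flags candidates for manual review rather than proving fabrication.

```python
import re

# Assumed citation shape: "(Smith, 2019)" or "(Smith et al., 2019)".
AUTHOR_YEAR = re.compile(r"\(([A-Z][\w'-]+)(?: et al\.)?,? (\d{4})\)")


def diagnose_citations(answer, retrieved_chunks):
    """Flag author-year citations that the retrieved context does not support."""
    corpus = "\n".join(retrieved_chunks)
    citations, ungrounded = [], []
    for match in AUTHOR_YEAR.finditer(answer):
        surname, year = match.group(1), match.group(2)
        citations.append(match.group(0))
        # Grounded only if both the surname and the year occur in the context.
        if surname not in corpus or year not in corpus:
            ungrounded.append(match.group(0))
    return {
        "empty_retrieval": len(retrieved_chunks) == 0,  # cause 2
        "citations": citations,
        "ungrounded": ungrounded,  # cause 3
    }
```

If `empty_retrieval` is true and `citations` is non-empty, you are looking at cause 2. A non-empty `ungrounded` list with non-empty retrieval points to cause 3.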
Shortest path to fix
Step 1: Don’t ask for citations from a base model without retrieval
This is the single biggest fix. If the model has no document store and you ask it to cite, expect 80%+ hallucination rate.
BAD: "Summarize research on X. Include 5 academic citations."
GOOD: "Summarize the topic of X based on your general knowledge.
Do NOT include citations or URLs. Mark anything specific
as 'based on common knowledge, please verify.'"
Step 2: Use RAG and constrain to retrieved sources
You will receive 3 document excerpts below.
Answer the user's question using ONLY these excerpts.
For every claim, cite which excerpt it came from: [1], [2], [3].
If excerpts don't cover something, say "Not in provided sources."
Never invent a citation. Never reference a source not in the excerpts.
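The negative constraint in the last line is critical: it gives the model an explicit alternative to inventing a source when the excerpts fall short.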
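One way to assemble that prompt in code is sketched below. It assumes the excerpts arrive as a list of strings, and it returns `None` when retrieval came back empty, so the caller can answer "no sources found" instead of calling the model at all (cause 2). The function name `build_grounded_prompt` is hypothetical.

```python
from typing import Optional


def build_grounded_prompt(question: str, excerpts: list) -> Optional[str]:
    """Return a prompt restricted to numbered excerpts, or None if there are none."""
    if not excerpts:
        return None  # nothing retrieved: do not let the model answer unsupported
    numbered = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(excerpts, start=1)
    )
    labels = ", ".join(f"[{i}]" for i in range(1, len(excerpts) + 1))
    return (
        f"You will receive {len(excerpts)} document excerpts below.\n"
        "Answer the user's question using ONLY these excerpts.\n"
        f"For every claim, cite which excerpt it came from: {labels}.\n"
        "If excerpts don't cover something, say \"Not in provided sources.\"\n"
        "Never invent a citation. Never reference a source not in the excerpts.\n\n"
        f"Excerpts:\n{numbered}\n\nQuestion: {question}"
    )
```

Numbering the excerpts in code rather than by hand keeps the labels in the instructions and the labels on the excerpts in sync, which matters once you validate cited IDs afterwards.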
Step 3: Validate every URL and DOI programmatically
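import re
import requests

def resolves(url):
    # True if the URL answers with a non-error status.
    try:
        r = requests.head(url, timeout=5, allow_redirects=True)
        if r.status_code == 405:  # some servers reject HEAD; retry with GET
            r = requests.get(url, timeout=5, stream=True)
        return r.status_code < 400
    except requests.RequestException:
        return False

def validate_citations(text):
    # Trailing punctuation is stripped because the patterns swallow it.
    urls = [u.rstrip('.,;') for u in re.findall(r'https?://[^\s)]+', text)]
    dois = [d.rstrip('.,;') for d in re.findall(r'10\.\d+/[^\s)]+', text)]
    # doi.org links are checked once, through the DOI list.
    urls = [u for u in urls if 'doi.org/' not in u]
    bad = [u for u in urls if not resolves(u)]
    bad += [d for d in dois if not resolves(f"https://doi.org/{d}")]
    return bad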
Reject responses that contain unreachable citations.
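How "reject" is wired depends on your stack. A minimal sketch of a reject-and-retry loop follows, assuming you pass in two callables of your own: `generate` (prompt in, model text out) and `validate` (text in, list of bad citations out). Both names, and `generate_with_validation` itself, are hypothetical.

```python
from typing import Callable, Optional


def generate_with_validation(
    generate: Callable[[str], str],
    validate: Callable[[str], list],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first response whose citations all validate, else None."""
    for _ in range(max_attempts):
        response = generate(prompt)
        bad = validate(response)
        if not bad:
            return response
        # Tell the model which citations failed instead of retrying blindly.
        prompt = (
            f"{prompt}\n\nYour previous answer cited sources that could not be "
            f"verified: {bad}. Remove them or say you have no source."
        )
    return None
```

Returning `None` after the last attempt leaves the fallback to the caller, for example showing the answer with citations stripped or an explicit "could not verify sources" message.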
Step 4: For RAG, validate citation tokens against retrieved corpus
# extract_citation_ids is a helper you supply; it returns the set of source ids cited in the output.
allowed_sources = {chunk['id'] for chunk in retrieved}
cited = extract_citation_ids(model_output)
fake = cited - allowed_sources
if fake:
    raise ValueError(f"Model cited sources not in corpus: {fake}")
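The check above leaves `extract_citation_ids` undefined. One possible implementation is sketched here, assuming chunk ids are integers and the model cites with bracketed markers such as [1] or [2, 3], as in the Step 2 prompt. If your ids are strings or UUIDs, the pattern needs to change accordingly.

```python
import re


def extract_citation_ids(model_output: str) -> set:
    """Collect the numeric ids from bracketed markers such as [1] or [2, 3]."""
    ids = set()
    for group in re.findall(r"\[(\d+(?:\s*,\s*\d+)*)\]", model_output):
        ids.update(int(n) for n in group.split(","))
    return ids


# Usage with an assumed retrieval result of two chunks:
retrieved = [{"id": 1, "text": "..."}, {"id": 2, "text": "..."}]
allowed_sources = {chunk["id"] for chunk in retrieved}
fake = extract_citation_ids("Claim [1] and claim [4].") - allowed_sources
```

Here `fake` contains the id 4, because the model cited a marker that retrieval never produced.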
Step 5: Switch to a tool-using agent for real citations
Modern setups: give the model a search() or fetch_paper() tool. It calls the tool, receives real results, and quotes from real text. No tool → no citation.
tools = [{"name": "search_papers", ...}, {"name": "fetch_url", ...}]
# Model can ONLY cite what a tool returned
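Enforcing "only cite what a tool returned" needs a record of what the tools actually returned. A minimal sketch of such a record follows, assuming each tool result is a dict with a stable "id" key (a URL, DOI, or database key). The class `SourceLedger` and its method names are hypothetical and not part of any agent framework.

```python
class SourceLedger:
    """Records every source a tool returned so citations can be checked against it."""

    def __init__(self):
        self._sources = {}

    def record(self, tool_name: str, results: list) -> None:
        # Each result is assumed to carry a stable "id" key.
        for item in results:
            self._sources[item["id"]] = {"tool": tool_name, **item}

    def unsupported(self, cited_ids: list) -> list:
        """Return cited ids that no tool call ever produced."""
        return [cid for cid in cited_ids if cid not in self._sources]
```

Call `record` after every tool invocation in the agent loop, then run `unsupported` on the ids cited in the final answer and reject the answer if the list is non-empty.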
Step 6: Make hallucinated citations expensive in the prompt
IMPORTANT: If you fabricate a citation, the user will lose
trust in this entire system. Saying "I don't have a real source"
is always better than inventing one.
Stating consequences sometimes reduces the fabrication rate on aligned models, but never rely on this alone. Combine it with Step 2 and Step 4.
Step 7: Surface citations as separate verifiable claims in UI
Don’t render citations as plain text. Each citation should be a clickable link that opens the source — and if the link 404s, show an error badge. Forcing verifiability in the UI is the last line of defense.
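As an illustration, the rendering step could look like the sketch below. It assumes a server-side renderer that already knows whether each link resolved (for example from the Step 3 check). The function name and the CSS class names are made up for this example.

```python
import html


def render_citation(label: str, url: str, reachable: bool) -> str:
    """Render one citation as a link, with an error badge if the check failed."""
    link = (
        f'<a href="{html.escape(url, quote=True)}" rel="noopener">'
        f"{html.escape(label)}</a>"
    )
    if reachable:
        return link
    # Keep the link visible so the reader can see what was claimed.
    return f'{link} <span class="citation-badge citation-badge--error">unverified</span>'
```

Escaping both the label and the URL matters here because both come from model output and should be treated as untrusted input.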
When this is not on you
Base LLMs are trained on a corpus that includes millions of citations. They internalize “claims need citations” as a pattern. Even with strong instructions, top models still hallucinate 20-40% of citations without retrieval. RAG is non-optional for any product that requires accurate sourcing.
Easy to misdiagnose as
A “model knowledge gap” — as in, “this model just doesn’t know recent papers.” It does know there are papers; it doesn’t have a way to ground specific ones. Knowledge is fine; verifiability is not.
Prevention
- Never ask base models for citations without retrieval.
- Always constrain output to retrieved corpus with explicit IDs.
- Validate every URL and DOI in post-processing; reject bad responses.
- For paper-grade output, switch to tool-using agents with search() tools.
- Surface citations as clickable links so fakes get caught at click-time.
- Log hallucination rate per model/prompt template and track over time.
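The last item can start very small. Below is a minimal in-memory sketch, assuming you already count checked and fabricated citations per response (for example from the Step 3 and Step 4 validators). The class name `HallucinationTracker` is hypothetical, and a production version would write to your metrics store instead of a dict.

```python
from collections import defaultdict
from typing import Optional


class HallucinationTracker:
    """Counts checked and fabricated citations per (model, prompt template)."""

    def __init__(self):
        self._counts = defaultdict(lambda: {"checked": 0, "fake": 0})

    def record(self, model: str, template: str, checked: int, fake: int) -> None:
        entry = self._counts[(model, template)]
        entry["checked"] += checked
        entry["fake"] += fake

    def rate(self, model: str, template: str) -> Optional[float]:
        """Fraction of checked citations that were fake, or None if none checked."""
        entry = self._counts.get((model, template))
        if not entry or entry["checked"] == 0:
            return None
        return entry["fake"] / entry["checked"]
```

Keying on both model and template lets you see whether a prompt change or a model upgrade moved the rate.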
FAQ
- Will saying “only cite real sources” fix this? No. The model has no reliable internal signal for whether a citation it generates is real, so it treats fabricated ones as real. Instruction-only fixes have low effectiveness here.
- Are some models worse than others? Yes — smaller open-weight models hallucinate citations at much higher rates. But even top-tier models hallucinate without retrieval.