Model Invented Fake Citations and URLs

The model produced a citation like “Smith et al. (2019), Journal of XYZ”, and the paper does not exist. Or it linked to a URL that returns 404. This page explains why citation hallucination happens and how to stop it.

You asked for a research summary with citations. The model gave a fluent answer ending with “Smith et al. (2019). Effects of microbreaks on cognitive performance. Journal of Occupational Psychology, 42(3), 287-301.” Looks real. You search the journal: no such paper. The DOI is fake. Or worse, the URL the model gave you points to a real domain but a path that returns 404. This is one of the most damaging hallucination failure modes because wrong-but-plausible output looks credible to readers until someone tries to verify it.

Citation hallucination is structural, not random. Models trained on text that often includes citations learn that “claims should be followed by citation-shaped strings.” When the model doesn’t know a real citation, it generates a citation-shaped string anyway. You can’t reliably prompt it away; you have to remove the situation where the model needs to invent.

Common causes

1. Prompt asks for citations from base model with no retrieval

“Summarize the research on X and cite sources.” The model has no access to a database. It generates plausible-sounding citations from training-distribution patterns. Most will be partially or fully fake.

How to spot it: Run any 5 of the model’s citations through Google Scholar. If 0-1 are real, this is the bug.

2. RAG retrieved nothing but model proceeded anyway

Your retrieval step returned no documents (an empty result, or every hit below the similarity threshold). The model received no context but still answered with confidence, and invented citations to back that confidence up.

How to spot it: Log the retrieved chunks. If the RAG layer returned [] and the model still cited sources, the model fabricated them.
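One way to close this gap is to refuse before the model is ever called. The sketch below is a minimal guard under stated assumptions: each chunk is a dict with a `score` field, `generate` stands in for whatever function calls your model, and the `0.75` threshold and the refusal message are illustrative values, not recommendations.

```python
from typing import Callable

NO_SOURCES_MESSAGE = "Not in provided sources."

def answer_with_guard(
    question: str,
    chunks: list[dict],
    generate: Callable[[str, list[dict]], str],
    min_score: float = 0.75,  # assumed threshold; tune for your retriever
) -> str:
    """Call the model only when retrieval produced usable context."""
    # Keep only chunks at or above the similarity threshold.
    usable = [c for c in chunks if c.get("score", 0.0) >= min_score]
    if not usable:
        # Nothing to ground on: return a fixed refusal instead of calling the model.
        return NO_SOURCES_MESSAGE
    return generate(question, usable)
```

Because the guard returns early, an empty retrieval result can never reach the model and trigger a confident, unsourced answer.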

3. Retrieved chunks don’t contain the citation but model emits one anyway

RAG fetched 3 paragraphs about microbreaks. None contains a citation. Model still ends the answer with “(Smith, 2019)” to seem rigorous.

How to spot it: Search the retrieved chunks for the cited string. Not found means model fabricated.
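The search can be automated for simple cases. The following sketch assumes parenthetical author-year citations such as “(Smith, 2019)” and treats a citation as grounded only if both the surname and the year appear somewhere in the retrieved text. The function name and the regular expression are illustrative; other citation styles would need their own patterns.

```python
import re

# Matches "(Smith, 2019)" and "(Smith et al., 2019)". Other styles are not covered.
CITATION_RE = re.compile(r"\(([A-Z][A-Za-z'\-]+)(?: et al\.)?,? (\d{4})\)")

def ungrounded_citations(answer: str, chunks: list[str]) -> list[str]:
    """Return author-year citations whose surname or year is absent from the chunks."""
    corpus = " ".join(chunks)
    missing = []
    for match in CITATION_RE.finditer(answer):
        surname, year = match.group(1), match.group(2)
        # A plain substring test is crude, but it catches citations with no basis at all.
        if surname not in corpus or year not in corpus:
            missing.append(match.group(0))
    return missing
```

This is a coarse filter: it can miss a fabricated citation when the surname and year happen to appear in the chunks for unrelated reasons, so treat a clean result as necessary, not sufficient.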

4. Model conflates real authors with wrong papers

Author John Smith is real. Wrote on a different topic in 2017. Model attributes a 2019 paper on a similar-sounding topic to him. Author is real, paper isn’t.

How to spot it: Author name returns Google Scholar hits but not for the cited paper title.

5. URL hallucinations from confident pattern-matching

“Source: https://stackoverflow.com/questions/1234567/how-to-foo” — the domain is real, slug pattern is right, but the question doesn’t exist. Model generated URLs that match the format.

How to spot it: Click URLs. 404 or “no results” = fabricated.

6. DOI fabrication

Model emits https://doi.org/10.1234/jop.2019.42.3.287. The format is valid and the prefix may even belong to a real publisher, but the DOI doesn’t resolve.

How to spot it: Paste DOI in doi.org. “DOI not found” = fake.

7. Cited author names follow generic patterns

The model defaults to common English surnames: Smith, Jones, Brown, Williams. Real research in a specialized field often has authors with very different surname distributions.

How to spot it: A bibliography heavy on generic English surnames in a non-English-language field is suspicious.

Shortest path to fix

Step 1: Don’t ask for citations from a base model without retrieval

This is the single biggest fix. If the model has no document store and you ask it to cite, expect an 80%+ hallucination rate.

BAD:  "Summarize research on X. Include 5 academic citations."
GOOD: "Summarize the topic of X based on your general knowledge.
       Do NOT include citations or URLs. Mark anything specific
       as 'based on common knowledge, please verify.'"

Step 2: Use RAG and constrain to retrieved sources

You will receive 3 document excerpts below.
Answer the user's question using ONLY these excerpts.
For every claim, cite which excerpt it came from: [1], [2], [3].
If excerpts don't cover something, say "Not in provided sources."
Never invent a citation. Never reference a source not in the excerpts.

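The negative constraint in the last line is critical: without it, the model tends to pad the answer with sources from outside the excerpts.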
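The prompt above can be assembled from whatever the retriever returns. This is a minimal sketch, assuming the excerpts arrive as plain strings; `build_grounded_prompt` is a hypothetical helper name, and the “Excerpts:” and “Question:” layout is one possible arrangement.

```python
def build_grounded_prompt(question: str, excerpts: list[str]) -> str:
    """Number the retrieved excerpts and wrap them in the grounding rules."""
    if not excerpts:
        # With nothing retrieved there is nothing to ground on.
        raise ValueError("No excerpts retrieved; do not call the model.")
    # Build the list of valid markers, e.g. "[1], [2], [3]".
    labels = ", ".join(f"[{i}]" for i in range(1, len(excerpts) + 1))
    numbered = "\n\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(excerpts, start=1)
    )
    return (
        f"You will receive {len(excerpts)} document excerpts below.\n"
        "Answer the user's question using ONLY these excerpts.\n"
        f"For every claim, cite which excerpt it came from: {labels}.\n"
        'If excerpts don\'t cover something, say "Not in provided sources."\n'
        "Never invent a citation. Never reference a source not in the excerpts.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}"
    )
```

Generating the marker list from the actual excerpt count keeps the instructions and the context in sync, so the prompt never advertises a `[3]` that does not exist.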

Step 3: Validate every URL and DOI programmatically

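import re
import requests

def is_reachable(url):
    """Return True if the URL answers with a non-error status."""
    try:
        # Some servers reject HEAD requests; treat failures as "needs review".
        r = requests.head(url, timeout=5, allow_redirects=True)
        return r.status_code < 400
    except requests.RequestException:
        return False

def validate_citations(text):
    """Return every URL and DOI in the text that fails to resolve."""
    # Strip trailing punctuation that the regex picks up at the end of a sentence.
    urls = [u.rstrip('.,;') for u in re.findall(r'https?://[^\s)]+', text)]
    dois = [d.rstrip('.,;') for d in re.findall(r'10\.\d{4,9}/[^\s)]+', text)]
    bad = [url for url in urls if not is_reachable(url)]
    # DOIs that appear inside a URL were already checked above, so skip them here.
    bad += [doi for doi in dois
            if not any(doi in url for url in urls)
            and not is_reachable(f"https://doi.org/{doi}")]
    return bad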

Reject responses that contain unreachable citations.
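Rejection usually pairs with a bounded retry. The sketch below is one possible wrapper, assuming `generate` is a zero-argument callable that returns a fresh model response and `validate` returns the list of bad citations (for example, `validate_citations` from this step). The names and the limit of three attempts are illustrative.

```python
from typing import Callable, Optional

def generate_verified(
    generate: Callable[[], str],
    validate: Callable[[str], list],
    max_attempts: int = 3,  # assumed limit; pick one that fits your latency budget
) -> Optional[str]:
    """Return the first response whose citations all validate, or None."""
    for _ in range(max_attempts):
        text = generate()
        if not validate(text):
            # An empty list means no unreachable URLs or DOIs were found.
            return text
    # Every attempt contained at least one bad citation: surface a failure
    # to the caller rather than passing a fabricated source through.
    return None
```

Returning `None` instead of the last attempt forces the calling code to decide how to handle the failure, for example by answering without citations.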

Step 4: For RAG, validate citation tokens against retrieved corpus

allowed_sources = set(chunk['id'] for chunk in retrieved)
cited = extract_citation_ids(model_output)
fake = cited - allowed_sources
if fake:
    raise ValueError(f"Model cited sources not in corpus: {fake}")
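The check above leaves `extract_citation_ids` undefined. A minimal sketch of it follows, assuming the model cites with bracketed numeric markers such as `[1]` or `[2, 3]` and that chunk IDs are stored as strings; both are assumptions about your setup, and the sample data is made up for illustration.

```python
import re

def extract_citation_ids(model_output: str) -> set[str]:
    """Collect bracketed markers such as [1] or [2, 3] as a set of ID strings."""
    ids = set()
    # Each match is the inside of one bracket pair, e.g. "2, 4".
    for group in re.findall(r"\[(\d+(?:\s*,\s*\d+)*)\]", model_output):
        ids.update(part.strip() for part in group.split(","))
    return ids

# Illustrative data: two retrieved chunks, and an answer that cites a third source.
retrieved = [{"id": "1", "text": "..."}, {"id": "2", "text": "..."}]
allowed_sources = {chunk["id"] for chunk in retrieved}
fake = extract_citation_ids("Breaks help [1]. Naps help too [2, 4].") - allowed_sources
print(fake)  # {'4'}
```

Set difference makes the failure explicit: anything left in `fake` is a marker the model emitted without a matching retrieved chunk.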

Step 5: Switch to a tool-using agent for real citations

Modern setups: give the model a search() or fetch_paper() tool. It calls the tool, receives real results, and quotes from real text. No tool → no citation.

tools = [{"name": "search_papers", ...}, {"name": "fetch_url", ...}]
# Model can ONLY cite what a tool returned
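The “only cite what a tool returned” rule needs to be enforced outside the model. One way to sketch it, assuming each tool result is a dict with a `url` field, is a ledger that records everything the tools returned and flags any cited URL that never passed through it. `CitationLedger` is a hypothetical class, not part of any agent framework.

```python
class CitationLedger:
    """Records every source a tool returned so citations can be checked later."""

    def __init__(self):
        self._seen = {}  # url -> name of the tool that returned it

    def record(self, tool_name: str, results: list[dict]) -> list[dict]:
        """Log tool results, then pass them through unchanged to the model."""
        for result in results:
            self._seen[result["url"]] = tool_name
        return results

    def unverified(self, cited_urls: list[str]) -> list[str]:
        """Return cited URLs that no tool call ever produced."""
        return [url for url in cited_urls if url not in self._seen]
```

A typical use is to wrap each tool call in `ledger.record(...)` and run `ledger.unverified(...)` on the URLs extracted from the final answer; a non-empty result means the model cited something it never fetched.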

Step 6: Make hallucinated citations expensive in the prompt

IMPORTANT: If you fabricate a citation, the user will lose
trust in this entire system. Saying "I don't have a real source"
is always better than inventing one.

Stating consequences sometimes reduces the fabrication rate on aligned models, but never rely on this alone; combine it with Step 2 and Step 4.

Step 7: Surface citations as separate verifiable claims in UI

Don’t render citations as plain text. Each citation should be a clickable link that opens the source — and if the link 404s, show an error badge. Forcing verifiability in the UI is the last line of defense.
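On the server side, this can be as small as one rendering function. The sketch below assumes the reachability check has already run and its result is passed in as a boolean; the function name, CSS class names, and badge text are placeholders for whatever your front end uses.

```python
from html import escape

def render_citation(label: str, url: str, reachable: bool) -> str:
    """Render a citation as a link, with an error badge when the source is unreachable."""
    # Escape both values so model output cannot inject markup into the page.
    link = (
        f'<a href="{escape(url, quote=True)}" target="_blank" rel="noopener">'
        f"{escape(label)}</a>"
    )
    if reachable:
        return link
    return f'{link} <span class="badge badge-error">source unreachable</span>'
```

Escaping matters here because both the label and the URL come from model output, which should be treated as untrusted input.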

When this is not on you

Base LLMs are trained on a corpus that includes millions of citations. They internalize “claims need citations” as a pattern. Even with strong instructions, top models still hallucinate 20-40% of citations without retrieval. RAG is non-optional for any product that requires accurate sourcing.

Easy to misdiagnose as

A “model knowledge gap,” as in “this model just doesn’t know recent papers.” The model does know that papers on the topic exist; what it lacks is a way to ground specific ones. The problem is verifiability, not knowledge.

Prevention

  • Never ask base models for citations without retrieval.
  • Always constrain output to retrieved corpus with explicit IDs.
  • Validate every URL and DOI in post-processing; reject bad responses.
  • For paper-grade output, switch to tool-using agents with search() tools.
  • Surface citations as clickable links so fakes get caught at click-time.
  • Log hallucination rate per model/prompt template and track over time.
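The last item can start as a simple counter. This sketch assumes that each response has already been validated and that you know how many of its citations were fabricated; `HallucinationTracker` and its method names are hypothetical, and a production setup would persist the counts rather than hold them in memory.

```python
from collections import defaultdict
from typing import Optional

class HallucinationTracker:
    """Accumulates fabricated-citation counts per (model, prompt template) pair."""

    def __init__(self):
        # key -> [fabricated citations, total citations]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, model: str, template: str, total_citations: int, fabricated: int) -> None:
        entry = self._counts[(model, template)]
        entry[0] += fabricated
        entry[1] += total_citations

    def rate(self, model: str, template: str) -> Optional[float]:
        """Return fabricated / total, or None if no citations were recorded."""
        fabricated, total = self._counts.get((model, template), (0, 0))
        return fabricated / total if total else None
```

Keying by both model and template makes it possible to see whether a prompt change or a model swap moved the rate.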

FAQ

  • Will saying “only cite real sources” fix this? No. The model cannot tell its fabricated citations from real ones, so the instruction gives it nothing to act on. Instruction-only fixes have low effectiveness here.
  • Are some models worse than others? Yes. Smaller open-weight models hallucinate citations at much higher rates, but even top-tier models hallucinate without retrieval.
