How to Check AI Citations and Sources: A 4-Pass Verification Workflow

A reproducible 4-pass workflow to verify every citation in an AI-assisted draft — catch fabricated sources, mismatched quotes, and chimera references before publish.

Published: May 17, 2026 | Updated: Jun 04, 2026 | Author: AI Productivity Guide Team

In May 2026, a Lancet-published study found that 1 in 277 academic papers in early 2026 contained at least one fabricated reference — up from 1 in 2,828 in 2023, a tenfold jump in three years. The legal world tracks the same rot: Damien Charlotin’s public database has logged more than 1,227 court filings with AI-hallucinated citations, growing by five to six cases per day, with sanctions that now reach $86,000. The cause is structural. The GhostCite benchmark (Feb 2026) ran 375,000 citations through 13 frontier models across 40 domains and measured hallucination rates between 14.23% and 94.93%. Even with web search turned on, 3–13% of cited URLs were still fabricated.

This is a four-pass workflow that catches the three ways an AI-assisted citation lies: the source doesn’t exist, the source exists but says something else, or the source is real but the wrong layer of evidence. Budget about 5 minutes per citation the first run, dropping to roughly 1 minute once you have the prompts saved.

TL;DR

AI cites confidently and wrongly: hallucination rates run 14–95% depending on the model and domain (GhostCite, Feb 2026), and 3–13% of URLs stay fake even with search on.
Run four passes per citation: existence (does it resolve?), accuracy (does it support the claim?), provenance (primary vs secondary?), recency (still current?).
Use a search-grounded model (Perplexity, Gemini 3.1 Pro, GPT-5.5 with search) for existence and recency; a browsing model for accuracy; any reasoning model for provenance.
For academic sources, verify the DOI against Crossref (150M+ records) and check Retraction Watch — both are in one Crossref API call.
Fail 2 of 4 passes → re-research, don’t patch. Ship a provenance ledger alongside the draft.

Who this is for and when to use it

Editors, researchers, students, content marketers, and policy analysts shipping work with citations they did not personally read. If you wrote every citation by reading the source, skip this. If an AI drafted any part with citations, or a co-author handed you a deck and said “trust me,” run the passes.

Reach for it before publishing AI-assisted research, journal-club notes, blog posts, white papers, or briefs that lean on cited claims — and before forwarding any memo that quotes sources you cannot personally vouch for. One fabricated citation in a published piece costs more than the entire audit.

Before you start

Get your draft in plain text or Markdown. PDFs hide citations in footnotes and break on paste.
Have two tools ready: a chat model (Claude Opus 4.7, GPT-5.5) for parsing, and a search-grounded model (Perplexity, Gemini 3.1 Pro, GPT-5.5 with search) for live URL checks. As of June 2026, Perplexity's tiers are Free, Pro ($20 per month), and Max ($200 per month); a single Pro seat covers most editorial verification work.
Set your bar up front. “All citations must be primary unless explicitly noted” is much stricter than “all citations must exist.” Pick one and hold it for the whole draft.
Block 30–90 minutes the first time. Later runs on similar pieces are far faster.

The four passes

1. Build the ledger

Paste the full draft into a chat model and prompt: “Extract every citation into a table with columns: claim (one sentence), source name, URL or DOI, page or section, claim type (statistic, quote, definition, attribution).” Save this table — it becomes your working sheet and, after cleaning, your published provenance ledger.

Chunk long drafts by section. Above roughly 8,000 words, extraction accuracy degrades and the model starts skipping citations near the end.
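
The chunking step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function name `chunk_draft` is hypothetical, and it splits on blank lines (paragraphs) rather than on true section headings, which is a simplifying assumption. The 8,000-word budget is the rough threshold mentioned above.

```python
import re

MAX_WORDS = 8000  # rough threshold from the text; tune it for your model


def chunk_draft(draft: str, max_words: int = MAX_WORDS) -> list[str]:
    """Split a draft into chunks of at most max_words, breaking on blank lines.

    A single paragraph longer than max_words is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", draft) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in paragraphs:
        words = len(para.split())
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Run the extraction prompt once per chunk and concatenate the resulting tables into one ledger.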

2. Pass 1 — existence

Paste the citation list into a search-grounded model: “For each row, confirm the URL resolves and the source exists. Flag any that 404, redirect to a homepage, or appear fabricated. Cite your search result for each verdict.”

For academic sources, this is where the DOI matters. A real DOI that resolves but whose metadata (title, authors, year) doesn’t match the citation is a chimera reference — a citation stitched together from parts of different papers, and a classic LLM failure. Resolve the DOI through Crossref (over 150 million records) and confirm every field. Crossref also surfaces Retraction Watch status in the same API response, so you catch retracted sources in the same step.
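
A minimal sketch of the DOI metadata check, assuming the public Crossref REST endpoint `https://api.crossref.org/works/{doi}` and its `title`, `author`, and `issued` fields. The helper names (`fetch_crossref`, `metadata_mismatches`) and the shape of the `cited` dictionary are hypothetical, and the comparison is deliberately strict. Retraction status is not handled here; inspect the returned record for update notices separately.

```python
import json
import re
import urllib.parse
import urllib.request


def fetch_crossref(doi: str) -> dict:
    """Fetch the Crossref record for a DOI (network call; raises on HTTP errors)."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=20) as resp:
        return json.load(resp)["message"]


def _norm(text: str) -> str:
    # Lowercase and strip punctuation so trivial formatting differences do not count.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def metadata_mismatches(cited: dict, record: dict) -> list[str]:
    """Return the names of fields where the citation disagrees with Crossref."""
    problems = []
    titles = record.get("title") or [""]
    if _norm(cited["title"]) != _norm(titles[0]):
        problems.append("title")
    families = {_norm(a.get("family", "")) for a in record.get("author", [])}
    if _norm(cited["first_author"]) not in families:
        problems.append("author")
    date_parts = record.get("issued", {}).get("date-parts") or [[None]]
    year = date_parts[0][0] if date_parts[0] else None
    if cited["year"] != year:
        problems.append("year")
    return problems
```

A DOI that resolves but returns a non-empty mismatch list is a chimera candidate: `metadata_mismatches(cited, fetch_crossref(cited["doi"]))`. Keeping the comparison separate from the network call makes it easy to test against saved records.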

3. Pass 2 — accuracy

For each source that exists: “Fetch the source and quote the exact passage that supports the claim. If no such passage exists, say so explicitly and quote the closest thing you found. Do not paraphrase — quote, max 30 words.” Compare the quoted passage to your draft’s claim word by word.

This is the pass most people skip and the one that fails most often. The URL is real, but it never said what the draft claims. If the model paraphrases instead of quoting, it did not actually fetch the page — re-run with browsing explicitly enabled.
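
One way to confirm the model really quoted rather than paraphrased is to test its quote against page text you fetched yourself. The sketch below is an assumption-laden illustration: `check_quote` and its three return labels are hypothetical, and the matching is a plain substring test after normalizing whitespace, case, and curly quotes.

```python
import re

MAX_QUOTE_WORDS = 30  # limit taken from the Pass 2 prompt


def _normalize(text: str) -> str:
    # Straighten curly quotes and collapse whitespace so layout differences
    # do not cause false negatives.
    table = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})
    return re.sub(r"\s+", " ", text.translate(table)).strip().lower()


def check_quote(quote: str, page_text: str) -> str:
    """Classify a model-supplied quote against text you fetched yourself."""
    if not quote.strip():
        return "NOT_FOUND"
    if len(quote.split()) > MAX_QUOTE_WORDS:
        return "TOO_LONG"
    if _normalize(quote) in _normalize(page_text):
        return "VERBATIM"
    return "NOT_FOUND"
```

A `NOT_FOUND` result does not prove the claim is unsupported; it means the quote is not verbatim and a human should read the page.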

4. Pass 3 — provenance

“Is this source primary (original data, study, or first-hand account), secondary (analysis of primary sources), or tertiary (a summary of secondary)? If not primary, identify the primary source it ultimately points to.” A secondary source that mangled its primary is the trap here; weak provenance becomes a re-research candidate.

5. Pass 4 — recency

“Has this claim been updated, contradicted, retracted, or superseded since the source was published? Search for newer work by the same authors or on the same topic in the last 12 months.” Statistics rot fastest: a “67% of X” figure from 2019 may have drifted to something like 51% by 2026, and a piece that still quotes the old number reads as dated or wrong.

6. Score and decide

Any citation that fails at least 2 of the 4 passes: re-research or remove. Do not patch. Patching usually means swapping in a slightly less weak source from the same search and shipping anyway. Save the cleaned list with verdicts as the provenance ledger and link it from the article footer.
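
The scoring rule is simple enough to encode directly. The sketch below is illustrative: `LedgerRow` and its field names are hypothetical. Treating a missing pass result as a failure and returning "keep" for a single failure are both assumptions, since the workflow only specifies the two-failure threshold.

```python
from dataclasses import dataclass

PASSES = ("existence", "accuracy", "provenance", "recency")
FAIL_THRESHOLD = 2  # from the text: failing 2 of 4 passes triggers re-research


@dataclass
class LedgerRow:
    claim: str
    source: str
    results: dict[str, bool]  # pass name -> True if the citation passed

    def failed(self) -> list[str]:
        # Assumption: a pass with no recorded result counts as failed.
        return [p for p in PASSES if not self.results.get(p, False)]

    def verdict(self) -> str:
        if len(self.failed()) >= FAIL_THRESHOLD:
            return "re-research or remove"
        return "keep"
```

Writing `failed()` into the ledger next to the verdict lets a reviewer see at a glance which pass a kept citation stumbled on.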

Which model for which pass

Pass	Best tool (June 2026)	Why
Existence	Perplexity, Gemini 3.1 Pro, GPT-5.5 (search on)	Live web grounding with per-claim citations
Existence (academic)	Crossref API + Retraction Watch	150M+ DOIs; catches chimera and retracted refs
Accuracy	Claude (web tool), GPT-5.5 (browsing), Gemini 3.1 Pro	Must fetch and quote the actual page
Provenance	Any reasoning model (Opus 4.7, GPT-5.5)	Source contents pasted in; no live fetch needed
Recency	Perplexity, Gemini 3.1 Pro	Search for newer or retracting publications

Prompt templates

Pass 1 (existence): "Here are 12 citations from a draft. For each,
confirm the URL or DOI resolves to a real source and the source
actually exists. For DOIs, confirm title/authors/year match.
Return: row | resolves Y/N | metadata match Y/N | one-line evidence.
Do not guess. If unsure, say UNSURE and explain."

Pass 2 (accuracy): "For each cited source below, fetch the page and
quote the exact passage that supports the claim. If no such passage
exists at the URL, return NO MATCH and quote the closest thing you
found. Maximum 30 words per quote. Do not paraphrase."
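
Because the Pass 1 template asks for a fixed `row | resolves Y/N | metadata match Y/N | one-line evidence` shape, the reply can be parsed back into the ledger. This is a rough sketch with a hypothetical function name; it assumes the model returns one row per line with no leading pipe and uses exactly Y, N, or UNSURE for the two flags.

```python
def parse_pass1_line(line: str) -> dict:
    """Parse 'row | resolves | metadata match | evidence' into a dict.

    Flags map to True (Y), False (N), or None (UNSURE).
    """
    # Split on the first three pipes only, so the evidence text may contain pipes.
    parts = [p.strip() for p in line.split("|", 3)]
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields: {line!r}")
    row, resolves, metadata, evidence = parts
    flags = {"Y": True, "N": False, "UNSURE": None}
    try:
        return {
            "row": row,
            "resolves": flags[resolves.upper()],
            "metadata_match": flags[metadata.upper()],
            "evidence": evidence,
        }
    except KeyError as exc:
        raise ValueError(f"unexpected flag value in: {line!r}") from exc
```

Lines that raise `ValueError` are worth reading by hand, since a reply that breaks the format often signals that the model did not actually check the row.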

Common mistakes

Trusting that “the AI wrote a citation, so the citation exists.” Models invent plausible papers constantly — that is exactly what Pass 1 catches.
Skipping the accuracy pass. The URL is real but does not support the claim; this is the single most common failure and the easiest to ship past unnoticed.
Treating a secondary citation as primary because the secondary source said so. The secondary may have mangled the primary.
Patching a failed citation with a slightly weaker one from the same search. If the strongest source you can find is weak, your underlying claim is wrong, not just your sourcing.
Skipping the DOI metadata check. A resolving DOI is not a verified citation — chimera references resolve fine but cite the wrong paper.
Pasting a 20,000-word draft into one prompt. Chunk by section; long contexts degrade extraction accuracy.

FAQ

Does this catch AI-fabricated sources? Yes. Pass 1 is the fabrication catch. Most hallucinations fail at existence; a minority invent URLs that resolve to unrelated real pages or chimera DOIs, which the metadata check and Pass 2 catch.
How long does it take? About 5 minutes per citation the first time, dropping to roughly 1 minute once you have the templates and batch by domain.
Which models are best? Existence and recency: a search-grounded model (Perplexity, Gemini 3.1 Pro, GPT-5.5 with search). Accuracy: a model with browsing that actually fetches pages. Provenance: any reasoning model with the source pasted in. For academic refs, add a Crossref DOI check.
What about paywalled sources? Mark them “real but unverifiable” in the ledger and verify via abstract, preprint, or institutional summary. Do not ship a claim whose primary source you cannot read at all.
Can I automate this in a script? The existence pass automates cleanly: DOI resolution and Retraction Watch checks are a single Crossref API call. Accuracy and provenance need a human in the loop; the failure modes are too subtle to fully delegate.
What if I cited the same source 10 times? Run accuracy once per source-passage pair, not once per source. A source you trust for claim A may not support claim B.

Tags: #Tutorial #Research #Fact check #Citations
