ChatGPT Deep Research Task Fails

You triggered Deep Research and 10 minutes later there is still nothing. The usual culprits are quota, a vague prompt, or a scraping block.

Deep Research is ChatGPT’s background research agent: give it one prompt and it spends 5–30 minutes browsing dozens of web pages / PDFs to produce a cited report. Its failure modes look different from a normal chat — often nothing is broken, you just get a report that “looks complete but is full of generic statements,” or the agent silently gives up mid-run.

Triage in order: quota first, prompt second, target sites third. These causes differ a lot in how often they turn out to be the problem, so check the most likely one first.

Common causes

In rough order of frequency:

1. Monthly quota exhausted — silently downgraded to GPT-5.5

Deep Research has a per-month cap by plan tier: a few runs/month on free, ~10 on Plus / Team, much higher on Pro / Enterprise. Past the cap, the “Deep Research” button is still there but clicking it just runs a normal model — and you get back an answer with no real browsing trail.

How to verify: check whether the response has a “Sources” block at the bottom. Deep Research always has one; plain GPT-5.5 doesn’t. Or open avatar → Settings → Subscription to see Deep Research credits.

2. Prompt too broad — agent gives up

Internally, Deep Research first does a “task decomposition”: it splits your prompt into 5–10 sub-queries. If your prompt has no decomposable dimensions (e.g. “research AI for me”), the decomposer outputs heavily overlapping sub-tasks, and the agent stops after a few rounds because “marginal information gain is too low.”

How to verify: look at the Sources count at the bottom of the report. Fewer than 8 sources, mostly from one domain (for example all Wikipedia or all Medium), suggests the agent quit early.
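If you copy the Sources URLs out of a report, the check above can be done mechanically. This is a minimal sketch, not anything ChatGPT exposes: `source_diversity` is a hypothetical helper, the threshold of 8 comes from the rule of thumb above, and the 50% cutoff for "mostly one domain" is an assumption you should tune.

```python
from collections import Counter
from urllib.parse import urlparse


def source_diversity(urls, min_sources=8, max_top_share=0.5):
    """Summarise a Sources list and flag a likely early quit.

    min_sources mirrors the "fewer than 8" rule of thumb;
    max_top_share is an assumed cutoff for "mostly one domain".
    """
    # Reduce each URL to its host, ignoring case and a leading "www."
    domains = [urlparse(u).netloc.lower().removeprefix("www.") for u in urls]
    counts = Counter(domains)
    top_domain, top_count = (counts.most_common(1)[0] if counts else ("", 0))
    top_share = top_count / len(urls) if urls else 0.0
    return {
        "total": len(urls),
        "top_domain": top_domain,
        "top_share": top_share,
        "likely_quit_early": len(urls) < min_sources and top_share > max_top_share,
    }
```

Pass it the list of URLs from the Sources block; a result with `likely_quit_early` set is a hint to tighten the prompt, not proof.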

3. Target sites block via robots.txt or Cloudflare

OpenAI’s OAI-SearchBot / ChatGPT-User user agents are blocked by robots.txt or Cloudflare firewalls on some sites (Twitter/X, LinkedIn, Quora, Substack paywalls, internal corporate networks, some news sites). Deep Research doesn’t stop when it can’t reach a page — those sources just go missing from the report, biasing the conclusion toward the sources it could reach.

How to verify: if a site you’d expect to appear (e.g. a leading industry blog) doesn’t show up in Sources even once, its server is likely rejecting the crawler.
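For the robots.txt half of this, you can check a site's rules yourself. The sketch below uses Python's standard `urllib.robotparser` and the two user agent names mentioned above; `blocked_agents` is a hypothetical helper name. It only evaluates robots.txt rules, so a Cloudflare or firewall block will not show up here.

```python
from urllib.robotparser import RobotFileParser

# User agent names taken from the paragraph above.
OPENAI_AGENTS = ("OAI-SearchBot", "ChatGPT-User")


def blocked_agents(robots_txt, page_url, agents=OPENAI_AGENTS):
    """Return the agents that the given robots.txt text forbids from page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, page_url)]
```

To test a live site, fetch its `/robots.txt` with any HTTP client and pass the text in, or use the parser's own `set_url()` and `read()` methods instead of `parse()`.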

4. Output language mismatched with prompt language

If your prompt is in Chinese but you want an English report (or vice versa), the decomposed sub-queries tend to pull back sources in the prompt language rather than the report language, which lowers citation quality. Deep Research handles cross-language research poorly.

How to verify: the Chinese/English source ratio doesn’t match what you expected.
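A rough way to put a number on that ratio is to count how many source titles contain Chinese characters. This is a crude sketch under loud assumptions: `chinese_share` is a hypothetical helper, you have to copy the titles out of the Sources block by hand, and matching on the CJK Unified Ideographs range is a script check, not real language detection.

```python
import re

# Basic CJK Unified Ideographs block; a script heuristic, not language detection.
CJK = re.compile(r"[\u4e00-\u9fff]")


def chinese_share(titles):
    """Fraction of source titles that contain at least one CJK character."""
    if not titles:
        return 0.0
    return sum(1 for title in titles if CJK.search(title)) / len(titles)
```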

5. Too many input URLs / files in one task

If your prompt pastes 10+ URLs or 3+ PDFs, the agent may sample rather than process all of them.

How to verify: the report’s citations don’t cover every URL you provided.
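The coverage check can be sketched as a simple set comparison between the URLs you pasted and the URLs in the report's citations. `uncited_urls` and `normalize` are hypothetical helpers, and the normalization (ignore scheme, `www.`, query string, and trailing slash) is an assumption about what counts as "the same page".

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def uncited_urls(provided, cited):
    """Return the provided URLs that never appear among the cited ones."""
    cited_keys = {normalize(u) for u in cited}
    return [u for u in provided if normalize(u) not in cited_keys]
```

Anything this returns was probably sampled out; rerun with fewer inputs per task.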

Shortest path to fix

Ordered by time-to-test — 30-second checks first.

Step 1: Confirm Deep Research is actually running

Avatar → Settings → Account → check this month’s Deep Research credits. If it’s 0, every “Deep Research” click is being downgraded until the quota resets. Either wait for the next month or upgrade the plan.

You can also tell from the response:

| Signal | Real Deep Research | Downgraded to GPT-5.5 |
| --- | --- | --- |
| Top progress bar (“Thinking…Browsing…”) | Yes, 3–5 stages | No |
| Total runtime | 5–30 minutes | < 1 minute |
| Sources block at bottom | Always present, 10–40 entries | None |
| Report length | Usually 1500–4000 words | Usually < 800 words |
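If you log your runs, the four signals above can be turned into a quick majority vote. This is only a sketch: `RunSignals` and `looks_downgraded` are hypothetical names, the numeric cutoffs are taken from the table, and "3 of 4 signals" is an assumed threshold rather than anything official.

```python
from dataclasses import dataclass


@dataclass
class RunSignals:
    showed_progress_stages: bool  # did the top progress bar show stages?
    runtime_minutes: float        # wall-clock time until the answer arrived
    source_count: int             # entries in the Sources block (0 if absent)
    word_count: int               # length of the returned report


def looks_downgraded(run, min_votes=3):
    """True when at least min_votes of the four signals point to a plain-model answer."""
    votes = [
        not run.showed_progress_stages,
        run.runtime_minutes < 1,
        run.source_count == 0,
        run.word_count < 800,
    ]
    return sum(votes) >= min_votes
```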

Step 2: Rewrite the prompt to be decomposable + bounded

Bad → good prompt rewrite template:

Bad:  "Research AI video generation tools for me"
Good: "Compare Sora, Veo 3, and Kling across these 4 dimensions:
       1) max single-shot duration
       2) character consistency across shots
       3) commercial-use licensing
       4) public pricing
       Output as a table with source URL + publication date per row."

The three required elements:

  1. Named entities: specific products / companies / papers / time windows, not abstractions like “the AI industry”
  2. Explicit dimensions: list the columns you want to compare
  3. Output format: table / report with H2s / Markdown list
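If you reuse this template often, a tiny builder keeps you honest about the three elements by refusing to produce a prompt when one is missing. A minimal sketch: `build_prompt` and `join_names` are hypothetical helpers, and the wording simply mirrors the "Good" example above.

```python
def join_names(names):
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + f", and {names[-1]}"


def build_prompt(entities, dimensions, output_format):
    """Assemble a bounded prompt; raise if any required element is missing."""
    if not entities or not dimensions or not output_format.strip():
        raise ValueError("entities, dimensions and output_format are all required")
    lines = [
        f"Compare {join_names(entities)} across these {len(dimensions)} dimensions:"
    ]
    # Number the dimensions so the report can follow them column by column.
    lines += [f"{i}) {dim}" for i, dim in enumerate(dimensions, start=1)]
    lines.append(f"Output as {output_format.strip()}.")
    return "\n".join(lines)
```

For example, `build_prompt(["Sora", "Veo 3", "Kling"], ["public pricing"], "a table with source URL + publication date per row")` yields a prompt in the same shape as the rewrite above.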

Step 3: Change “make it search” to “make it read these sources”

If you already know the authoritative sources (official docs, whitepapers, specific blogs), paste the URLs into the prompt:

Write a comparison report based on these 5 URLs:
- https://...
- https://...
(Max 5 — more than that and it will sample.)
You may add up to 5 supplementary external sources but do not replace the above.

This sidesteps “can’t reach site X” entirely.
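When you have more than 5 authoritative URLs, one option is to batch them into several source-pinned prompts and run each separately. The sketch below assumes the 5-URL cap from the template above; `pinned_source_prompts` is a hypothetical helper, and the prompt wording is copied from that template.

```python
def pinned_source_prompts(task, urls, max_pinned=5, max_extra=5):
    """Split urls into batches of max_pinned and build one prompt per batch."""
    prompts = []
    for start in range(0, len(urls), max_pinned):
        batch = urls[start:start + max_pinned]
        lines = [f"{task} based on these {len(batch)} URLs:"]
        lines += [f"- {url}" for url in batch]
        lines.append(
            f"You may add up to {max_extra} supplementary external sources "
            "but do not replace the above."
        )
        prompts.append("\n".join(lines))
    return prompts
```

Each returned prompt costs one Deep Research run, so weigh the extra coverage against your monthly quota.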

Step 4: Split the task into multiple runs

If one prompt spans 3 unrelated sub-questions (e.g. “market size” + “technical principles” + “business models”), run it 3 times, one sub-question each, and merge by hand. Deep Research is much better at going deep on one thing than spreading across three.

Step 5: Try a different language

If a Chinese run pulls only low-quality Chinese sources, translate the same prompt to English and rerun. The English source pool is much larger and usually higher quality. Then have GPT-5.5 translate the English report back into Chinese.

Step 6: Export + double-check citations

Deep Research occasionally “hallucinates citations” — the URL is real but the sentence it claims came from there isn’t actually there. For anything you’ll publish, manually click 3 random Sources entries and verify the quote.
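The spot check can be partly scripted once you have saved the page text for the sources you want to verify. A minimal sketch, assuming you fetch and save the pages yourself: `pick_for_review` and `quote_in_page` are hypothetical helpers, and the match is a whitespace- and case-insensitive substring test, so a paraphrased claim will still need a human read.

```python
import random
import re


def normalize_text(text):
    """Collapse whitespace and ignore case so line breaks don't hide a match."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def quote_in_page(quote, page_text):
    """True if the quoted sentence appears verbatim (modulo spacing/case) in the page."""
    return normalize_text(quote) in normalize_text(page_text)


def pick_for_review(citations, k=3, seed=None):
    """Randomly choose up to k citations to verify by hand."""
    rng = random.Random(seed)
    return rng.sample(citations, min(k, len(citations)))
```

A `False` from `quote_in_page` does not prove a hallucinated citation, but it tells you which entries to open and read first.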

Prevention

  • Use Deep Research when you already know the direction and need evidence — not when you’re not sure what you want. For the latter, scope it out in a normal chat first.
  • Plan your monthly quota: list the 5–10 topics you actually need Deep Research for and rank them. Don’t burn runs on trivia.
  • Anything you’ll publish: hand-verify at least 3 citations. The model can’t guarantee this for you.
  • Build a small library of 3–5 prompt templates (market comparison, technical due diligence, competitor research). New topics in a known template beat starting from scratch.

Tags: #ChatGPT #Troubleshooting