“Analysis is too shallow” is almost never a model-capability problem; it is a prompt-depth problem. The details are in context (at least partially), but ChatGPT defaults to a safe, broad summary because that is what satisfies most “help me look at this file” requests. To get evidence, numbers, and direct quotes, you have to explicitly demand structure, constrain the output format, and force citations.
Common causes
Ordered by hit rate, highest first.
1. Prompt uses fuzzy verbs (“analyze,” “summarize,” “take a look”)
The most common failure. “Help me analyze this report” reads as “give me a summary” to the model, so you get 5-7 generic bullets. Swap to “list every expense item > $1M with page numbers” and the output is unrecognizable.
How to spot it: Read your prompt out loud. Could it apply to a different file? If yes, it’s too fuzzy.
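The read-aloud check can be backed by a quick scan for the fuzzy verbs named above. This is a minimal sketch: the function name `find_fuzzy_verbs` is hypothetical and the word list is only the three examples from this section, so extend it with verbs you catch yourself using.

```python
import re

# Only the three fuzzy verbs named in this article; add your own as needed.
FUZZY_VERBS = ("analyze", "summarize", "take a look")


def find_fuzzy_verbs(prompt: str) -> list[str]:
    """Return the fuzzy verbs found in the prompt, in word-list order."""
    lowered = prompt.lower()
    return [
        verb
        for verb in FUZZY_VERBS
        if re.search(r"\b" + re.escape(verb) + r"\b", lowered)
    ]


if __name__ == "__main__":
    print(find_fuzzy_verbs("Help me analyze this report"))
    print(find_fuzzy_verbs("List every expense item > $1M with page numbers"))
```

An empty result does not prove the prompt is specific enough; it only rules out the most common offenders.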
2. Retrieval pulled only top-level chunks; deep details never reached context
For large files (PDFs over 30 pages, documents over 5 MB), retrieval pulls only the top-k chunks per query. Top-k skews toward the title, abstract, and table of contents, which are dense with keywords, so body details often get dropped.
How to spot it: Ask it to “quote page 47, paragraph 2 verbatim.” If it can’t, or quotes the wrong page, retrieval never reached there.
3. Output format unconstrained — model defaults to safe bullet list
Without a structure spec, the model picks the format least likely to be wrong: 5-7 bullets, 1-2 sentences each. Give it a table or section template and it has to fill specific cells.
How to spot it: Are your last two answers both generic bullet lists? If yes, no structure was specified.
4. Negative prompts (“don’t be vague”) barely work
Models follow “do Y” much more reliably than “don’t do X.” “Don’t be shallow” is much weaker than “every claim must include a verbatim quote + page number.”
How to spot it: Count “don’t / avoid / not allowed” vs “do / must / output” in your prompt. More negatives → rewrite as positives.
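The count can be automated roughly as follows. This is a sketch under assumptions: the phrase lists are illustrative, not exhaustive, and `count_phrases` is a hypothetical helper. The bare word “do” is left out of the positive list because it would also match inside “do not.”

```python
import re

# Illustrative phrase lists (assumptions); tune them to your own prompts.
NEGATIVE = ("don't", "do not", "avoid", "not allowed")
POSITIVE = ("must", "output", "list", "quote", "cite")


def count_phrases(prompt: str, phrases: tuple[str, ...]) -> int:
    """Count whole-phrase occurrences, case-insensitively."""
    # Normalize curly apostrophes so "don’t" and "don't" count the same.
    text = prompt.lower().replace("\u2019", "'")
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
        for phrase in phrases
    )


if __name__ == "__main__":
    prompt = "Don't be vague. Avoid filler. Every claim must include a quote."
    print("negative:", count_phrases(prompt, NEGATIVE))
    print("positive:", count_phrases(prompt, POSITIVE))
```

If the negative count is higher, rewrite each negative instruction as the positive behavior you actually want.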
5. Model defaults to “short and safe” when uncertain
Poor extraction (scanned-PDF OCR, flattened tables) leaves the model uncertain about much of the content, so it hedges with vague language. The answer looks shallow, but what it actually means is “I couldn’t read it.”
How to spot it: Lots of “may / typically / usually / in most cases” in the answer — it’s padding with general knowledge instead of reading your file.
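A rough way to quantify this is to count hedge phrases per 100 words of the answer. The sketch below uses only the four hedges listed above; `hedge_rate` is a hypothetical name, and no threshold is implied, so compare answers against each other rather than against a fixed number.

```python
import re

# The four hedge phrases named in this section.
HEDGES = ("may", "typically", "usually", "in most cases")


def hedge_rate(answer: str) -> float:
    """Return hedge phrases per 100 words; 0.0 for empty text."""
    words = re.findall(r"[A-Za-z']+", answer)
    if not words:
        return 0.0
    text = answer.lower()
    # Note: "may" also matches the month name, so treat the result as a hint.
    hits = sum(
        len(re.findall(r"\b" + re.escape(hedge) + r"\b", text))
        for hedge in HEDGES
    )
    return 100.0 * hits / len(words)


if __name__ == "__main__":
    sample = "Costs may rise. Vendors usually adjust prices."
    print(round(hedge_rate(sample), 1))
```

A rate that drops sharply after converting the file to cleaner text suggests the original problem was extraction, not the prompt.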
6. Reasoning model doesn’t fix a fuzzy prompt
GPT-5 / o3 synthesize better, but a fuzzy prompt + reasoning just gives you a prettier shallow answer. Depth comes from prompt structure, not from parameter count.
How to spot it: Same prompt produces similarly-shaped answers across 4o and o3 — the bottleneck is the prompt, not the model.
Before you start
- Confirm whether this happens in Projects, a Custom GPT, or a plain chat — retrieval behavior differs across the three.
- Duplicate the chat before retesting so history doesn’t pollute the next diagnostic.
- Confirm your plan: Free / Plus / Team / Enterprise differ in context window and available models.
Info to collect
- File type, size (MB), pages / rows; whether it’s a scanned PDF; whether it has non-ASCII / formulas / tables.
- Upload route: dragged into the chat, Project Knowledge, or Custom GPT Knowledge.
- Full prompt text + ChatGPT reply screenshot; highlight the two or three sentences you consider “shallow.”
- Current model (GPT-5.5 / GPT-5 / o3).
- One concrete example: page X contains data Y that the answer ignored.
Shortest fix path
Ordered by ROI. The first two solve ~70% of cases.
Step 1: Replace fuzzy verbs with “list / extract / compare / quote”
“Analyze this report” → concrete action:
List the 5 largest cost items in this Q3 report. For each, give:
- The exact figure (with currency)
- The page number where it appears
- A one-line quote from the surrounding paragraph
Do not generalize. If you cannot find 5, return fewer.
The difference is large. In practice, “list / extract” tends to pull the model toward retrieving and citing, while “analyze” tends to pull it toward summarizing and smoothing.
Step 2: Require quote + page per evidence point
Universal template:
For every claim you make:
- Quote the supporting sentence verbatim, in quotes.
- Cite the page number or section heading.
- If you cannot quote, say "no direct support in document" instead
of inferring.
This forces the model out of summarization mode and into citation mode. Answers take longer, but most of the shallowness goes away.
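If you reuse this template often, it can be appended to any task programmatically. A minimal sketch, assuming you assemble prompts in a script before pasting or sending them; `with_citation_rules` is a hypothetical helper, and the rule text is the template above.

```python
# The citation template from this step, stored once for reuse.
CITATION_RULES = """For every claim you make:
- Quote the supporting sentence verbatim, in quotes.
- Cite the page number or section heading.
- If you cannot quote, say "no direct support in document" instead of inferring."""


def with_citation_rules(task: str) -> str:
    """Return the task followed by a blank line and the citation rules."""
    return task.strip() + "\n\n" + CITATION_RULES


if __name__ == "__main__":
    print(with_citation_rules("List the 5 largest cost items in this Q3 report."))
```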
Step 3: Give an output structure template
Don’t let it freestyle. Provide a table or Markdown skeleton:
Output as a table:
| Risk | Likelihood (low/med/high) | Quote from document | Page |
Then below the table, list:
- 2 assumptions the document makes that you would challenge
- 1 missing analysis the document should have done
Empty cells force specific content. Shallowness can’t hide.
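To check whether the model actually filled every cell, you can scan the returned Markdown table. This is a sketch under assumptions: `find_incomplete_rows` is a hypothetical helper, the sample rows are made up for illustration, and a quote that itself contains a `|` character would confuse this simple splitter.

```python
def split_row(line: str) -> list[str]:
    """Split one Markdown table row into stripped cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def find_incomplete_rows(table_md: str, expected_cols: int = 4) -> list[int]:
    """Return 1-based indexes of data rows with an empty cell or wrong width."""
    bad_rows = []
    header_seen = False
    data_index = 0
    for line in table_md.strip().splitlines():
        if not line.strip().startswith("|"):
            continue  # ignore text outside the table
        cells = split_row(line)
        # Skip the separator row, e.g. |---|:---:|---|---|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        if not header_seen:
            header_seen = True
            continue
        data_index += 1
        if len(cells) != expected_cols or any(cell == "" for cell in cells):
            bad_rows.append(data_index)
    return bad_rows


if __name__ == "__main__":
    # Made-up reply, for illustration only.
    reply = """
| Risk | Likelihood (low/med/high) | Quote from document | Page |
|---|---|---|---|
| Supplier delay | high | "Lead times doubled in Q3" | 12 |
| FX exposure | med | | |
"""
    print(find_incomplete_rows(reply))
```

Rows it flags are the ones to follow up on: ask for the missing quote and page, or for an explicit “no direct support in document.”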
Step 4: Ask “what did you skip”
After the first answer:
What did you skip? List every section / table / figure you didn't
look at, and why. If you skipped anything because it didn't fit a
keyword, list those keywords.
Models usually admit gaps when prompted, and you can target follow-ups at exactly those gaps.
Step 5: Chunk by section + aggregate yourself
For large docs, don’t ask one whole-doc question. First:
List every section heading in this document with page ranges.
Then ask section by section and aggregate the answers yourself. “Get full deep analysis in one shot” is an anti-pattern — depth-per-turn has a ceiling.
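Once you have the section list, the per-section questions can be generated instead of typed by hand. A minimal sketch: `section_prompts` is a hypothetical helper, and the section titles and page ranges in the example are invented placeholders for whatever the model returned for your document.

```python
def section_prompts(
    sections: list[tuple[str, int, int]], question: str
) -> list[str]:
    """Build one scoped prompt per (title, first_page, last_page) section."""
    prompts = []
    for title, first_page, last_page in sections:
        prompts.append(
            f'Using only the section "{title}" (pages {first_page}-{last_page}), '
            f"{question} Quote with page numbers; no inference without quote."
        )
    return prompts


if __name__ == "__main__":
    # Placeholder sections; replace with the headings from your document.
    sections = [("Revenue", 3, 9), ("Operating costs", 10, 21)]
    for prompt in section_prompts(sections, "list every figure above $1M."):
        print(prompt)
        print()
```

Keeping the aggregation step on your side means each turn stays small enough for the model to read the section in full.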
Step 6: If extraction is bad, convert to Markdown first
For scanned PDFs / complex layouts, convert locally:
pip install marker-pdf
marker_single report.pdf ./out --max_pages 200
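Upload the resulting Markdown instead of the PDF. Retrieval quality usually improves, and the shallowness problem often disappears with it. The marker_single arguments shown above match older marker-pdf releases; newer releases take the output folder through --output_dir, so run marker_single --help to confirm the flags for your installed version.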
How to confirm the fix
- Open a fresh chat, upload the same file, run your rewritten prompt — confirm the output reliably has concrete quotes + page numbers.
- Pick one quote from the answer and Ctrl+F for it in the PDF. If it is not there, the model either fabricated it or paraphrased it; neither counts as a verbatim quote.
- Have a colleague run the same prompt — confirms depth comes from prompt structure, not your session state.
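The Ctrl+F check can be scripted when you have the document as plain text (for example, the Markdown produced by a local conversion). A minimal sketch: `quote_in_source` is a hypothetical helper that normalizes whitespace, case, and curly quotes before matching, because line breaks in extracted text often defeat an exact search.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace runs."""
    for curly, straight in (
        ("\u201c", '"'),
        ("\u201d", '"'),
        ("\u2018", "'"),
        ("\u2019", "'"),
    ):
        text = text.replace(curly, straight)
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_in_source(quote: str, source_text: str) -> bool:
    """Return True if the normalized quote appears in the normalized source."""
    needle = normalize(quote)
    # An empty quote would trivially match, so reject it explicitly.
    return bool(needle) and needle in normalize(source_text)


if __name__ == "__main__":
    source = "Operating costs rose\n  12% in Q3, driven by logistics."
    print(quote_in_source("operating costs rose 12% in Q3", source))
    print(quote_in_source("operating costs fell 12% in Q3", source))
```

A failed match is a reason to look closer, not proof of fabrication: OCR errors in the source text can also break an otherwise genuine quote.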
If still broken
- Cut the file to the minimum: keep only the 5 pages with data, ask the same question.
- Swap format: PDF → Markdown, xlsx → csv, to rule out an extraction-layer issue.
- Switch model: 4o → o3 / GPT-5; reasoning models synthesize more deeply on structured prompts.
- Package source file + prompt + model screenshot, file a ticket at help.openai.com.
Prevention
- Build a personal “deep analysis template”: list / extract / compare / quote / challenge — fixed 5 sections.
- Always end prompts with “Quote with page numbers; no inference without quote.”
- Rewrite every negative instruction as a positive one.
- For recurring file shapes, build a Custom GPT with depth requirements baked into Instructions.
- Don’t just re-prompt a shallow answer — first ask “what did you skip.”