ChatGPT Treats Uploaded JSON as Plain Text Instead of Structured Data

ChatGPT read your JSON as a wall of text and string-matched an answer instead of running pandas. Here's how to force the code sandbox and get real aggregates.

Published: May 24, 2026 Updated: Jun 15, 2026 Author: AI Productivity Guide Team

You upload orders.json with 5,000 records and ask “what is the median order value for customers in California?” Instead of writing pandas code, ChatGPT eyeballs the first few thousand characters and replies “based on a sample of the data, the median appears to be around $80.” That answer was vibes, not math. The model treated the JSON as prose rather than triggering the Python sandbox to actually parse and aggregate. The result looks confident, contains a number, and is wrong.

Fastest fix: re-ask with an explicit code instruction, for example “Load orders.json into a pandas DataFrame and compute the median of total grouped by state. Show me the code.” When the wording names pandas and asks for the code, ChatGPT almost always runs the sandbox instead of guessing. Everything below is for when that is not enough.

The root cause is that ChatGPT decides, per turn, whether to run your file through its Python sandbox or just read it as text context. The sandbox feature is now labeled Code Interpreter & Data Analysis (older docs still use the names “Advanced Data Analysis” and “Code Interpreter” for the same tool). JSON often gets routed to “read as context”, especially when the file is small enough to fit in the prompt or when your question reads like a summarization request. As of June 2026 the default model is GPT-5.5, and the routing heuristics still shift between models and picker modes (Instant / Thinking / Pro), so a prompt that triggered code last week may not this week. This article explains how to force the code path and confirm that it actually ran.

Which bucket are you in

Symptom	Likely cause	Jump to
No “Analyzing” step, no expandable code block	Model never ran Python	Cause 1, 2
Reply hedges with “this dataset has varying structure”	Irregular / nested schema	Cause 3
`json.load()` parse error in the code it did run	NDJSON with a `.json` extension	Cause 4
File was uploaded at Project / GPT setup, not in the message	Vector-embedded Knowledge, not sandbox	Cause 5
Voice mode, or a Custom GPT with no code capability	Sandbox not available in this surface	Cause 6
Record count comes back suspiciously round and wrong	Inlined file truncated at the token limit	Cause 7

Common causes

Ordered by hit rate, highest first.

1. File fits in context, model never invokes the sandbox

If the JSON is under roughly 200 KB, the model often inlines it into the prompt and reasons over it as text. No Python runs. Aggregates like median, percentile, group-by are eyeball estimates.

How to spot it: ChatGPT does not show an “Analyzing” step or an expandable code block. The reply lacks numerical precision. Asking “show the Python you used” gets “I estimated from the data” instead of actual code. When code does run, you see a collapsed “Analyzed” pill you can expand to read the pandas it executed — its absence is the tell.

2. Question is phrased as a summary, not a calculation

“Tell me about this data” or “what are the main patterns” reads as summarization. The model picks the cheap path — read top-of-file and describe. Quantitative phrasing triggers code; qualitative phrasing does not.

How to spot it: Rewording the question as “compute the median in Python and show the code” triggers the sandbox; the original “what is typical” did not.

3. JSON has irregular shape, model gives up on structure

Records have inconsistent keys, deep nesting with optional fields, or a mix of arrays and objects at the root. The model recognizes it cannot trivially map this to a table and falls back to text reading.

How to spot it: Inspecting your file’s schema reveals optional, nested, or inconsistent fields. ChatGPT’s replies hedge with “this dataset has varying structure” and then make qualitative claims.

4. NDJSON / JSON Lines uploaded with `.json` extension

Your file is one JSON object per line (NDJSON), but the extension is .json. json.load() expects a single JSON document, so it fails with an “Extra data” error at the start of the second line. Code Interpreter, if invoked, hits that parse error, and the model recovers by reading the raw bytes as text.

How to spot it: First character is { and the second line also starts with { — that is NDJSON, not JSON. json.load() on it fails; pd.read_json(..., lines=True) works.
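The check described above can be run locally before uploading. This is a minimal sketch using only the standard library; the function name `looks_like_ndjson` and the sample strings are hypothetical, and the two-line test is a heuristic rather than a full format validator.

```python
import json


def looks_like_ndjson(text: str) -> bool:
    """Heuristic: True when the first two non-empty lines each parse as a JSON object."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return False
    try:
        first, second = json.loads(lines[0]), json.loads(lines[1])
    except json.JSONDecodeError:
        # A pretty-printed document has lines like "[" or "{" that do not parse alone.
        return False
    return isinstance(first, dict) and isinstance(second, dict)


# Illustrative sample data (not taken from a real file).
ndjson_sample = '{"id": 1, "total": 10.0}\n{"id": 2, "total": 20.0}\n'
array_sample = '[\n  {"id": 1, "total": 10.0},\n  {"id": 2, "total": 20.0}\n]\n'

print(looks_like_ndjson(ndjson_sample))
print(looks_like_ndjson(array_sample))
```

If the function returns True for your file, tell ChatGPT it is NDJSON (Step 2) or rename the file with an .ndjson extension.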

5. JSON is in a Project / Custom GPT knowledge file

Knowledge files in Projects and Custom GPTs are indexed for retrieval (file_search): chunked, embedded, and searched by semantic similarity, not loaded into a DataFrame. Aggregate questions return retrieved fragments, never a summed or grouped total. There is a partial exception: if Code Interpreter & Data Analysis is enabled on that GPT, the model can sometimes copy a knowledge file into the sandbox and run code on it, but this is unreliable for large data files and the model often still defaults to retrieval. The robust path is to upload the file as a per-message attachment instead (see Step 5 under “Shortest path to fix”).

How to spot it: The JSON was uploaded once at Project or GPT setup (Configure → Knowledge, capped at 20 files / 512 MB each). It does not appear as a per-message attachment in the conversation. Aggregate questions return retrieved snippets, not totals.

6. The surface has no code sandbox enabled

Voice mode, certain mobile flows, and any Custom GPT that did not check Code Interpreter & Data Analysis ship without the Python sandbox. Numerical answers on data files in those surfaces are always estimates.

How to spot it: In a Custom GPT, open Configure → Capabilities; if the Code Interpreter & Data Analysis box is unchecked, no code can run. In a normal chat, the sandbox is available on Plus / Pro / Team / Enterprise — if you are on the Free tier with tight limits, complex analysis can be rate-limited or unavailable.

7. Token budget hit, file truncated silently

When a large JSON file is inlined into the prompt, only the first N tokens make it through. Records past the cutoff do not exist to the model, so any aggregate is computed on the visible portion only.

How to spot it: Ask “how many records are in this file?” If the answer is suspiciously round (like “approximately 1,000”) and your real count is different, truncation happened.

Shortest path to fix

Step 1: Ask for Python explicitly

Replace “what is the median order value” with:

Load orders.json into a pandas DataFrame using the code sandbox. Compute the median of the total field grouped by state. Show me the code and the resulting table.

This phrasing forces the tool-call. The model rarely refuses an explicit “run pandas and show the code” request when the sandbox is available. Naming pandas and asking to see the code is what flips the routing — a bare “calculate the median” often does not.
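For reference, the code you should expect to see inside the Analyzed pill looks roughly like the sketch below. The records are illustrative stand-ins for orders.json, and the column names `state` and `total` come from the example prompt, so treat them as assumptions about your schema.

```python
import pandas as pd

# Illustrative records standing in for orders.json (assumed flat schema).
orders = [
    {"order_id": 1, "state": "CA", "total": 50.0},
    {"order_id": 2, "state": "CA", "total": 70.0},
    {"order_id": 3, "state": "CA", "total": 120.0},
    {"order_id": 4, "state": "NY", "total": 30.0},
    {"order_id": 5, "state": "NY", "total": 90.0},
]
df = pd.DataFrame(orders)

# In the sandbox the load step would typically be pd.read_json("orders.json")
# for a JSON array of flat records.
median_by_state = df.groupby("state")["total"].median()
print(median_by_state)
```

If the reply contains a number but nothing resembling a load step followed by a groupby, the answer was probably not computed.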

Step 2: If it is NDJSON, say so

This file is NDJSON (one JSON object per line). Load it with
pd.read_json('orders.json', lines=True).

Hand-feeding the parse line removes the parse-failure branch entirely.
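If you are not sure which format you have, a loader that tries a single JSON document first and falls back to line-delimited parsing covers both. This is a sketch, not what ChatGPT necessarily runs; `load_orders` is a hypothetical helper and the input text is illustrative.

```python
import io
import json

import pandas as pd


def load_orders(raw: str) -> pd.DataFrame:
    """Parse `raw` as one JSON document, falling back to NDJSON on a decode error."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # One object per line: json.loads stops with "Extra data" after the first record.
        return pd.read_json(io.StringIO(raw), lines=True)
    return pd.json_normalize(data)


# Illustrative NDJSON text; in practice read it from your own file.
ndjson_text = '{"id": 1, "total": 10.5}\n{"id": 2, "total": 20.0}\n'
df = load_orders(ndjson_text)
print(len(df), list(df.columns))
```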

Step 3: Validate row count first

Always ask:

First, print len(df) and df.dtypes. Confirm you read all rows before answering.

A row count mismatch versus your expectation surfaces truncation, NDJSON misreads, or duplicate parses.
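The same preflight can be written as a small guard that fails loudly instead of relying on someone reading the output. A minimal sketch: `preflight` and `expected_rows` are hypothetical names, and the expected count is a number you already know from the source system.

```python
import pandas as pd


def preflight(df: pd.DataFrame, expected_rows: int) -> None:
    """Print the row count and dtypes; raise if the count differs from the source."""
    print("rows:", len(df))
    print(df.dtypes)
    if len(df) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {len(df)}")


# Illustrative frame standing in for the loaded file.
df = pd.DataFrame({"state": ["CA", "NY", "CA"], "total": [50.0, 30.0, 70.0]})
preflight(df, expected_rows=3)
```

You can paste the function into the chat and ask ChatGPT to call it right after loading the file.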

Step 4: For nested JSON, flatten before aggregation

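import json
import pandas as pd

# Read the whole file as one JSON document (for example, a list of records).
with open('orders.json') as f:
    data = json.load(f)
# Flatten nested objects into columns; key b nested under a becomes a_b.
df = pd.json_normalize(data, sep='_')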

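json_normalize collapses nested objects into flat columns named with the chosen separator (a key b nested under a becomes a_b). After that, group-by and aggregates work like any CSV. Lists nested inside records are the exception: they stay as list-valued cells and need record_path or explode, as covered in the FAQ.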
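To make the flattening concrete, here is a small sketch on in-memory records rather than a file. The field names (`customer`, `state`, `tier`) are assumptions chosen for illustration; note how an optional nested field simply becomes a column with missing values.

```python
import pandas as pd

# Illustrative nested records; the second one lacks the optional "tier" field.
data = [
    {"id": 1, "total": 50.0, "customer": {"state": "CA", "tier": "gold"}},
    {"id": 2, "total": 70.0, "customer": {"state": "CA"}},
    {"id": 3, "total": 30.0, "customer": {"state": "NY", "tier": "basic"}},
]

df = pd.json_normalize(data, sep="_")
print(sorted(df.columns))

# Nested keys are now ordinary columns, so the group-by is the same as for a CSV.
median_by_state = df.groupby("customer_state")["total"].median()
print(median_by_state)
```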

Step 5: For Project / Custom GPT knowledge, switch to per-message upload

If you need aggregate answers, do not put the JSON in Knowledge. Upload it as a per-message attachment in the chat where Code Interpreter is active. Knowledge is for retrieval, not arithmetic.

Step 6: Reserve Knowledge for reference text, not data

Schema guidelines, taxonomies, and brand copy belong in Project Knowledge. Tabular data that needs to be counted, filtered, or summed belongs in per-chat attachments.

Step 7: Verify the answer with a known total

Pick a value you can cross-check (the count of records, the sum of a small column, a specific known record). Ask ChatGPT to compute it and compare. If it matches, trust the aggregates. If it does not, dig into which step diverged.
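One way to get those cross-check values is to compute them yourself, locally, without pandas, so the comparison does not depend on the same library ChatGPT uses. A sketch under the assumption that the file is a JSON array of flat records with a numeric `total` field; `ground_truth` is a hypothetical helper.

```python
import json
import statistics


def ground_truth(raw: str, field: str = "total") -> dict:
    """Compute a record count, a sum, and a median from a JSON array of flat records."""
    records = json.loads(raw)
    values = [r[field] for r in records if field in r]
    return {
        "count": len(records),
        "sum": sum(values),
        "median": statistics.median(values),
    }


# Illustrative data; in practice read the text from your own file.
raw = '[{"total": 10}, {"total": 20}, {"total": 60}]'
print(ground_truth(raw))
```

Compare these numbers with what ChatGPT prints for the same file; any difference points to a load or truncation problem rather than a math problem.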

How to confirm it’s fixed

You have a real, code-backed answer (not an estimate) when all three are true:

The reply shows an Analyzed pill you can expand to read the pandas it ran — not just prose.
len(df) printed earlier matches the record count you expect from the source file.
A known cross-check value (a count or a single record’s field) comes back exactly right.

If any one fails, the model is still string-matching. Re-issue the Step 1 prompt and add “Do not estimate: run the code and print the result.”

When this is not on you

The decision whether to run the sandbox is opaque. OpenAI tunes the trigger heuristics and they shift between models and picker modes (Instant / Thinking / Pro). A prompt that triggered code last month may not this month. The only stable defense is to ask for code explicitly.

Vector-embedded Knowledge stores are also fundamentally not arithmetic engines. If the product surface routes your file to retrieval, you cannot reliably retro-add aggregation through prompting — move the file to a per-message attachment instead.

Easy to misdiagnose as

“ChatGPT is bad at math” — it is not when Python runs. It is bad at math when it is reading prose.
“The file is too big” — files under a few MB are well within the sandbox’s reach. The issue is invocation, not size.
“The JSON is malformed” — usually not. Validate with jq . orders.json > /dev/null first; if jq exits 0, the file is fine.
“Custom GPT instructions are not working” — instructions cannot force tool calls reliably; the model still routes based on the question’s wording.

Prevention

Default to “compute X with Code Interpreter and show the code” phrasing for any quantitative question on a file.
Keep a one-line preflight: “Print row count, column names, and dtypes before answering.”
Never put aggregable data in Project Knowledge. Knowledge is a search index, not a database.
Use .ndjson extension for line-delimited JSON so the format is unambiguous.
For complex schemas, pre-flatten to CSV locally with jq or pandas and upload the flat CSV, so the model has less to figure out.
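The pre-flatten step in the last item can be as short as the sketch below. The input text and field names are illustrative, and the output goes to an in-memory buffer here; pass a path such as "orders.csv" to to_csv to write a file instead.

```python
import io
import json

import pandas as pd

# Illustrative input; in practice this is the text of your JSON file.
raw = (
    '[{"id": 1, "customer": {"state": "CA"}, "total": 50.0},'
    ' {"id": 2, "customer": {"state": "NY"}, "total": 30.0}]'
)

# Flatten nested objects into columns such as customer_state.
df = pd.json_normalize(json.loads(raw), sep="_")

# index=False keeps the pandas row index out of the CSV.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```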

FAQ

Why does CSV work but JSON does not?
CSV almost always triggers the sandbox: it is flat, tabular, obvious. JSON is ambiguous; small JSON especially often lands in the read-as-context path because it fits in the prompt window.

Can I make a Custom GPT always use the sandbox?
Check Code Interpreter & Data Analysis under Configure → Capabilities, and write the instructions to require code for any analytical question. Even then, the model occasionally skips it on edge cases, so keep the explicit “run the code, show it” phrasing in your prompts.

How do I know it actually ran Python and didn’t fake it?
Expand the Analyzed pill in the reply and read the code. If there is no pill and no code block, it estimated. See OpenAI’s data-analysis guide (https://help.openai.com/en/articles/8437071-data-analysis-with-chatgpt) for what the running state looks like.

My nested JSON still won’t aggregate after json_normalize. Why?
Lists nested inside records do not flatten into scalar columns; json_normalize turns them into list-valued cells. Ask the model to record_path-target the inner list, or explode it with df.explode('items') before grouping.

Does the Free tier get the sandbox?
Free (GPT-5.5, as of June 2026) can run data analysis but with tight rate limits; heavy or repeated runs may be throttled. Plus ($20/mo) and above have far more headroom.

Should I just convert to CSV instead?
For flat data, yes. Exporting to CSV locally (with jq or pandas) removes all routing ambiguity and is the most reliable upload. Reserve JSON uploads for genuinely nested data.
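The nested-list answer in the FAQ above can be sketched as follows. The `items` list and its `sku` and `price` fields are assumptions for illustration; both options end with one row per item, which is the shape a group-by needs.

```python
import pandas as pd

# Illustrative orders with a nested `items` list (assumed schema).
data = [
    {"id": 1, "state": "CA", "items": [{"sku": "A", "price": 10.0}, {"sku": "B", "price": 5.0}]},
    {"id": 2, "state": "NY", "items": [{"sku": "A", "price": 10.0}]},
]

# Plain json_normalize keeps `items` as one list-valued cell per order.
flat = pd.json_normalize(data)
print(type(flat.loc[0, "items"]))

# Option 1: one row per item, carrying order-level fields along as metadata.
items = pd.json_normalize(data, record_path="items", meta=["id", "state"])
print(items.groupby("state")["price"].sum())

# Option 2: explode the list column, then expand each item dict into columns.
exploded = flat.explode("items", ignore_index=True)
expanded = pd.concat(
    [exploded.drop(columns="items"), pd.json_normalize(exploded["items"].tolist())],
    axis=1,
)
print(expanded.groupby("state")["price"].sum())
```

Option 1 is usually the shorter route when you know the path to the inner list up front; option 2 is useful when the frame has already been built.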

Tags: #ChatGPT #ChatGPT files #Troubleshooting #Debug #json #Data analysis #code-interpreter