You upload a workbook full of =VLOOKUP(...), =SUMIFS(...), and pivot-table-driven cells, ask ChatGPT to summarize the totals, and the answer is plainly wrong: totals come back as 0, or a cell value is reported literally as =SUM(B2:B100). The cause is the Excel reader Code Interpreter uses. By default, openpyxl returns the formula string, not the cached evaluated value. The main fixes are to tell openpyxl to read cached values, to save the workbook as values-only before upload, or to paste the computed numbers separately.
Common causes
1. openpyxl defaults to formula text, not values
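openpyxl.load_workbook(file) without data_only=True returns the formula text. With data_only=True it returns the last cached value Excel wrote, but only if Excel (or another application that calculates formulas) actually saved the file. Files generated by libraries (xlsxwriter, exceljs) often have no meaningful cached values. XlsxWriter, for example, stores a placeholder 0 as the formula result unless the script supplies one, which is one way totals end up as 0.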
How to spot it: Ask ChatGPT to print a known formula cell. If output is =SUMIFS(...) instead of a number, openpyxl is in formula mode.
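To make "cached value" concrete: inside an .xlsx file, a formula cell is stored as XML with the formula in an `<f>` element and, if the saving application computed one, the last result in a `<v>` element. The sketch below parses a hand-written fragment with the standard library only. The fragment, cell addresses, and numbers are made up for illustration and do not come from a real workbook.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet XML inside an .xlsx file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written fragment (illustrative): B3 was saved with a cached result,
# B4 was written by a script that stored the formula but no value.
SHEET_XML = """
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="3"><c r="B3"><f>SUM(B1:B2)</f><v>30</v></c></row>
    <row r="4"><c r="B4"><f>SUM(B1:B3)</f></c></row>
  </sheetData>
</worksheet>
"""


def describe_cells(xml_text):
    """Return {address: (formula, cached_value_or_None)} for formula cells."""
    root = ET.fromstring(xml_text)
    result = {}
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        if formula is None:
            continue  # plain value cell, nothing to compare
        value = cell.find("m:v", NS)
        cached = value.text if value is not None and value.text else None
        result[cell.get("r")] = ("=" + formula.text, cached)
    return result


cells = describe_cells(SHEET_XML)
for address, (formula, cached) in cells.items():
    print(address, formula, "cached:", cached)
```

A reader in formula mode reports the `<f>` text; a reader in data_only mode reports the `<v>` text, and has nothing to report for B4.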
2. Workbook created by code, never opened in Excel
pandas to_excel, xlsxwriter, and openpyxl all write the formula text without evaluating it. Formulas are computed and their values cached only when a spreadsheet application such as Excel opens and saves the file. A file created by a Python script and never opened in Excel contains formula strings with no cached values.
How to spot it: openpyxl.load_workbook(..., data_only=True) returns None for the cell — meaning no cached value exists.
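Causes 1 and 2 can be reproduced without Excel. The following is a minimal sketch, assuming openpyxl is installed: it writes a tiny workbook to memory the way a script would, then reads the same bytes in both modes. The cell layout is invented for the demo.

```python
from io import BytesIO

import openpyxl

# Build a small workbook the way a script would: two values and one formula.
wb = openpyxl.Workbook()
ws = wb.active
ws["A1"] = 10
ws["A2"] = 20
ws["A3"] = "=SUM(A1:A2)"

# Save to memory instead of disk; Excel never touches these bytes.
buffer = BytesIO()
wb.save(buffer)
data = buffer.getvalue()

# Default mode: formula cells come back as their formula text.
formula_view = openpyxl.load_workbook(BytesIO(data)).active["A3"].value

# data_only=True: formula cells come back as the cached result, if one exists.
cached_view = openpyxl.load_workbook(BytesIO(data), data_only=True).active["A3"].value

print("formula mode:", formula_view)
print("data_only mode:", cached_view)
```

The second read yields None rather than 30 because nothing ever evaluated the formula.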
3. External references and live data connections
Formulas like =INDIRECT("[other.xlsx]Sheet1!A1"), =GETPIVOTDATA(...), or anything pulling from Power Query or the data model have no meaningful value when the source they depend on isn't also available. Even with data_only=True, you get whatever Excel cached the last time it could reach the source, which is often stale or #REF!.
4. Pivot tables are computed by Excel, not stored
Excel builds pivot tables from a pivot cache and a definition stored in the file, and openpyxl does not recompute them. At best it sees a static snapshot of the cells from the last refresh, with no grouping, filter, or totals logic attached. ChatGPT therefore cannot reliably read a pivot table. It can read the underlying range and rebuild the aggregation in pandas.
5. Volatile functions never get cached reliably
NOW(), TODAY(), RAND(), OFFSET(), and INDIRECT() are volatile: Excel recomputes them on every recalculation, including when the file is opened. Their cached value, if present, is whatever was persisted at the last save. Trust it only if you know when the workbook was last saved in Excel.
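Causes 3 and 5 can be screened for by looking at the formula text itself. The sketch below is a rough heuristic, not a formula parser: the regular expressions and the `risk_flags` helper are hypothetical names introduced here, and the volatile list is only the five functions named above.

```python
import re

# The volatile functions named in this article (not an exhaustive list).
VOLATILE = ("NOW", "TODAY", "RAND", "OFFSET", "INDIRECT")
VOLATILE_RE = re.compile(r"\b(" + "|".join(VOLATILE) + r")\s*\(", re.IGNORECASE)

# A bracketed workbook name or index followed by a sheet name and "!",
# e.g. [other.xlsx]Sheet1!A1 or '[1]Sheet 1'!A1. Table references such as
# Table1[Amount] are not followed by a sheet name and "!", so they are skipped.
EXTERNAL_RE = re.compile(r"\[[^\]]+\][\w .]*'?!")


def risk_flags(formula):
    """Return a list of reasons why a cached value may be untrustworthy."""
    flags = []
    if EXTERNAL_RE.search(formula):
        flags.append("external")
    if VOLATILE_RE.search(formula):
        flags.append("volatile")
    if "GETPIVOTDATA" in formula.upper():
        flags.append("pivot")
    return flags


samples = [
    '=INDIRECT("[other.xlsx]Sheet1!A1")',
    "=TODAY()-A1",
    '=GETPIVOTDATA("Revenue",$A$3)',
    "=SUM(Table1[Amount])+Sheet2!A1",
    "=SUM(B2:B100)",
]
for formula in samples:
    print(formula, "->", risk_flags(formula))
```

Any cell that gets a flag is a candidate for Step 5: copy its value out of Excel by hand instead of trusting the read.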
Shortest path to fix
Step 1: Force openpyxl to use cached values
Tell ChatGPT explicitly:
import openpyxl
# data_only=True returns cached results instead of formula strings
wb = openpyxl.load_workbook("file.xlsx", data_only=True)
ws = wb.active
for row in ws.iter_rows(values_only=True):
    print(row)
If the file was last saved by Excel, this returns numbers. If you see None where formulas should be, the file has no cached values — go to Step 2.
Step 2: Save as values-only before upload
In Excel: select all > Copy > Paste Special > Values, then save as a new .xlsx. Every formula cell on that sheet is now a static number. Repeat for each sheet that contains formulas, then upload the values-only file. This is the most reliable fix and takes about 30 seconds per sheet.
Alternatively, in Excel use File > Save As and choose .csv. A CSV file stores only computed values, but Excel saves just the active sheet, so export each sheet you need separately.
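If the workbook was last saved by Excel but you would rather script the values-only copy, one possible approximation is to load it with data_only=True and save it again. This is a sketch, not a replacement for Paste Special: the helper name `values_only_copy` is hypothetical, formula cells without a cached value simply come out empty, and openpyxl may not preserve everything in the file (shapes, for instance) on a round trip. The demo input is written by openpyxl itself, so it shows the empty-cell case.

```python
from io import BytesIO

import openpyxl


def values_only_copy(source, target):
    """Save a copy of `source` with formulas replaced by their cached values.

    Formula cells that have no cached value end up empty (None).
    """
    wb = openpyxl.load_workbook(source, data_only=True)
    wb.save(target)


# Demo input written by openpyxl, so its formula has no cached value.
demo = openpyxl.Workbook()
demo.active["A1"] = 10
demo.active["A2"] = "=A1*2"
source = BytesIO()
demo.save(source)

target = BytesIO()
values_only_copy(BytesIO(source.getvalue()), target)

# Re-read the copy in default (formula) mode: no formula text is left.
copy_ws = openpyxl.load_workbook(BytesIO(target.getvalue())).active
print(copy_ws["A1"].value, copy_ws["A2"].value)
```

With a file that Excel saved, A2 would hold the cached number instead of being empty. With real files, pass paths such as "file.xlsx" and "file_values.xlsx" in place of the in-memory buffers, and never save over the original.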
Step 3: Rebuild the calculation in pandas
If you cannot re-save the file, ask ChatGPT to read the source ranges and recompute:
import pandas as pd
df = pd.read_excel("file.xlsx", sheet_name="Data")
# Re-derive the VLOOKUP / SUMIFS in pandas
totals = df.groupby("region")["amount"].sum()
print(totals)
You lose the original Excel formula audit trail but you gain a correct number. Bonus: the result is reproducible.
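For the two formulas named at the top of this article, the usual pandas counterparts are a left merge (VLOOKUP with exact match) and a boolean mask followed by a sum (SUMIFS). The sketch below uses small made-up tables; the column names and values are illustrative and not taken from any real workbook.

```python
import pandas as pd

# Made-up data standing in for a "Data" sheet and a lookup sheet.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "sku": ["A", "B", "A", "C"],
    "region": ["East", "West", "East", "East"],
    "amount": [100.0, 250.0, 75.0, 40.0],
})
lookup = pd.DataFrame({"sku": ["A", "B"], "category": ["Widgets", "Gadgets"]})

# VLOOKUP(sku, lookup, 2, FALSE): a left merge keeps every order row.
# SKUs missing from the lookup table get NaN where Excel would show #N/A.
looked_up = orders.merge(lookup, on="sku", how="left")

# SUMIFS(amount, region, "East", sku, "A"): build a mask, then sum.
mask = (orders["region"] == "East") & (orders["sku"] == "A")
east_a_total = orders.loc[mask, "amount"].sum()

print(looked_up)
print("East / A total:", east_a_total)
```

Checking for NaN after the merge is the pandas equivalent of scanning a VLOOKUP column for #N/A.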
Step 4: For pivot tables, read the source and re-aggregate
import pandas as pd
df = pd.read_excel("file.xlsx", sheet_name="RawData")
pivot = df.pivot_table(
    index="category",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
)
print(pivot)
Faster and more flexible than wrestling with the cached pivot blob.
Step 5: Paste critical computed values separately
When all else fails, open the workbook in Excel, copy the final 5-20 numbers you actually need, and paste them into chat as a markdown table. ChatGPT then has authoritative values without parsing any Excel internals.
How to confirm the fix
After re-saving or rebuilding, verify the read:
import openpyxl
wb = openpyxl.load_workbook("file.xlsx", data_only=True)
ws = wb["Summary"]
for row in ws.iter_rows(min_row=1, max_row=10, values_only=True):
    print(row)
Every formula cell should now show a number, not None or a string starting with =. If cells are still None, the workbook never went through Excel, so fall back to Step 3 (rebuild in pandas).
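Eyeballing ten rows can miss a stray cell. A more systematic check, sketched below under the assumption that openpyxl is available, loads the file in both modes and lists every formula cell whose cached value is missing. The helper name `formula_cells_without_cache` is hypothetical, and the demo workbook is generated in memory, so its formula is expected to be reported.

```python
from io import BytesIO

import openpyxl


def formula_cells_without_cache(data):
    """Return ['Sheet!A1', ...] for formula cells that have no cached value.

    `data` is the raw bytes of an .xlsx file.
    """
    formulas = openpyxl.load_workbook(BytesIO(data))
    cached = openpyxl.load_workbook(BytesIO(data), data_only=True)
    missing = []
    for ws in formulas.worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if cell.data_type != "f":
                    continue  # not a formula cell
                if cached[ws.title][cell.coordinate].value is None:
                    missing.append(f"{ws.title}!{cell.coordinate}")
    return missing


# Demo: a script-written workbook whose formula was never evaluated.
demo = openpyxl.Workbook()
sheet = demo.active
sheet.title = "Summary"
sheet["A1"] = 5
sheet["A2"] = "=A1*2"
buffer = BytesIO()
demo.save(buffer)

missing = formula_cells_without_cache(buffer.getvalue())
print(missing)
```

For a file on disk, pass `pathlib.Path("file.xlsx").read_bytes()`. An empty list means every formula cell has a cached value, though not necessarily a fresh one.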
Prevention
- Always open and save workbooks in Excel itself (not only via a script) before uploading. This caches every formula value.
- For data destined for ChatGPT analysis, prefer "Paste Special > Values" copies. Keep the formula version as a separate audit file.
- Avoid external references and live data connections in any workbook you plan to share with an LLM.
- For repeated workflows, build the aggregation in pandas / SQL from the start — Excel is then just a viewer, and ChatGPT reads the raw data directly.
- If the file must keep formulas, attach a small text file alongside listing the expected totals so ChatGPT can sanity-check its own reads.
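The last bullet can be turned into a mechanical check. The following sketch compares values read from the workbook against a list of expected totals with a small tolerance. The `check_totals` helper, the metric names, and the numbers are all hypothetical; in practice the expected values would come from the text file you attach.

```python
import math


def check_totals(actual, expected, rel_tol=1e-6):
    """Return {name: problem} for mismatches; an empty dict means all matched."""
    problems = {}
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems[name] = "missing or no cached value"
        elif isinstance(got, str):
            problems[name] = f"text instead of number: {got!r}"
        elif not math.isclose(got, want, rel_tol=rel_tol):
            problems[name] = f"expected {want}, got {got}"
    return problems


# Made-up figures: what the attached text file says vs. what was read.
expected = {"Total revenue": 1250.0, "Total cost": 900.0, "Margin": 350.0}
actual = {"Total revenue": 1250.0, "Total cost": "=SUM(C2:C50)", "Margin": None}

problems = check_totals(actual, expected)
for name, problem in problems.items():
    print(f"{name}: {problem}")
```

A formula string or a None in the report points back to causes 1 and 2, while a numeric mismatch (including a total of 0) suggests a stale or placeholder cached value.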