Codex Agent Makes Too Many Redundant Tool Calls (Fix)

Codex re-reads the same file 8 times and re-greps the same query 5 times. Fastest fix: run /plan first, add an AGENTS.md repo map, and keep reasoning effort at medium. Full diagnosis below.

Published: May 23, 2026 Updated: Jun 15, 2026 Author: AI Productivity Guide Team

You ask Codex Agent to add one function. The transcript shows 47 tool calls before it writes a single line. Read package.json four times. Glob "**/*.ts" three times with the same pattern. Read src/types.ts eight times, twice in a row. By the time it gets to writing code, half the turn budget is spent and the relevant context has been pushed out of the window. The output, when it finally arrives, is shaped wrong because the agent never settled into the real problem.

Fastest fix (90 seconds): run the task through Codex CLI’s /plan mode first so the agent reads, analyzes, and proposes a plan without editing; then drop a short AGENTS.md repo map at the project root so it never re-discovers structure; and keep model_reasoning_effort = "medium" (the GPT-5.5 default) rather than cranking it to xhigh, which inflates exploratory reads. That combination alone usually cuts redundant tool calls by 40-70%.

Redundant tool calls are not a model defect. They are a sign that the agent is searching, not building. It has no map, no plan, no recently-read cache, so each tool call is a guess. The cure is to give the agent the map up front, require a plan before any edit, and structure the prompt so it does not re-discover what you already know.

Which bucket are you in?

| Symptom in the transcript | Most likely cause | Go to |
| --- | --- | --- |
| First 5+ calls are reads with no edit | No repo map | Step 1 |
| `Read → Write → Read → Write` interleaved | No plan before editing | Step 2 + `/plan` |
| N grep matches followed by N+ file reads | Search results not summarized | Step 3 |
| Same `Read <path>` appears 3+ times | No “what I read” cache | Step 4 |
| Cheap `ls` / `wc` / `head` spam, no edits | Tools used as a thinking aid | Step 5 |
| Re-read right after a 5000-line stdout dump | Verbose output drives re-reads | Step 6 |

Common causes

Ordered by hit rate.

1. No structural map provided: agent re-discovers structure

You said “add a new endpoint”, but the agent does not know where endpoints live. It reads package.json and tsconfig.json, scans src/, then reads index.ts, app.ts, and server.ts. That is five reads to discover what one sentence could have told it.

How to spot it: First 5+ tool calls are exploration, not action. Reads have no edit afterward.

2. No plan written before action

The agent jumps into editing, hits an unknown, reads more, edits more, then hits another unknown. Each unknown triggers a new read. Without a written plan, every decision becomes lookup-on-demand.

How to spot it: Tool calls interleave Read → Write → Read → Write rather than concentrated Read phase → concentrated Write phase.
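One way to make the interleave visible is to count how often a run flips between reading and writing. The sketch below assumes you have already reduced the transcript to an ordered list of tool names; the names `Read` and `Write` and the helper `count_phase_switches` are illustrative and not part of any Codex log format.

```python
def count_phase_switches(tool_names, read_tools=("Read",), write_tools=("Write",)):
    """Count how many times a run flips between a read phase and a write phase.

    Tool names are hypothetical; map them to whatever your runner logs.
    """
    switches = 0
    previous_phase = None
    for name in tool_names:
        if name in read_tools:
            phase = "read"
        elif name in write_tools:
            phase = "write"
        else:
            continue  # ignore tools that are neither reads nor writes
        if previous_phase is not None and phase != previous_phase:
            switches += 1
        previous_phase = phase
    return switches


interleaved = ["Read", "Write", "Read", "Write", "Read", "Write"]
phased = ["Read", "Read", "Read", "Write", "Write", "Write"]
print(count_phase_switches(interleaved))  # 5
print(count_phase_switches(phased))       # 1
```

A healthy run has one switch (read phase, then write phase). A count that grows with the length of the run suggests the agent is looking things up on demand.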

3. Search results not summarized

The agent runs grep -rn "useAuth" src/ and gets 80 matches. Instead of summarizing or filtering, it reads each matching file in full. Five minutes later it runs the same query again because it never kept a summary.

How to spot it: After a grep with N matches, you see N+ file reads. Agent did not narrow before reading.

4. No in-context “what I have read” cache

The agent forgets it already read a file. When you re-prompt 10 turns later about the same code path, it re-reads the file, because nothing in context says “we already covered file X”.

How to spot it: Same Read <path> shows up 3+ times across the transcript.
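The check above is easy to automate once the transcript is reduced to (tool, argument) pairs. This is a minimal sketch under that assumption; the pair format and the function name `repeated_reads` are made up for illustration, so adapt the parsing to your own runner's output.

```python
from collections import Counter


def repeated_reads(calls, threshold=3):
    """Return {path: count} for every path read at least `threshold` times.

    `calls` is an ordered list of (tool_name, argument) pairs; this shape is
    an assumption, not a documented Codex transcript format.
    """
    counts = Counter(arg for tool, arg in calls if tool == "Read")
    return {path: n for path, n in counts.items() if n >= threshold}


calls = [
    ("Read", "src/types.ts"),
    ("Read", "package.json"),
    ("Read", "src/types.ts"),
    ("Glob", "**/*.ts"),
    ("Read", "src/types.ts"),
]
print(repeated_reads(calls))  # {'src/types.ts': 3}
```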

5. Tool calls used as a thinking aid

For some prompts the agent runs ls, cat, or wc as a substitute for reasoning: it keeps “checking” rather than committing. This balloons the turn count without progress.

How to spot it: Many cheap, single-file head, ls, wc calls with no edits between them.

6. Verbose stdout drives re-reads

A tool call returns 5000 lines. The agent cannot hold that in working memory, so it keeps only a partial summary and then re-reads parts of the same file to “verify”.

How to spot it: A noisy tool’s output is immediately followed by re-reading the same path it just covered.

Before you start

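Get a baseline count of tool calls for a typical task, for example with grep -c "tool_use" agent.log or similar. Note that grep -c counts matching lines, not occurrences, so the number is exact only when each tool call is logged on its own line.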
Identify the agent template’s exploration phase length; long exploration = symptom.
Decide whether the task is “I do not know the repo” or “I know but I am letting Codex re-derive”. They have different fixes.

Information to collect

Full transcript of the offending run with tool call counts per type.
The first 10 tool calls; these usually expose whether the agent has a map or is searching.
The task prompt’s exact wording: vague prompts cause more searching.
Any AGENTS.md / repo-map.md / convention doc the agent could have used but did not load.
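The first two items can be pulled out of a JSON event stream with a few lines of Python. This is a sketch, not a parser for any specific runner: it assumes one JSON object per line with a `type` of `tool_use` and a `name` field holding the tool name, and both field names may differ in your logs.

```python
import json
from collections import Counter


def summarize_tool_calls(jsonl_text, first_n=10):
    """Return (per-tool counts, first `first_n` tool names) from a JSONL stream.

    Assumed record shape: {"type": "tool_use", "name": "<tool>", ...}.
    """
    names = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as banners or stack traces
        if isinstance(event, dict) and event.get("type") == "tool_use":
            names.append(event.get("name", "unknown"))
    return Counter(names), names[:first_n]


sample = "\n".join([
    '{"type": "tool_use", "name": "Read"}',
    '{"type": "message", "text": "thinking"}',
    '{"type": "tool_use", "name": "Glob"}',
    '{"type": "tool_use", "name": "Read"}',
])
counts, first_calls = summarize_tool_calls(sample)
print(counts.most_common())  # [('Read', 2), ('Glob', 1)]
print(first_calls)           # ['Read', 'Glob', 'Read']
```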

Step-by-step fix

Ordered by ROI.

Step 1: Pre-feed the map via AGENTS.md (not just the prompt)

Codex CLI auto-loads an AGENTS.md file from the project root (and nearer directories) into the first turn of every session, so the map persists across tasks instead of being retyped. Put the structural map there:

# AGENTS.md

## Repo structure
- API routes: src/api/routes/*.ts (one file per resource)
- Services: src/services/*.ts (business logic)
- Types: src/types/index.ts (shared) + src/types/<feature>.ts (feature-specific)
- Tests: alongside source as *.test.ts

## Conventions
- New endpoint = route file + service method + types entry + test
- Never read node_modules or dist; they are build artifacts

Two gotchas as of June 2026: Codex only reads up to project_doc_max_bytes from each AGENTS.md (default 32 KB), so keep the map terse; and if your file is named something else, register it under project_doc_fallback_filenames in ~/.codex/config.toml or it is ignored. See OpenAI’s Codex configuration reference for both keys.
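To catch the size limit before Codex silently truncates the file, you can measure the byte length of the map yourself. The sketch below assumes “32 KB” means 32 × 1024 bytes and that the limit applies to the UTF-8 encoded file; the helper names are illustrative, and you should pass your own value if you changed project_doc_max_bytes.

```python
from pathlib import Path

# Assumption: the default "32 KB" is 32 * 1024 bytes. Override if configured otherwise.
DEFAULT_LIMIT_BYTES = 32 * 1024


def doc_budget(text, limit=DEFAULT_LIMIT_BYTES):
    """Return (size_in_bytes, fits) for the UTF-8 encoding of `text`."""
    size = len(text.encode("utf-8"))
    return size, size <= limit


def check_agents_file(path="AGENTS.md", limit=DEFAULT_LIMIT_BYTES):
    """Read a project doc from disk and report whether it fits the byte limit."""
    return doc_budget(Path(path).read_text(encoding="utf-8"), limit)


size, fits = doc_budget("## Repo structure\n- API routes: src/api/routes/*.ts\n")
print(f"{size} bytes, fits: {fits}")
```

Counting bytes rather than characters matters if the map contains non-ASCII text, since those characters take more than one byte each.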

For the specific task, still scope it at the top of the prompt:

For this task you will edit:
- src/api/routes/orders.ts (new)
- src/services/orderService.ts (extend)
- src/types/order.ts (extend)
- src/api/routes/orders.test.ts (new)

The agent now has the structural map without a single read. This typically removes 5-15 redundant exploration calls.

Step 2: Use `/plan` mode to force exploration-then-commit

Codex CLI ships a built-in /plan mode: the model reads files and analyzes the codebase, then proposes an implementation plan and writes nothing until you approve. That is exactly the structured exploration phase you want, and it stops the Read → Write → Read → Write interleave that drives re-reads. Type /plan (or toggle plan mode with Tab in the composer), review the plan, then exit to execute.

If you are scripting or want a plan inside a normal turn, ask for it explicitly:

Plan first. Output:

PLAN:
1. <step>
2. <step>
...

For each step, list the files involved. Confirm the plan matches the working scope. Only then begin editing.

If you find the plan needs revision mid-execution, output a REVISED PLAN before continuing.

Plan mode has its own reasoning budget: plan_mode_reasoning_effort (values none | minimal | low | medium | high | xhigh) lets you think harder during planning without paying that cost on every execution turn. Use /plan for tasks that touch three or more modules; skip it for one-line edits where the planning overhead is not worth it.

Step 3: Demand summaries after every multi-result search

After any grep / glob / search with > 5 results, output a SUMMARY:

SUMMARY of "<query>":
- N matches across M files
- Relevant files: <list 1-5>
- Skipping: <list>

Then read only the relevant files. Do NOT read all matches.

A grep summary plus 3 reads beats 80 reads from raw matches.
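If you wrap search in your own tool, the same SUMMARY can be produced mechanically before the agent ever sees raw matches. The sketch below parses `grep -rn` style output (path:line:text) and ranks files by match count. Ranking by count is a crude stand-in for relevance that I am assuming for illustration, and `summarize_grep` is a hypothetical helper.

```python
from collections import Counter


def summarize_grep(query, grep_output, keep=5):
    """Build a SUMMARY block from `grep -rn` style output (path:line:text)."""
    per_file = Counter()
    for line in grep_output.splitlines():
        parts = line.split(":", 2)
        # Keep only well-formed "path:line:text" rows.
        if len(parts) == 3 and parts[1].isdigit():
            per_file[parts[0]] += 1
    ranked = [path for path, _ in per_file.most_common()]
    relevant, skipped = ranked[:keep], ranked[keep:]
    return "\n".join([
        f'SUMMARY of "{query}":',
        f"- {sum(per_file.values())} matches across {len(per_file)} files",
        f"- Relevant files: {', '.join(relevant) or 'none'}",
        f"- Skipping: {', '.join(skipped) or 'none'}",
    ])


raw = "\n".join([
    "src/hooks/useAuth.ts:3:export function useAuth() {",
    "src/hooks/useAuth.ts:9:  // useAuth reads the session",
    "src/pages/Login.tsx:12:const auth = useAuth();",
    "src/pages/Home.tsx:5:const auth = useAuth();",
])
print(summarize_grep("useAuth", raw, keep=1))
```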

Step 4: Maintain an explicit read-tracker in the prompt

After each Read, append to your scratchpad:

READ_TRACKER:
- src/types/order.ts (read at step 2)
- src/services/orderService.ts (read at step 3)

Before any Read, check the tracker. If the file is listed, recall from memory; do not re-read unless contents changed.

This externalizes the “what I have already read” cache. The agent now has to acknowledge re-reads explicitly. For very long sessions, run /compact to summarize earlier turns into a smaller note before the read history scrolls out of the window; that is usually cheaper than letting the agent re-read what fell off the back.
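If you control the harness around the agent, the tracker can also live in code instead of the prompt. This is a minimal sketch of that idea, not something Codex provides: the `ReadTracker` class is hypothetical, and the fingerprint is whatever change marker you choose to supply (a modification time or a content hash, for example).

```python
class ReadTracker:
    """Remember which files were read and whether they changed since.

    Hypothetical helper for a custom harness; not a Codex feature.
    """

    def __init__(self):
        self._entries = {}  # path -> (step, fingerprint)

    def needs_read(self, path, fingerprint):
        """True if the path was never read or its fingerprint has changed."""
        entry = self._entries.get(path)
        return entry is None or entry[1] != fingerprint

    def record(self, path, step, fingerprint):
        """Store the step at which `path` was read and its fingerprint."""
        self._entries[path] = (step, fingerprint)

    def render(self):
        """Format the tracker in the READ_TRACKER layout used in the prompt."""
        lines = ["READ_TRACKER:"]
        for path, (step, _) in self._entries.items():
            lines.append(f"- {path} (read at step {step})")
        return "\n".join(lines)


tracker = ReadTracker()
tracker.record("src/types/order.ts", step=2, fingerprint="v1")
tracker.record("src/services/orderService.ts", step=3, fingerprint="v1")
print(tracker.needs_read("src/types/order.ts", "v1"))  # False: unchanged, skip the read
print(tracker.needs_read("src/types/order.ts", "v2"))  # True: contents changed
print(tracker.render())
```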

Step 5: Turn off tools the task does not need

In ~/.codex/config.toml (or a per-project .codex/config.toml), disable capabilities that invite stray calls:

# Don't let the agent reach for the web on a closed-repo task
web_search = "disabled"   # values: disabled | cached | live

# Don't spawn sub-agents unless the task genuinely needs them;
# spawn_agent / wait_agent each add tool-call overhead
[features]
multi_agent = false

Then drive the phases with /plan (read-only) followed by execution, instead of relying on the model to police itself mid-turn. Removing a tool removes the “let me just check one more thing” temptation entirely. You can also set these per task with an inline override, e.g. codex -c web_search='"disabled"'.

Step 6: Cap noisy tool output

Wrap commands that flood:

pnpm test 2>&1 | tee /tmp/test.log | grep -E "FAIL|PASS|Tests:" | head -50

Less noise → less re-read-to-verify. Many redundant reads chase missed signal in a wall of stdout.
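When the command runs through your own wrapper rather than a shell pipeline, the same cap can be applied in code. The sketch below mirrors the grep -E and head -50 stages of the pipeline above; `cap_output` is an illustrative name, and the pattern is the one from the shell example.

```python
import re


def cap_output(text, pattern=r"FAIL|PASS|Tests:", max_lines=50):
    """Keep only lines matching `pattern`, truncated to `max_lines` lines."""
    regex = re.compile(pattern)
    kept = [line for line in text.splitlines() if regex.search(line)]
    return "\n".join(kept[:max_lines])


# 5000 lines of noise with two lines of signal at the end.
noisy = "\n".join(
    ["debug: compiling module"] * 5000
    + ["FAIL src/api/routes/orders.test.ts", "Tests: 1 failed, 12 passed"]
)
print(cap_output(noisy))
```

As with the shell version, keep the full output somewhere on disk so a failure can still be inspected deliberately instead of being dumped into context.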

Step 7: Tune the model and reasoning effort, in that order

Reasoning effort matters more than most people expect. model_reasoning_effort takes minimal | low | medium | high | xhigh. GPT-5.5 defaults to medium, which is the right balance for agentic coding. Counter-intuitively, pushing it to xhigh often adds exploratory tool calls because the model second-guesses itself; for tight, well-scoped tasks medium (or even low) finishes faster with fewer reads.

# ~/.codex/config.toml
model = "gpt-5.5"
model_reasoning_effort = "medium"

On the model itself: GPT-5.5 is OpenAI’s recommended starting point for Codex and reaches the same result with fewer reasoning tokens than the older GPT-5.4, which compounds in tool-heavy, multi-step runs. The Codex-tuned variants (gpt-5.2-codex family) are built for agentic coding if you want to go further. Known bug to watch as of June 2026: typing /clear in the interactive CLI can drop you back to gpt-5.4 instead of the model in your config.toml (openai/codex #19451). Re-check the model footer after a clear.

Verify

Count tool calls in the next run; the total should drop by 40-70% with steps 1-3 alone.
Check the transcript: it should show a clear Read phase → Plan output → Write phase, not interleaved chaos.
There should be no repeated Read <same path> across the run.
Total turns to completion should drop, and the run should finish within the turn budget with headroom.
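The first check is simple arithmetic on the baseline and the new count. As a worked sketch: the baseline of 47 calls comes from the example at the top of this article, while the follow-up count of 19 is a made-up number for illustration.

```python
def reduction_percent(baseline_calls, new_calls):
    """Percentage drop in tool calls relative to the baseline run."""
    if baseline_calls <= 0:
        raise ValueError("baseline_calls must be positive")
    return round(100 * (baseline_calls - new_calls) / baseline_calls, 1)


def in_target_range(percent, low=40, high=70):
    """True if the drop falls inside the 40-70% range this guide expects."""
    return low <= percent <= high


drop = reduction_percent(47, 19)  # 19 is a hypothetical follow-up count
print(drop)                   # 59.6
print(in_target_range(drop))  # True
```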

Long-term prevention

Keep AGENTS.md at the repo root with the canonical “where does X live” conventions; Codex auto-loads it every session instead of re-discovering structure (keep it under project_doc_max_bytes).
Make /plan the default for any multi-file change; one-line edits can skip it.
Pin model = "gpt-5.5" and model_reasoning_effort = "medium" in config.toml so a stray xhigh profile doesn’t inflate reads.
Disable tools the repo never needs (web_search = "disabled", features.multi_agent = false) by default.
Wrap noisy tools with output caps as the default; the agent never sees 5k lines of stdout.
Keep an agent-runs.log and review weekly: any run with more than 60 tool calls gets root-caused.
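The weekly review can start from a tiny filter over that log. This sketch assumes agent-runs.log is a CSV with `run_id` and `tool_calls` columns, which is a format I am inventing for illustration; the threshold of 60 is the one from the list above.

```python
import csv
import io


def runs_to_review(log_text, threshold=60):
    """Return run ids whose tool-call count exceeds `threshold`.

    Assumes a CSV with `run_id` and `tool_calls` columns (hypothetical format).
    """
    reader = csv.DictReader(io.StringIO(log_text))
    return [row["run_id"] for row in reader if int(row["tool_calls"]) > threshold]


log_text = "run_id,tool_calls\nmon-01,23\ntue-02,74\nwed-03,60\nthu-04,112\n"
print(runs_to_review(log_text))  # ['tue-02', 'thu-04']
```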

Common pitfalls

Treating “more tool calls = more thorough” as good. Each call costs context and turns; redundancy is pure cost.
Writing “do not re-read files” in the prompt without giving the agent a tracker mechanism. The instruction has nowhere to ground itself.
Letting the agent run find . -type f on a large repo. The output alone wrecks the budget.
Setting turn budget to 200 to “absorb” redundancy. The redundancy still drops output quality even if it fits.
Forgetting that prompt caching makes the redundancy cheaper in dollars but no cheaper in context window.

FAQ

Q: My agent reads the same file twice in adjacent turns. Why?

Likely your runner does not surface a previous-tool-call cache, and the agent’s plan does not reference the read. Add the explicit READ_TRACKER scratchpad and the issue goes away.

Q: How do I count tool calls programmatically?

Most agent runners emit a JSON event stream. Count records of type tool_use. Also: grep -c "function_calls" transcript.txt for plain-text logs.

Q: Is there a fixed ratio of “reads to edits” that is healthy?

Rough rule: 2-4 reads per edit on familiar code, 5-8 on unfamiliar. If you are seeing 15+ reads per edit, the prompt is missing a structural map.
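The rule of thumb can be turned into a quick check on any run. The sketch below uses the thresholds from the answer above; how it labels ratios that fall between the stated ranges (for example between 8 and 15) is my own assumption, and the function names are illustrative.

```python
def reads_per_edit(reads, edits):
    """Return the reads-to-edits ratio, or None when no edit happened yet."""
    if edits == 0:
        return None
    return reads / edits


def classify_ratio(ratio):
    """Map a ratio onto the rough ranges from this FAQ answer."""
    if ratio is None:
        return "no edits yet"
    if ratio >= 15:
        return "likely missing a structural map"
    if ratio <= 4:
        return "within the familiar-code range"
    if ratio <= 8:
        return "within the unfamiliar-code range"
    return "elevated"  # between the stated ranges: worth a look, not yet alarming


print(classify_ratio(reads_per_edit(9, 3)))   # within the familiar-code range
print(classify_ratio(reads_per_edit(45, 2)))  # likely missing a structural map
```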

Q: Can I just lower max-turns to force efficiency?

Lowering max-turns punishes legitimate work too. Better to fix the cause (no map, no plan): efficiency rises and headroom grows at the same time.

Q: Does /plan mode actually reduce tool calls, or just defer them?

It reduces them. In /plan mode Codex does its reading and analysis once, up front, and commits to a plan before touching files. That replaces the read-edit-read-edit churn (where each surprise triggers a fresh read) with one concentrated exploration pass, so the same files are not re-read mid-edit.

Q: After /clear my agent suddenly reads everything again. Is that normal?

Yes. /clear wipes the conversation, including the read history and any in-context map, so the next task starts cold. Keep your repo map in AGENTS.md (Step 1) so it reloads automatically, and re-check the model footer: a known bug can reset the model to gpt-5.4 after a clear.

Tags: #Codex #agent #Troubleshooting #tool-calls #efficiency