Claude Code Output Does Not Match Your Prompt: Fix the Plan, Not the Diff

Q: How do I stop it editing files I didn't name?

Two layers: put an explicit `Out of scope` line in the prompt, and use `allowed-tools` / `disallowed-tools` in a slash-command skill to constrain what it can touch. Plan Mode also surfaces the file list before any write, so you can reject an out-of-scope plan.

The diff is polished code for the wrong feature. Catch the divergence at the plan stage with Plan Mode (Shift+Tab) and a restate-before-execute step, not after the code lands.

Published: May 17, 2026 Updated: Jun 15, 2026 Author: AI Productivity Guide Team 🌐 查看中文版本

You asked Claude Code to “fix the timezone bug in invoice rendering.” The PR comes back beautifully — well-tested, well-documented, and entirely about cache invalidation in the invoice list. It’s a real bug it found while reading the file. It’s not the bug you asked about. The timezone issue is still there.

Fastest fix: stop letting Claude Code jump straight to edits. Press Shift+Tab until the status bar shows ⏸ plan mode on, then ask your question. In Plan Mode (Claude Code v2.1.0+) Claude reads the code and presents a written plan, and it cannot touch a file until you approve. You catch the misread in the plan — the cheap stage — instead of in the diff. If you prefer text, type /plan, or launch with claude --permission-mode plan to make it the default.

Output-vs-prompt divergence almost always happens during plan formation, not during code generation. Claude reads the prompt, the file, and related context, then commits to a mental task that drifts from what you wrote. By the time the code lands, the plan is already wrong — and the code reflects the plan faithfully. So the cure is structural: force an explicit restate-and-approve gate before any edit.

Which bucket are you in?

Symptom you observed	Most likely cause	Jump to
You can read your prompt two valid ways	Ambiguous prompt, Claude picked one	Cause 1
First instruction done, a later condition ignored	Long prompt got summarized	Cause 2
Output follows a house rule you didn’t ask for	`CLAUDE.md` default won	Cause 3
Diff is in a file you never named	Intent inferred from file context	Cause 4
Fix was suspiciously small	Claude solved an easier nearby problem	Cause 5
Bug still reproduces after the “fix”	Right symptom, wrong root cause	Cause 6

Common causes

Ordered by hit rate, highest first.

1. The prompt was ambiguous and Claude picked a plausible interpretation

“Fix the timezone bug” could mean the rendering bug, the storage bug, the parsing bug, or the comparison bug. Claude saw the first one it ran into and proceeded. You meant a different one.

How to spot it: After the work, look at your original prompt with fresh eyes. If you can identify two or more valid interpretations, ambiguity was the cause.

2. Claude skimmed a long prompt and missed the key sentence

Long prompts (more than ~200 words) sometimes get compressed into “the main ask” — your secondary constraints or specific framing drop out. Claude works from the compressed version.

How to spot it: The work matches the first major instruction in your prompt but ignores conditions added later (“but make sure it doesn’t break the email path”).

3. CLAUDE.md primed a competing default

Your prompt said “do X.” CLAUDE.md says “we always do Y in this area.” Claude reconciled by doing X-in-the-style-of-Y, which isn’t what you wanted. This is common enough that it has an open tracking issue on the Claude Code repo: Claude reads an explicit CLAUDE.md rule, says it understands, then does something else (anthropics/claude-code issue #15443).

How to spot it: Output matches a CLAUDE.md convention you forgot was there, even though your prompt specified differently. Run /memory to see exactly what’s loaded into context from CLAUDE.md files.

4. Claude inferred intent from file context, not the prompt

You said “fix the bug.” Claude read invoice.ts, saw an obvious code smell, decided that was the bug, and “fixed” it. The actual bug was in a different file. The same mechanism produces out-of-scope edits — files you never named getting modified — which is a recurring complaint (anthropics/claude-code issue #41707).

How to spot it: The diff is in a file you didn’t name and didn’t expect to touch.

You asked for the harder fix. Claude saw an easier related issue, fixed that, and called it done. Common when the real fix needs deeper investigation than the model committed to.

How to spot it: The change is small and clean for a problem you were prepared to spend hours on. Suspicious.

6. The prompt described the symptom, Claude addressed a different cause

“Users see wrong timestamps” could be timezone, formatting, server clock, browser locale, or DB column type. Claude picked one cause, fixed it, but the actual cause was another.

How to spot it: The bug is still reproducible after Claude’s “fix.” The reported symptom didn’t pin down which root cause.

Shortest path to fix

Ordered by ROI. Step 1 catches the majority of misalignment at the cheap stage (planning), not the expensive stage (code).

Step 1: Gate every non-trivial task behind a plan you approve

The native tool for this is Plan Mode. Press Shift+Tab to cycle permission modes until the status bar reads ⏸ plan mode on (the cycle is normal → auto-accept edits → plan), or type /plan. Claude investigates read-only, writes out a plan, and stops. Nothing is edited until you accept the plan. Review the plan against your intent, then approve, or send corrections and have it re-plan.

Plan Mode handles the structure for you, but it does not force Claude to surface its own doubts. Add this to the prompt to make ambiguity visible:

Before proposing a plan:
1. Restate the task in 3 bullets:
   - What I'm asking you to do (one sentence)
   - What success looks like (one observable criterion)
   - What is OUT of scope (what you will NOT touch)
2. List 2 alternative interpretations of the prompt you considered and rejected.

The “alternative interpretations” line is the lever — it forces Claude to name the ambiguity instead of silently picking one reading.

Step 2: For bug fixes, demand the failing reproduction first

Before fixing:
1. Reproduce the bug. Show me the exact input that triggers it + the exact wrong output.
2. Trace the failure to the responsible code with file:line.
3. Propose the fix.
4. Wait for my approval before writing code.

This catches “fixed the wrong bug” because you see the reproduction before any fix. If the reproduction doesn’t match what you see, the divergence is already exposed.

Step 3: For multi-step tasks, get a status update after each step

After each step (file changed / test added / migration applied):
- Paste a one-line summary
- Stop and wait for "proceed"
- Do not start the next step until I confirm

You can interrupt mid-task instead of finding misalignment at the end. Press Esc to interrupt the moment a summary looks wrong; the partial work stays so you can redirect from there.

Step 4: When divergence happens, diagnose at the plan, not the diff

If the output doesn’t match intent, don’t just fix the code — figure out where the misread happened:

Your output is correct code for X. I asked for Y.

Walk me through:
1. What did you understand the task to be?
2. What in my prompt led to that understanding?
3. What signal would have led you to interpret Y instead?

The answers tell you which of the six causes hit you, and they improve your next prompt.

Step 5: Use a structured prompt template for non-trivial work

Goal: [one sentence]
Acceptance: [one observable criterion]
Scope: [files allowed]
Out of scope: [files/concerns NOT to touch]
Constraints: [must / must not]
Context: [domain info Claude can't infer]

A six-line template eliminates most ambiguity-driven misreads. The Out of scope line is the one that prevents Cause 4.

Step 6: For repeat divergence, turn the template into a slash command

If similar prompts keep going wrong (“fix bug” → fixes wrong bug), the pattern is the prompt category, not the individual prompt. Save the structured layout as a reusable command so you stop re-typing it.

As of June 2026, custom commands have been merged into skills. Both forms still work and both create a slash command:

Legacy: a Markdown file at .claude/commands/fix-bug.md → /fix-bug
Current (recommended): .claude/skills/fix-bug/SKILL.md → /fix-bug

A project skill lives in .claude/skills/ and is shared with the repo; a personal one in ~/.claude/skills/ follows you across projects. Whatever the user types after the command lands in $ARGUMENTS. Example SKILL.md:

---
description: Reproduce, locate, then propose a fix before writing any code
argument-hint: [bug description]
allowed-tools: Read, Grep, Glob, Bash
---
Bug to fix: $ARGUMENTS

Before editing any file:
1. Restate the bug in one sentence and state what is OUT of scope.
2. Reproduce it: exact input -> exact wrong output.
3. Trace to the responsible code (file:line).
4. Propose the fix and WAIT for my approval.

Now /fix-bug timezone in invoice rendering runs the whole gate every time. Build one per category — /fix-bug, /add-feature, /refactor — each calibrated for that kind of work.

How to confirm it’s fixed

Re-run the original failing reproduction. The symptom you reported must be gone — not a related one.
Open the diff and confirm only the files you scoped were touched. An edit outside Scope means Cause 4 slipped through; reject and re-plan.
Read Claude’s plan summary back against your one-line Acceptance criterion. If they don’t describe the same outcome, the plan was wrong and the code is wrong with it.

Prevention

Keep Plan Mode on for non-trivial work (Shift+Tab to ⏸ plan mode on); approve a plan before any edit.
For bug fixes, require the failing reproduction before the fix proposal.
Turn per-category templates into /fix-bug, /add-feature, /refactor skills so the gate is one keystroke.
Get per-step status updates so divergence is interruptible with Esc.
When divergence happens, diagnose the plan (not the diff) and update the category skill.
Keep CLAUDE.md short and check it with /memory; a stale convention there quietly overrides your prompt.
A five-line specific prompt beats a fifty-line ambiguous one. Specificity beats length.

FAQ

Does Plan Mode cost extra tokens? It adds a planning round, but it reads code without editing, so it is far cheaper than letting Claude write the wrong code, reverting it, and trying again — the revert-and-retry loop is what burns tokens.

My prompt was clear and it still went sideways. Now what? Run the Step 4 diagnostic. Ask what it understood the task to be and what in your prompt led there. Usually it points at a CLAUDE.md default (Cause 3) or context it picked up from an open file (Cause 4), not your wording.

Will switching to Claude Opus 4.7 instead of Sonnet 4.6 fix this? Sometimes for genuinely hard reasoning, but misalignment is mostly a planning-gate problem, not a model-capability problem. A weaker model behind a restate-and-approve gate beats a stronger model running unsupervised.

How do I stop it editing files I didn’t name? Two layers: put an explicit Out of scope line in the prompt, and use allowed-tools / disallowed-tools in a slash-command skill to constrain what it can touch. Plan Mode also surfaces the file list before any write, so you can reject an out-of-scope plan.

I approved the plan but the code still drifted from it. That’s a different failure than this article covers — the plan was right, the execution wasn’t. Use Step 3 (per-step status) so you catch the drift at the first deviating step rather than at the end.

External references: Claude Code Plan Mode (official docs), Slash commands and skills (official docs).

Tags: #Troubleshooting #Claude Code #Debug #Output mismatch

Which bucket are you in?

Common causes

1. The prompt was ambiguous and Claude picked a plausible interpretation

2. Claude skimmed a long prompt and missed the key sentence

3. CLAUDE.md primed a competing default

4. Claude inferred intent from file context, not the prompt

5. Claude solved a related, easier problem

6. The prompt described the symptom, Claude addressed a different cause

Shortest path to fix

Step 1: Gate every non-trivial task behind a plan you approve

Step 2: For bug fixes, demand the failing reproduction first

Step 3: For multi-step tasks, get a status update after each step

Step 4: When divergence happens, diagnose at the plan, not the diff

Step 5: Use a structured prompt template for non-trivial work

Step 6: For repeat divergence, turn the template into a slash command

How to confirm it’s fixed

Prevention

FAQ

Related

Related Articles

Claude Code Bash Sandbox Blocks an Expected Command

Claude Code Not Loading Your Project CLAUDE.md (2026 Fix)

Claude Code Hook Blocks Edit Unexpectedly

Claude Code MCP Call Times Out (or Hangs) Repeatedly

Claude Code Output Truncated by Context Window

Claude Code Session Resume Loses Memory of Prior Work