Claude Code Execution Prompts: Plan, Build, Verify

12 copy-ready prompts to brief Claude Code (or Codex) on real engineering work — scoped features, surgical bug fixes, migrations, refactors, TDD, perf, CI debugging — mapped to Plan Mode and /rewind.

Published: May 17, 2026 | Updated: Jun 05, 2026 | Author: AI Productivity Guide Team

Claude Code performs best when a prompt gives it four things upfront: a scoped task, the files to read first, what “good” looks like, and what NOT to touch. Drop any of the four and you get over-eager refactors, surprise schema changes, and PRs that need to be unwound. The 12 prompts below carry those four parts and add explicit approval gates where the cost of being wrong is high.

They also lean on two Claude Code features that exist for exactly this reason. Plan Mode (press Shift+Tab twice, or type /plan, or launch with claude --permission-mode plan) is read-only: Claude reads your repo and drafts a plan but cannot edit files or run commands until you approve. Checkpoints let you undo: every prompt you send creates one, and /rewind (or pressing Esc twice on an empty input) restores the code, the conversation, or both. Several prompts here say “show the plan first” precisely so you spend the cheap checkpoint before any code is written. (As of June 2026, Claude Code runs Anthropic models only — Sonnet 4.6 as the default workhorse, Opus 4.7 for the hardest agentic tasks; pick with the /model command. Codex users can adapt the same prompt shapes.)

Pair these with code review prompts before you merge.

TL;DR

Every prompt names files to read, files it may change, files it must NOT touch, and a definition of done. That four-part shape is the whole game.
High-stakes prompts (migrations, refactors, dependency bumps) demand a plan or commit map first, then wait for approval — spend the checkpoint before the diff.
Run risky work in Plan Mode so Claude can’t write until you say go; keep /rewind ready for once it starts editing.
Default to Sonnet 4.6; switch to Opus 4.7 (/model) only for multi-file refactors and gnarly debugging where reasoning depth pays for itself.

Best for

Feature implementation against an existing codebase
Surgical bug fixes with minimum diff
Multi-step migrations with rollback
3-phase refactors that stay deployable
TDD-style implementation and perf work
Spike code to validate an approach before committing

1. Scoped feature implementation

Implement {feature} in {repo}.

Files to read first: {list}
Files you may change: {list}
Files you MUST NOT change: {list}
What good looks like:
- tests pass
- no schema migration
- new code follows the existing patterns in {file}
- diff is reviewable in <300 lines

First show a plan as bullet steps. Wait for my approval before editing.

Run this in Plan Mode (Shift+Tab twice) so Claude physically cannot edit until you approve the plan.
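If you reuse this template often, it can help to fill it from code so that no part gets dropped by accident. The following is a minimal sketch, not an official tool: the `ScopedPrompt` class, its field names, and the example paths are all hypothetical, and the rendered text simply mirrors the template above.

```python
from dataclasses import dataclass


@dataclass
class ScopedPrompt:
    """Hypothetical container for the parts of a scoped-feature prompt."""

    task: str
    read_first: list[str]
    may_change: list[str]
    must_not_change: list[str]
    done_when: list[str]

    def render(self) -> str:
        # Refuse to build a prompt with an empty part, since a dropped
        # part is what leads to over-eager edits.
        missing = [name for name, value in vars(self).items() if not value]
        if missing:
            raise ValueError("missing prompt parts: " + ", ".join(missing))
        lines = [
            self.task,
            "",
            "Files to read first: " + ", ".join(self.read_first),
            "Files you may change: " + ", ".join(self.may_change),
            "Files you MUST NOT change: " + ", ".join(self.must_not_change),
            "What good looks like:",
            *[f"- {item}" for item in self.done_when],
            "",
            "First show a plan as bullet steps. Wait for my approval before editing.",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    # The task and every path below are made-up placeholders.
    prompt = ScopedPrompt(
        task="Implement CSV export in the billing service.",
        read_first=["billing/export/pdf.py", "billing/models.py"],
        may_change=["billing/export/csv.py", "tests/test_csv_export.py"],
        must_not_change=["db/schema.sql", "billing/models.py"],
        done_when=["tests pass", "no schema migration", "diff is under 300 lines"],
    )
    print(prompt.render())
```

Raising on an empty list is a deliberate choice here: a prompt without a "must not change" line still looks complete, so the gap is easy to miss by eye.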

2. Surgical bug fix

Fix bug: {description}.
Repro steps: {steps}.
Expected vs actual: {what should happen vs what happens}.

Constraints:
- minimum-diff fix only, no refactoring
- add 1 regression test that fails before the fix
- files likely involved: {list}

Show the diff before applying it.
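The constraint "add 1 regression test that fails before the fix" is easy to state and easy to skip, so here is a small illustration of what it means in practice. The pagination bug, the function names, and the `fails` helper are invented for this sketch; the point is only that the same test is run against the pre-fix and post-fix code.

```python
def page_slice_buggy(items, page, size):
    """Pre-fix version: treats `page` as 0-indexed, but callers pass 1-indexed pages."""
    start = page * size
    return items[start:start + size]


def page_slice_fixed(items, page, size):
    """Minimum-diff fix: only the start offset changes."""
    start = (page - 1) * size
    return items[start:start + size]


def regression_test(page_slice):
    """The single regression test: page 1 must return the first `size` items."""
    result = page_slice(list(range(10)), page=1, size=3)
    if result != [0, 1, 2]:
        raise AssertionError(f"page 1 returned {result}")


def fails(test, implementation):
    """Return True if `test` raises AssertionError for `implementation`."""
    try:
        test(implementation)
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    print("fails before fix:", fails(regression_test, page_slice_buggy))
    print("fails after fix:", fails(regression_test, page_slice_fixed))
```

A test that passes on both versions proves nothing about the bug, which is why the prompt asks for one that is red before the fix.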

3. Migration with rollback path

Migrate {from} → {to} in {area}.

Constraints:
- each commit must be deployable on its own
- must support both old and new during the transition window
- rollback = reverting commit by commit, no manual cleanup

Show me the commit plan first: title + diff scope of each commit. Wait for approval before executing.

For migrations the commit map is the safety net: if a later commit goes wrong, you revert it cleanly instead of using /rewind, which does not undo file changes made by bash commands (mv, rm, generated files).
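To make "must support both old and new during the transition window" concrete, here is one possible shape for a first commit, sketched under assumptions: the migration is a renamed config key, and both key names (`timeout` in seconds, `timeout_ms` in milliseconds) are hypothetical. A real migration will look different, but the pattern of reading the new form first and falling back to the old one carries over.

```python
def read_timeout_ms(config: dict) -> int:
    """Read the timeout, accepting both the old and the new config key.

    Hypothetical keys for illustration:
      - "timeout_ms": new key, milliseconds
      - "timeout":    legacy key, seconds
    """
    if "timeout_ms" in config:
        # New key wins whenever it is present.
        return int(config["timeout_ms"])
    if "timeout" in config:
        # Legacy fallback, kept only for the transition window.
        return int(config["timeout"] * 1000)
    raise KeyError("neither 'timeout_ms' nor 'timeout' is set")


if __name__ == "__main__":
    print(read_timeout_ms({"timeout": 2}))
    print(read_timeout_ms({"timeout_ms": 1500}))
```

Because this reader works with either key, it can ship on its own; a later commit can switch writers to the new key, and the final commit can delete the fallback. Each step reverts cleanly without manual cleanup.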

4. Refactor in 3 phases

Refactor {component}. Three phases, each mergeable independently:

Phase 1: introduce new abstraction alongside old (no caller changes)
Phase 2: migrate callers one at a time, with the old still available
Phase 3: delete old code, no compat shim

Show me the Phase 1 diff only first. Do not touch callers yet.
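As a rough picture of what a Phase 1 diff might contain, the sketch below adds a new abstraction next to an old function without touching the old function or its callers. The `format_price` function, the `Money` class, and the parity check are all hypothetical examples, not part of any real codebase.

```python
from dataclasses import dataclass


def format_price(cents: int) -> str:
    """Old API: left untouched in Phase 1, so every existing caller keeps working."""
    return f"${cents / 100:.2f}"


@dataclass(frozen=True)
class Money:
    """New abstraction (hypothetical), introduced alongside the old function."""

    cents: int

    def format(self) -> str:
        # Integer arithmetic only, so no float rounding is involved.
        return f"${self.cents // 100}.{self.cents % 100:02d}"


def outputs_match(samples) -> bool:
    """Check that old and new produce identical output for the given samples."""
    return all(format_price(c) == Money(c).format() for c in samples)


if __name__ == "__main__":
    # Non-negative sample amounts only; negative amounts are out of scope here.
    print(outputs_match([0, 5, 1999, 100000]))
```

A parity check like this gives Phase 2 something to lean on: callers can move one at a time, and any behavioral drift between old and new shows up before the old code is deleted in Phase 3.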

5. New module from spec

Build module {name} per this spec: {paste}.

Codebase conventions to respect: {1 paragraph — folder layout, naming, error handling, test style}.
Existing similar module to mirror: {file}.

Show me the file structure + public interfaces (no implementation) before writing the bodies. I'll approve, then you implement.

6. Test-driven implementation

Implement {feature} test-first.

1. Write failing tests for these behaviors: {list}
2. Show me the tests. Wait for approval.
3. Implement the minimum code to make tests pass.
4. Refactor only after all tests are green.

Do not add tests for behaviors I didn't list.

7. Spike to validate an approach

I want to validate the approach for {feature}. Build a 50-LOC spike that demonstrates the core idea.

- Don't worry about edge cases or error paths
- Hard-code config where you'd normally read it
- Annotate with `// TODO: real impl needs to handle X` at every shortcut

Goal: I want to read the spike and decide if the approach is right before we commit.

8. Performance fix with measurement

Performance issue: {description}. Suspected cause: {hypothesis or "unknown"}.

Steps:
1. Add a benchmark that reproduces the slow path
2. Run baseline; report the number
3. Iterate on the fix; after each iteration, report benchmark delta vs baseline
4. Stop when an iteration improves on the previous one by less than 5%, or when we hit the target of {X}

Show me the benchmark before any code changes.
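The stopping rule in step 4 can be written down as a small function, which is handy if you want to check Claude's reported numbers yourself. This sketch rests on two assumptions: lower numbers are better (for example, latency in milliseconds), and the 5% threshold is read as the gain of one iteration over the previous one. The function and parameter names are hypothetical.

```python
def delta_vs_baseline(baseline_ms: float, current_ms: float) -> float:
    """Improvement relative to the baseline, as a fraction (0.25 means 25% faster)."""
    return (baseline_ms - current_ms) / baseline_ms


def should_stop(previous_ms: float, current_ms: float, target_ms: float,
                min_gain: float = 0.05) -> bool:
    """Stop when the target is met or the last iteration gained less than `min_gain`."""
    if current_ms <= target_ms:
        return True
    gain = (previous_ms - current_ms) / previous_ms
    return gain < min_gain


if __name__ == "__main__":
    # Made-up benchmark readings in milliseconds; the first is the baseline.
    readings = [200.0, 150.0, 120.0, 118.0]
    for previous, current in zip(readings, readings[1:]):
        print(
            f"{current:.0f} ms, "
            f"delta vs baseline {delta_vs_baseline(readings[0], current):.0%}, "
            f"stop={should_stop(previous, current, target_ms=100.0)}"
        )
```

Reporting the delta against the baseline while stopping on the per-iteration gain keeps the two numbers separate: the first tells you how far you have come, the second tells you whether another round is worth it.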

9. Read-only investigation

Investigate {question or symptom} in {repo}. Do not edit any files in this pass.

Steps:
1. Read these entry points: {list}
2. Trace the relevant call graph
3. Output: a short writeup with (a) what is happening, (b) why, (c) 2-3 candidate fixes ranked by risk

I'll pick the fix; then we'll do a separate edit pass.

This is what Plan Mode is built for. You can also let Claude spin up its read-only Explore subagent to trace the call graph without burning context on your main session.

10. Diff review before merge

Below is the full diff of {branch}. Review for:
- correctness vs the original ticket: {paste ticket}
- regressions in {sibling area that often breaks}
- missing tests for new behavior
- naming and structural drift from the conventions in {file}

Output: a structured comment list — line + concern + suggested change.

{paste diff}

11. Failing-CI debug

CI is red on {branch}. The failing job: {job name}. Log excerpt: {paste relevant section, not the whole log}.

Diagnose:
1. What is the test asserting?
2. Why is it failing now? Recent change in {area}?
3. Smallest fix: change the test, change the code, or revert {commit}?

Show your reasoning before suggesting a fix.

12. Dependency upgrade with risk map

Upgrade {dep} from {old} → {new} in {repo}.

Output a risk map first:
- Breaking changes in the changelog that touch our usage
- Files in our code that hit the changed surface
- Tests that should be added or expanded
- Rollback plan

Then propose the smallest first PR. Do not bundle this with unrelated changes.

Which model for which prompt

Claude Code runs Anthropic models only. Switch with /model; below is a sensible default split (as of June 2026).

Prompt type	Model	Why
Surgical bug fix, single-file change, spike	Sonnet 4.6	Fast, cheap, plenty smart for scoped work; the default for most sessions
Multi-file refactor, migration planning, dependency risk map	Opus 4.7	Deeper reasoning across files; SWE-bench Verified 87.6% vs Gemini 3.1 Pro 80.6%
Long read-only investigation across a large repo	Sonnet 4.6 (or Opus Plan via `/model` option 4)	1M-token context on both; let the Explore subagent do the reading
Diff review before merge	Sonnet 4.6	A structured comment list rarely needs Opus-level reasoning

A common money-saver is “Opus Plan” mode: Opus 4.7 drafts the plan, then Sonnet 4.6 executes it, so you pay for top-tier reasoning only on the part that needs it.
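To see roughly what the split saves, the arithmetic can be sketched with the API prices quoted in this article's FAQ (Sonnet 4.6 at $3 input / $15 output per million tokens, Opus 4.7 at $5 / $25, as of June 2026). The model labels are shorthand rather than API model IDs, and the token counts in the example are invented; substitute your own usage.

```python
# USD per million tokens, as quoted in this article (June 2026).
# The dictionary keys are informal labels, not API model identifiers.
PRICES = {
    "sonnet-4.6": {"input": 3.0, "output": 15.0},
    "opus-4.7": {"input": 5.0, "output": 25.0},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one phase of work on a single model."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def opus_plan_cost(plan_tokens: tuple, exec_tokens: tuple) -> float:
    """Plan on Opus, execute on Sonnet. Each tuple is (input_tokens, output_tokens)."""
    return cost_usd("opus-4.7", *plan_tokens) + cost_usd("sonnet-4.6", *exec_tokens)


if __name__ == "__main__":
    # Hypothetical session: a small planning phase and a larger execution phase.
    plan = (200_000, 20_000)
    execute = (800_000, 80_000)
    all_opus = cost_usd("opus-4.7", plan[0] + execute[0], plan[1] + execute[1])
    split = opus_plan_cost(plan, execute)
    print(f"all Opus:  ${all_opus:.2f}")
    print(f"Opus Plan: ${split:.2f}")
```

The saving grows with the share of tokens spent in execution, so the split pays off most on tasks where the plan is short and the edits are long.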

Common mistakes

“Improve this code” with no scope — Claude Code rewrites half the file. Always name the files it may touch and the ones it must not.
Skipping the read-first list — Claude infers patterns from its training rather than from your repo, and the diff drifts from your conventions.
No “do not touch” constraint — adjacent files get drive-by edits that bloat the PR.
Asking for code before asking for a plan — you lose the cheapest checkpoint. Use Plan Mode so it can’t skip the gate.
Bundling refactor plus feature in one PR — neither half is reviewable in isolation, and a bad review forces an all-or-nothing revert.
Trusting /rewind to undo everything — it tracks Claude’s file edits only, not changes from bash commands like rm or mv. Commit before you let Claude run destructive shell steps.

FAQ

Do these prompts work with Codex or Cursor too?

The four-part shape (read / may-change / must-not-touch / definition of done) is tool-agnostic, so yes. What is specific to Claude Code is the wiring: Plan Mode, /rewind checkpoints, and the /model picker. In Cursor you would lean on its own plan and agent modes instead; in Codex CLI you would add an explicit “show the plan first, wait for approval” line, since there is no read-only gate enforced by the tool.

How do I make sure Claude actually reads files before planning?

List the exact paths under “Files to read first” and run in Plan Mode. Plan Mode is read-only and will read your repo, but note that the Explore and Plan subagents skip your CLAUDE.md to stay fast. When conventions must shape the plan, put them directly in the prompt, not only in CLAUDE.md.

Sonnet 4.6 or Opus 4.7 for daily coding?

Default to Sonnet 4.6: it handles the large majority of scoped tasks fast and cheaply. Reach for Opus 4.7 (/model) on multi-file refactors, migrations, and debugging that needs deeper reasoning. API pricing as of June 2026: Sonnet 4.6 is $3 input / $15 output per million tokens; Opus 4.7 is $5 / $25.

What if the plan looks wrong after Claude has already edited?

Run /rewind (or Esc twice on an empty prompt) and restore the conversation, the code, or both to any earlier checkpoint. Checkpoints persist across sessions and are auto-cleaned after 30 days. Remember the bash-command caveat above.

Why ask for a 50-LOC spike instead of the real implementation?

A spike is a throwaway probe to validate the approach before you commit. Reading 50 lines tells you whether the design holds together far faster than reviewing a 600-line “finished” PR you may have to scrap.

Tags: #Prompt #AI coding #Claude Code
