Why not one big prompt?

One prompt mixes research, decision, and execution, so drift is inevitable. A multi-agent chain forces separation of concerns and gives each step its own clean context window.

Doesn't multi-agent cost more tokens?

Each subagent does run separately, but it works in its own window and returns only a summary, so it can be *cheaper* than one bloated session — and a failed monolith costs more in retries. Route read-only steps to Haiku to cut cost further.

Where do I define a custom subagent?

A Markdown file with YAML frontmatter in `.claude/agents/` (project) or `~/.claude/agents/` (personal). Only `name` and `description` are required. Run `/agents` to scaffold one interactively.

Can I run agents in parallel?

Yes, when their work is independent (template 7). Use the Task tool with `run_in_background` and converge with a CONSOLIDATE agent. Otherwise serialize.

Should the reviewer agent see the plan?

Yes. Without the plan it can only judge whether the code works, not whether it did what was asked.

When is multi-agent overkill?

For tasks under ~30 minutes of work or touching fewer than 3 files, a single focused prompt is faster.

Prompt Library

Multi-Agent Handoff Prompts for Claude Code Subagents

12 prompt templates to hand work between Claude Code subagents — research, plan, implement, review, ship — without losing context. Updated for June 2026.

Published: May 19, 2026 Updated: Jun 14, 2026 Author: AI Productivity Guide Team 🌐 查看中文版本

Most multi-agent runs drift because nothing pins the handoff: the research agent returns three loose paragraphs, the implementer invents a different design, the reviewer argues with both. A good handoff prompt fixes that by specifying three things — what the producing agent must output, what the consuming agent must read first, and what is explicitly out of scope. The 12 templates below are built around that contract, and they map directly onto how Claude Code subagents actually work as of June 2026.

TL;DR

A handoff prompt is a contract: producer output shape, consumer required reads, out-of-scope list, and a STOP path when the shape is wrong.
In Claude Code, each subagent runs in its own context window, so file-based handoffs (one input file, one output file) survive context resets far better than message-based ones.
Built-in subagents (Explore on Haiku, Plan, general-purpose) cover most chains; define custom ones in .claude/agents/*.md when you keep spawning the same worker.
Skip multi-agent for anything under ~30 minutes or fewer than 3 files; the handoff cost is real.

Who this is for

Engineers wiring up Claude Code subagent workflows, solo founders running AI as the whole team, and developers using the Claude Agent SDK to chain calls beyond a single turn.

How Claude Code subagents work (June 2026)

The templates assume Claude Code’s current subagent model, so a quick grounding:

Separate context windows. Each subagent runs with its own context, system prompt, and tool permissions. It does the noisy work (searches, logs, file dumps) in isolation and returns only a summary to the main conversation. That is the whole reason a handoff chain saves context.
Built-in subagents. Claude Code ships Explore (fast, read-only, runs on Haiku, denied Write/Edit so it can’t mutate the repo), Plan (read-only, used in plan mode), and general-purpose (inherits all tools). Many chains below need no custom agents at all.
Custom subagents. Define them as Markdown files with YAML frontmatter in .claude/agents/ (project scope, check into git) or ~/.claude/agents/ (personal). Only name and description are required; useful optional fields are tools (allowlist), model (sonnet, opus, haiku, fable, a full ID like claude-opus-4-7, or inherit — the default), and effort. Run /agents to create or edit them interactively. Claude picks a subagent automatically by matching your task against each agent’s description, so write that line as trigger conditions, not a summary.
Auto-compaction. When a session nears ~95% of its context window, Claude Code summarizes earlier turns and continues. Your handoff files are the safety net if a compaction loses a detail mid-chain.
Agent teams vs subagents. Subagents work within one session and return to a parent. Agent teams are a separate feature where sessions communicate with each other; the file-based contracts here work for both.

Prompt anatomy

Every handoff prompt should carry six elements:

Role: who the agent plays (SRE, release captain, staff engineer, QA lead).
Context: stack, branch, failing logs, diff, dashboard URL.
Goal: one concrete deliverable — root cause, checklist, plan, ticket list, runbook.
Constraints: what the agent MUST NOT do (don’t auto-fix, don’t invent file paths).
Output format: numbered findings, markdown table, JSON, unified diff, runnable code.
Examples / signal: one or two “good” output samples, or a counter-example.

Best for

Research, plan, code, review pipelines
Spec, tests, implementation TDD chains
Parallel agents converging on a shared artifact
Long-running tasks split across agents to control context
Subagent runbooks committed to a repo

12 copy-ready prompt templates

1. Handoff contract template

Before agents run, define the handoff: (1) Producer agent: name + one-sentence job, (2) Output schema (markdown / JSON / file path), (3) Out-of-scope items, (4) Consumer agent: name + what it MUST read first, (5) Failure path: if producer output is wrong shape, consumer says STOP. Output as YAML.

2. Research to Plan

You are the RESEARCH agent. Output a single markdown doc at `notes/research.md` with: (1) the question, (2) 3 candidate approaches, (3) for each: pros, cons, one risk. Do NOT propose a final plan — that's the plan agent's job. Do NOT write code. Stop after writing the file.

Tip: in Claude Code this maps cleanly onto the built-in Explore subagent, which is read-only and Haiku-backed, so it’s cheap to run first.

3. Plan to Implementation

You are the PLAN agent. Read `notes/research.md`. Output `notes/plan.md` with: (1) chosen approach + 1-line rationale referencing research, (2) numbered implementation steps (<= 10), (3) files to touch, (4) tests to add, (5) explicit "do not change" list. No code yet.

4. Implementation to Review

You are the IMPLEMENT agent. Read `notes/plan.md`. Execute exactly the numbered steps. After each step, append a one-line entry to `notes/progress.md`. Do not deviate from the plan; if you find a problem, write it to `notes/blockers.md` and STOP. No autonomous decisions.

5. Review to Ship

You are the REVIEW agent. Read `notes/plan.md` + the diff. For each plan item, mark: DONE / PARTIAL / SKIPPED + evidence (file:line). For SKIPPED / PARTIAL: open a follow-up note. Output `notes/review.md`. Do not change code. If anything is unsafe to ship, write `BLOCK` at top.

6. Spec to Tests to Implementation

Three-agent chain: (a) SPEC agent: write `spec.md` from `[prd-link]`. (b) TEST agent: write failing tests from spec only, never reading source. (c) IMPL agent: make tests pass without changing tests. Output each agent's instructions separately. No agent does the next agent's job.

Variables to swap: [prd-link]

7. Parallel agents to consolidation

Run two agents in parallel on the same task: (a) AGENT-A pursues solution X, (b) AGENT-B pursues solution Y. Both write to `notes/agent-a.md` / `notes/agent-b.md`. A third CONSOLIDATE agent reads both, recommends one with rationale, lists differences. Don't let A or B see each other's output.

In Claude Code, spawn A and B via the Task tool with run_in_background so they run concurrently, then start CONSOLIDATE once both files exist.

8. Context shrinkage handoff

For long sessions approaching the context limit (Claude Code auto-compacts around 95%, but an explicit wrap is more reliable).

You are the WRAP agent. Read the current session and write `notes/session-summary.md` covering: (1) what was decided, (2) what code was changed (file:line summary), (3) open questions, (4) the exact next prompt to feed a fresh agent. Don't make new decisions.

9. Subagent dispatch from a router

You are the ROUTER. Given task: [task], decide which subagent to invoke: RESEARCH (unknowns) / PLAN (knowns, no plan yet) / IMPLEMENT (plan exists) / REVIEW (code exists) / WRAP (long session). Output: agent, reason, prompt-to-pass. Don't do the work — only route.

Variables to swap: [task]

10. Failure handoff

If you (any agent) cannot complete your task, write `notes/blocker.md` with: (1) what you were doing, (2) what stopped you, (3) what info you need, (4) suggested next agent. Do NOT speculate — leave gaps blank. Pass control back to the router.

11. Agent role definitions in CLAUDE.md

Write a `CLAUDE.md` section that defines our subagent roles: RESEARCH / PLAN / IMPLEMENT / REVIEW / WRAP. For each: (1) responsibility, (2) inputs they read, (3) outputs they write, (4) what they MUST NOT do. <= 200 words per role. Plain text — no marketing prose.

Better still, promote each role to a real file in .claude/agents/: a description written as trigger conditions, a tools allowlist, and a model (use haiku for read-only research, sonnet for implementation, opus for review).

12. Handoff debugging

A handoff produced wrong output. Diagnose: (1) Was the contract clear (output shape, out-of-scope list)? (2) Did the consumer agent read the producer's output? (3) Did the producer's context include the consumer's requirements? Output one root cause + one prompt change.

Common mistakes

No handoff contract — every run drifts differently.
Letting one agent both plan and implement — the plan just rationalizes whatever it coded.
Sharing all context with every agent — each drowns in noise.
No “stop and surface” path — agents fabricate when stuck.
Re-running the whole chain to fix the last step — wastes tokens and time.
Agents reading each other’s scratchpads — re-litigates settled decisions.
No router — chains become a tangle of conditionals.
Skipping the description field on a custom subagent, or writing it as a summary instead of trigger conditions — Claude then can’t decide when to delegate.

How to push results further

Make handoffs file-based, not message-based. Files survive context compaction and session restarts; subagent return summaries do not.
Give each agent ONE input file and ONE output file. Smaller surface, fewer drift points.
Have a different agent write the tests than writes the implementation — it catches the bugs the implementer is blind to.
Always include a “STOP and surface” instruction; a blocker note beats a confident fabrication.
Keep the WRAP agent on standby for long sessions as your context safety net.
Commit role definitions to .claude/agents/ so the chain is your team’s shared, version-controlled mental model.
Log each handoff (agent, file, commit SHA) so you can replay exactly where a chain went wrong.
Route cheap read-only work to Explore/Haiku and reserve Opus 4.7 for the review step, where reasoning depth pays off.

FAQ

Why not one big prompt?: One prompt mixes research, decision, and execution, so drift is inevitable. A multi-agent chain forces separation of concerns and gives each step its own clean context window.
Doesn’t multi-agent cost more tokens?: Each subagent does run separately, but it works in its own window and returns only a summary, so it can be cheaper than one bloated session — and a failed monolith costs more in retries. Route read-only steps to Haiku to cut cost further.
Where do I define a custom subagent?: A Markdown file with YAML frontmatter in .claude/agents/ (project) or ~/.claude/agents/ (personal). Only name and description are required. Run /agents to scaffold one interactively.
Can I run agents in parallel?: Yes, when their work is independent (template 7). Use the Task tool with run_in_background and converge with a CONSOLIDATE agent. Otherwise serialize.
Should the reviewer agent see the plan?: Yes. Without the plan it can only judge whether the code works, not whether it did what was asked.
When is multi-agent overkill?: For tasks under ~30 minutes of work or touching fewer than 3 files, a single focused prompt is faster.

Tags: #Prompt #Coding #Claude Code #Agents #Multi-agent