AI Gives a List When You Asked It to Do the Work

You asked the model to write, refactor, or draft something and got a 10-bullet plan instead. Here is why it switches to advice mode and the exact prompt edits that force a finished artifact.

Published: May 20, 2026 Updated: Jun 17, 2026 Author: AI Productivity Guide Team

You asked the model to write a landing page, refactor a function, or draft a launch email. Instead you got a 10-bullet outline of “steps to write a landing page”, “considerations when refactoring”, or “what to include in a launch email”. The deliverable never arrived — only the meta-discussion about how to produce it. This is the planning-mode failure: the model treats your request as advice-seeking and returns a method instead of a finished artifact.

Fastest fix: change the main verb from "how"/"what" to an imperative ("write", "output", "produce", "refactor"), pin the output type and length, and add one line: `Produce the artifact directly, not a plan or list of steps.` The same model that hands you a list when you ask "how should I write the hero copy" will produce the hero copy itself when you ask "Write the hero copy in 2 lines, max 14 words."

One thing that changed in 2026: the ChatGPT model picker (Instant / Thinking / Pro on GPT-5.5) and reasoning modes in general make this worse, not better. Thinking and Pro modes do explicit step-by-step work, so a loosely worded request is more likely to come back as a plan than it was on the old instant-only models. Tightening the verb matters more now, not less.

Which bucket are you in?

Symptom you see	Most likely cause	Jump to
Prompt verb is `how`/`what`/`explain`; output is a tutorial	“How” framing puts model in advice mode	Step 1
Output is numbered phases with no code/content under each	Task is large; model self-defends with a plan	Step 4
You wrote `describe`/`walk me through`; got narrative bullets	Output shape implicitly requested narrative	Step 1 + 3
Single follow-up like “now do step 1” still returns sub-steps	Earlier turns locked a planning frame	Step 6
Using Claude Code and it only ever plans, never edits files	You are in Plan mode (`⏸ plan mode on`)	See “If still broken”
Output is fine in chat but a list via API	No output type/schema pinned	Step 3 + “If still broken”

Common causes

Ordered by frequency.

1. “How” framing puts the model in advice mode

Classic offenders:

"How should I write the function?"
"What is the best way to structure this email?"
"How do I refactor this for readability?"

The model interprets these as method questions and returns a method. It is technically answering you — just not with the artifact.

How to spot it: your prompt leads with `how`, `what`, `explain`, or `approach` instead of an imperative such as `write`, `produce`, `output`, or `return`.

2. Task is large; the model self-defends with planning

When the request is "build me a CRUD app" or "refactor this 800-line file", the model often emits a plan because the full artifact would exceed the output budget or is too large to commit to in one shot. The plan is a safety valve. Reasoning modes (ChatGPT Thinking/Pro, Gemini 3.1 Pro, Claude Opus 4.7 with extended thinking) lean into this: they work through the problem step by step before answering, and that plan often becomes the answer.

How to spot it: output is a numbered list of phases, each labeled with a step name but no actual code or content under each.

3. Output shape implicitly requests narrative

Prompts like "explain how to do X" or "describe the steps to Y" ask for narrative, and narrative naturally takes list form. You did not ask for the artifact — you asked for the description of how to make it.

How to spot it: re-read your prompt; if the verb is descriptive (explain, describe, walk me through), the list is working as designed.

4. Earlier turns established a planning frame

If turn 1 was "let's plan the migration" and turn 5 is "now do the first step", the model often stays in planning mode and returns sub-steps instead of executing.

How to spot it: scroll up. Recent turns are about phases, milestones, or “how to approach”, not concrete artifacts.

5. The model thinks you want to learn, not to ship

Tutorial-style training data is heavily list-shaped. When the request looks like a learning question, the model defaults to teaching mode. "I want to understand X" triggers lists; "produce X" triggers artifacts.

How to spot it: the output reads like a tutorial section heading, with phrases like "first, you should consider...".

6. Verbosity bias on contradictory constraints

If you ask for two things at once, such as "keep it short but cover everything", the model tends to resolve the conflict by being verbose, because more output feels “more complete”. A list is the verbose option, so it picks the list. This is a common failure mode of current chat models, not a you-problem.

How to spot it: your prompt contains conflicting constraints (concise + comprehensive, brief + detailed), and the output hedges by listing.
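The check above can be roughed out as a small lint you run over a prompt before sending it. This is a minimal sketch: `findConstraintConflict` and both word lists are hypothetical and deliberately short, so extend them with the conflicts you actually see.

```typescript
// Hypothetical word lists: brevity words on one side, coverage words on the other.
const BREVITY_WORDS = ["short", "concise", "brief"];
const COVERAGE_WORDS = ["everything", "comprehensive", "detailed", "thorough"];

interface ConstraintConflict {
  brevity: string[];  // brevity words found in the prompt
  coverage: string[]; // coverage words found in the prompt
}

// Returns the clashing words, or null when the prompt asks for only one side.
function findConstraintConflict(prompt: string): ConstraintConflict | null {
  const tokens: string[] = prompt.toLowerCase().match(/[a-z]+/g) ?? [];
  const words = new Set(tokens);
  const brevity = BREVITY_WORDS.filter((w) => words.has(w));
  const coverage = COVERAGE_WORDS.filter((w) => words.has(w));
  return brevity.length > 0 && coverage.length > 0 ? { brevity, coverage } : null;
}

const conflict = findConstraintConflict("Keep it short but cover everything");
console.log(conflict);
```

When the function returns a conflict, drop one side and replace the other with a concrete limit such as a word count.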

7. Ambiguous “give me X” where X could be a doc or a list

"Give me a marketing plan" is ambiguous — is X a plan document, or a list of marketing steps? Models often choose the list because lists feel safer than committing to an opinionated document.

How to spot it: output is bulleted; the deliverable could just as well have been prose, a table, or a code block.

Before you change anything

Confirm whether the issue happens in chat UI or via API; behavior and fixes differ.
In ChatGPT, check which mode the picker is on (Instant / Thinking / Pro). Thinking and Pro plan more.
Write down the exact prompt, the model and mode, the system prompt, and any prior turns.
Save the list output verbatim so you can diff it against the rewrite.
Note whether the task is genuinely large enough to need a plan first.
Check whether the prompt accidentally asked a “how” question instead of issuing an instruction.

Shortest fix path

Ordered by ROI.

Step 1: Rewrite “how” into an imperative verb

Planning prompt	Execution prompt
`"How should I write the function?"`	`"Write the function. Signature: getUser(id: string): Promise<User>. Return runnable TypeScript."`
`"What's the best way to write this email?"`	`"Write the email. Subject line + 4-sentence body. Tone: warm, direct."`
`"How do I refactor this for readability?"`	`"Output the refactored code in full. Same behavior. Maximum function length: 20 lines."`

Shifting the verb from "how" to "write", "output", or "produce" does most of the work.

Step 2: Forbid planning vocabulary

Append:

Output rules:
- Produce the artifact directly, not a plan or list of steps.
- Do not write "first, you should..." or "here is how to approach this".
- No section headings unless the artifact itself requires them.
- Skip any preamble; start with the first line of the artifact.

Step 3: Pin the artifact type and shape

Tell the model exactly what type the output is:

Output type: TypeScript function, single file.
No comments unless explaining non-obvious logic.
Maximum 40 lines.

Or:

Output type: 3-paragraph blog intro.
Length: 80-120 words.
First sentence must contain a specific number or proper noun.

Pinning the output type leaves the model much less room to substitute a list. If you are on the API, do not rely on prose alone; enforce the shape (see “If still broken”).
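If you send execution prompts from code, Steps 1 to 3 can be folded into one template so the imperative, the output type, and the output rules never get dropped. A minimal sketch, assuming a hypothetical `buildExecutionPrompt` helper and `ArtifactSpec` shape (neither comes from any SDK):

```typescript
// Hypothetical helper: the names and fields here are illustrative only.
interface ArtifactSpec {
  task: string;          // imperative instruction, e.g. "Write the function."
  outputType: string;    // e.g. "TypeScript function, single file"
  constraints: string[]; // e.g. ["Maximum 40 lines"]
}

// The output rules from Step 2, appended to every execution prompt.
const OUTPUT_RULES: readonly string[] = [
  "Produce the artifact directly, not a plan or list of steps.",
  'Do not write "first, you should..." or "here is how to approach this".',
  "No section headings unless the artifact itself requires them.",
  "Skip any preamble; start with the first line of the artifact.",
];

// Assembles task, pinned output type, constraints, and rules into one prompt.
function buildExecutionPrompt(spec: ArtifactSpec): string {
  const lines = [
    spec.task,
    "",
    `Output type: ${spec.outputType}.`,
    ...spec.constraints.map((c) => `${c}.`),
    "",
    "Output rules:",
    ...OUTPUT_RULES.map((rule) => `- ${rule}`),
  ];
  return lines.join("\n");
}

const executionPrompt = buildExecutionPrompt({
  task: "Write the function getOrder(id: string): Promise<Order>. Return runnable TypeScript.",
  outputType: "TypeScript function, single file",
  constraints: ["No comments unless explaining non-obvious logic", "Maximum 40 lines"],
});
console.log(executionPrompt);
```

Keeping the rules in one constant means a planning phrase cannot sneak back in when someone edits a single prompt by hand.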

Step 4: For large tasks, split plan and execute into separate turns

If the task genuinely needs a plan, do it explicitly:

Turn 1: "List the 4 sub-tasks needed to refactor this file. No code yet."
Turn 2: "Execute sub-task 1 in full. Output the code, not a plan for the code."

Never combine the two in one prompt: the second half regresses to a list. And keep the plan concrete: a “plan” whose items read "figure out the right approach" or "investigate the issue" is not a plan and will not help the execute turn.
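In a script, the two turns can be kept apart by building each prompt with its own function and only parsing the plan in between. The sketch below is one way to do that; `buildPlanTurn`, `parseSubTasks`, and `buildExecuteTurn` are hypothetical names, and no model SDK is assumed, so sending the prompts is left to your own client.

```typescript
// Turn 1: ask only for a short, numbered list of concrete sub-tasks.
function buildPlanTurn(source: string, subTaskCount: number): string {
  return [
    `List the ${subTaskCount} sub-tasks needed to refactor this file. No code yet.`,
    "Number them, one per line. Each must name a concrete change.",
    "",
    source,
  ].join("\n");
}

// Pull "1. ..." or "2) ..." lines out of the plan text the model returned.
function parseSubTasks(planText: string): string[] {
  const tasks: string[] = [];
  for (const line of planText.split("\n")) {
    const match = /^\s*\d+[.)]\s+(.+)$/.exec(line);
    if (match && match[1]) tasks.push(match[1].trim());
  }
  return tasks;
}

// Turn 2: one sub-task, stated as an imperative, with the artifact demanded.
function buildExecuteTurn(subTaskNumber: number, subTask: string, source: string): string {
  return [
    `Execute sub-task ${subTaskNumber} in full: ${subTask}`,
    "Output the code, not a plan for the code.",
    "",
    source,
  ].join("\n");
}

// Example with a plan written by hand for illustration (not real model output).
const samplePlan = [
  "1. Extract validation into validateInput()",
  "2. Replace nested callbacks with async/await",
].join("\n");
const subTasks = parseSubTasks(samplePlan);
const executeTurn = buildExecuteTurn(1, subTasks[0] ?? "", "/* file contents here */");
console.log(executeTurn);
```

Because the execute prompt is built from a single parsed sub-task, the planning text never reaches the second turn.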

Step 5: Provide a result example, not a plan example

Few-shot the artifact:

Example output:
---
export async function getUser(id: string): Promise<User> {
  const row = await db.query.users.findFirst({ where: eq(users.id, id) });
  if (!row) throw new NotFoundError(`user ${id}`);
  return row;
}
---
Now write the same shape for getOrder(id: string).

The model copies the shape of the example, so an artifact example yields an artifact. A plan example yields a plan — so never show a numbered list as your example.

Step 6: Reset if the conversation is stuck in planning

If turns 1-5 were planning and turn 6 is execution, start a new conversation with only the relevant context. Long planning threads accumulate framing that single follow-ups cannot dislodge. Paste the one input the execute turn needs (the file, the spec) and issue the imperative cold.

How to confirm the fix

The output is a finished artifact (code, prose, table, JSON), not a numbered list of steps.
A teammate could ship the output as-is without further generation.
The first 2 lines of the output are the artifact, not preamble.
No phrases like "first, you should consider..." or "here is how to approach...".
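Part of this checklist can be automated as a rough filter before a human looks at the output. The sketch below is a heuristic, not a definitive test: `checkForPlan`, the phrase list, the preamble pattern, and the 50% threshold are all assumptions, and an artifact that is legitimately a numbered list will trip it.

```typescript
interface PlanCheck {
  isPlan: boolean;
  reasons: string[]; // why the output was flagged, empty when it looks fine
}

// Phrases the checklist above names as planning tells (assumed, extendable).
const PLANNING_PHRASES = ["first, you should", "here is how to approach", "here's how to approach"];

function checkForPlan(output: string): PlanCheck {
  const reasons: string[] = [];
  const lines = output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const lower = output.toLowerCase();

  for (const phrase of PLANNING_PHRASES) {
    if (lower.includes(phrase)) reasons.push(`contains planning phrase: "${phrase}"`);
  }

  // The first line should be the artifact, not a chatty lead-in.
  const firstLine = lines[0] ?? "";
  if (/^(sure|certainly|here(?:'s| is| are))\b/i.test(firstLine)) {
    reasons.push("starts with preamble");
  }

  // Flag outputs where at least half of the lines are numbered steps.
  const stepLines = lines.filter((line) => /^(\d+[.)]\s|step\s+\d+)/i.test(line));
  if (lines.length > 0 && stepLines.length / lines.length >= 0.5) {
    reasons.push("at least half of the lines are numbered steps");
  }

  return { isPlan: reasons.length > 0, reasons };
}

console.log(checkForPlan("1. Understand the code\n2. Identify long functions\n3. Refactor"));
```

A flagged output is a cue to re-run with a tighter prompt; the "could a teammate ship this as-is" check still needs a person.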

If still broken

Using Claude Code? Check the bottom of the terminal. If it reads `⏸ plan mode on`, you are in Plan mode: it analyzes and plans but will not edit files or run commands by design. Press `Shift+Tab` to cycle out of it (on Windows, `Alt+M`), or start the session in normal mode rather than with `claude --permission-mode plan`.
Switch from chat UI to the API and enforce a schema. This is the reliable fix when prose instructions fail.
- OpenAI: pass a strict JSON Schema. JSON Mode (`response_format: { type: "json_object" }`) is legacy as of 2026; it only guarantees valid JSON, not your shape. Use `response_format: { type: "json_schema", json_schema: {..., strict: true} }` (Chat Completions) or `text.format` in the Responses API.
- Anthropic: prefill is no longer supported on Claude Opus 4.7 / Sonnet 4.6 (it returns a 400 “Prefilling assistant messages is no longer supported”). Use structured outputs via `output_config.format` with a JSON schema, or a system prompt that names the exact output type.
- Google: set `responseMimeType: "application/json"` plus a `responseSchema` (works with Zod/Pydantic) on Gemini 3.1 Pro.
Define the schema as `{ "artifact": "string" }`, not `{ "steps": ["string"] }`. The field name itself steers the model away from a list.
In ChatGPT, drop to Instant mode for short, well-specified execution tasks; reserve Thinking/Pro for genuinely hard reasoning. A picker mode that plans by default will keep planning.
Lower temperature to around 0.3 for execution tasks; 0.7-1.0 invites verbose meta-commentary.
Use a stronger model if a small model is hitting a capability ceiling and falling back to lists.
Set reasoning depth via the API parameter where one exists, not by writing “really think hard about this” in the prompt — that just burns tokens that could carry your actual instruction.
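The schema advice above can be sketched as a request body for the Chat Completions `json_schema` format, with a single `artifact` string field. This is an illustration under assumptions: `buildArtifactRequest` and `parseArtifact` are hypothetical helpers, the model id is a placeholder, no network call is made, and the field names should be checked against the current API reference before use.

```typescript
// Builds a Chat Completions request body that pins the output to
// { artifact: string }. Strict mode requires every property to be listed
// in `required` and `additionalProperties` to be false.
function buildArtifactRequest(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "artifact_response", // hypothetical schema name
        strict: true,
        schema: {
          type: "object",
          properties: { artifact: { type: "string" } },
          required: ["artifact"],
          additionalProperties: false,
        },
      },
    },
  };
}

// Validates the JSON text the model returned and extracts the artifact.
function parseArtifact(jsonText: string): string {
  const parsed: unknown = JSON.parse(jsonText);
  if (typeof parsed === "object" && parsed !== null && "artifact" in parsed) {
    const value = (parsed as { artifact: unknown }).artifact;
    if (typeof value === "string") return value;
  }
  throw new Error("response did not match { artifact: string }");
}

// "your-model-id" is a placeholder, not a real model name.
const artifactRequest = buildArtifactRequest(
  "your-model-id",
  "Write the hero copy in 2 lines, max 14 words.",
);
console.log(JSON.stringify(artifactRequest, null, 2));
```

Checking the response with `parseArtifact` on your side catches the case where a different endpoint or an older model ignores the schema.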

Prevention

Maintain separate “plan” and “execute” prompt templates; never mix them in one turn.
In execution prompts, ban the words "approach", "consider", "you might", and "steps to".
Audit your last 10 prompts: count how many lead with "how" versus "write"/"output"/"produce". Aim for at least 80% imperative.
Use few-shot examples of artifacts, not of plans.
In tools that support structured output, define the schema as { artifact: string }, not { steps: string[] }.
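The audit in the list above is easy to script if you keep a log of your prompts. A minimal sketch, assuming a hypothetical `auditPrompts` helper that classifies each prompt by its first word; the opener patterns are assumptions and will miss prompts that bury the verb mid-sentence.

```typescript
// Assumed opener patterns; adjust to the verbs you actually use.
const PLANNING_OPENERS = /^(how|what|explain|describe|walk me through)\b/i;
const IMPERATIVE_OPENERS = /^(write|output|produce|refactor|return|draft)\b/i;
// Banned words for execution prompts, taken from the list above.
const BANNED_WORDS = ["approach", "consider", "you might", "steps to"];

interface PromptAudit {
  imperative: number;
  planning: number;
  other: number;
  imperativeShare: number; // imperative / total, 0 when there are no prompts
  bannedHits: string[];    // imperative prompts that contain a banned word
}

function auditPrompts(prompts: string[]): PromptAudit {
  let imperative = 0;
  let planning = 0;
  let other = 0;
  const bannedHits: string[] = [];

  for (const raw of prompts) {
    const prompt = raw.trim();
    if (IMPERATIVE_OPENERS.test(prompt)) {
      imperative += 1;
      const lower = prompt.toLowerCase();
      if (BANNED_WORDS.some((word) => lower.includes(word))) bannedHits.push(prompt);
    } else if (PLANNING_OPENERS.test(prompt)) {
      planning += 1;
    } else {
      other += 1;
    }
  }

  const total = prompts.length;
  return {
    imperative,
    planning,
    other,
    imperativeShare: total === 0 ? 0 : imperative / total,
    bannedHits,
  };
}

const audit = auditPrompts([
  "Write the hero copy in 2 lines, max 14 words.",
  "How should I write the function?",
  "Output the refactored code in full.",
  "Produce a 3-paragraph blog intro; consider the steps to onboarding.",
  "Refactor getUser to use async/await.",
]);
console.log(audit);
```

An `imperativeShare` below 0.8 or a non-empty `bannedHits` list points at the prompts to rewrite first.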

FAQ

Why does GPT-5.5 Thinking give me a plan when GPT-5.5 Instant just did the task?

Thinking and Pro modes spend explicit reasoning before answering, and that reasoning often surfaces as a visible plan. For short, fully specified work (write this function, draft this email), switch the picker to Instant. Save Thinking/Pro for tasks where the hard part is the reasoning, not the typing.

I told it “be concise” and it still gave a long list. Why?

Concise plus a vague verb is a contradiction the model resolves by listing; a list feels like complete coverage. “Concise” is not an output type. Replace it with a concrete type and limit: `Output type: one paragraph, max 60 words.`

Does forcing a JSON schema actually stop the list problem?

Yes, more reliably than any prose instruction. A strict schema with a single `artifact: string` field gives the model nowhere to put a list. Use OpenAI `json_schema` strict mode, Anthropic `output_config.format`, or Gemini `responseSchema`. Avoid a `steps: string[]` field unless you actually want steps.

Claude Code keeps planning and never edits files. What’s wrong?

You are almost certainly in Plan mode. Look for `⏸ plan mode on` at the bottom of the terminal. Plan mode intentionally blocks file edits and commands until you approve. Press `Shift+Tab` to leave it (Windows: `Alt+M`), then re-issue the request.

Should I just ask it to “do it, don’t explain”?

That helps but is not enough on its own. Pair it with a named output type and length. The strongest single line is: `Produce the artifact directly, not a plan or list of steps; start with the first line of the output.`

Tags: #Troubleshooting #Prompt #Prompt quality #Prompt engineering