Claude Extended Thinking Toggle On but No Thinking Traces

You enabled extended thinking and Claude replies instantly with no visible reasoning. Usually plan limits, prompt routing, or a stale toggle — diagnose and force thinking back on.

You flip on the Extended Thinking toggle, ask a question that clearly needs reasoning, and Claude fires back an answer in under two seconds with no “thinking” trace expanded above it. The toggle still looks on. You wonder if extended thinking is silently disabled, gated by your plan, or maybe it ran but the trace is collapsed. The root cause is usually one of: the toggle UI state is out of sync with the actual model routing, your plan does not include extended thinking on this model, the prompt was too short to trigger thinking, or the trace did render but it is hidden under a collapsed accordion. Each one has a quick check.

Common causes

Ordered roughly by how often each is the actual reason.

1. Toggle is on visually but reset server-side

The Extended Thinking toggle persists per conversation. Refreshing the page or switching models occasionally resets the server-side flag while the UI button still looks active.

How to judge: Toggle it off, save, toggle it back on, then send a fresh test prompt.

2. Plan does not include extended thinking on this model

Free tier and some Pro models do not expose extended thinking. The toggle appears but does nothing on those routings.

How to judge: Profile, Settings, Plan. Confirm extended thinking is listed as included. Check which model is selected for the conversation.

3. Prompt was too simple to trigger a thinking pass

Even with extended thinking on, Claude can skip the reasoning step if the prompt is short and direct (“What is 2+2?”). The model decides per-turn whether to think.

How to judge: Try a clearly hard prompt: “Walk me through how you would design a rate limiter that handles 10k QPS with burst absorption.” If a trace appears for this one, the prior prompt simply was not gating-worthy.

4. Thinking trace rendered but the accordion is collapsed

The trace shows as a small “Thinking” pill above the reply. Easy to miss, especially on mobile. Clicking expands the full reasoning.

How to judge: Look above the assistant’s reply for a small collapsed banner. Click to expand.

5. Conversation context too long, thinking budget exhausted

In very long conversations, the thinking budget gets eaten by context. The model continues responding but skips the explicit thinking step to stay under budget.

How to judge: How long is the conversation? Past 50 turns or 100k tokens, thinking often gets squeezed out.

6. A tool call replaced the thinking pass

When Claude decides to use a tool (web search, code execution), the tool call sometimes replaces the visible thinking trace. Reasoning happens inside the tool plan but is not surfaced.

How to judge: Did the reply contain a tool call? If yes, that is why no separate thinking trace appeared.

Before you start

  • Confirm which model is selected for the conversation (Claude 4 Opus, Sonnet, etc.). Routing affects thinking availability.
  • Decide whether you need thinking for this specific prompt or want it on by default.
  • Have one clearly hard test prompt ready to verify thinking is actually firing.

Information to collect

  • Account plan (Free, Pro, Team, Enterprise).
  • Model selected at the top of the conversation.
  • Whether the toggle shows as on in the composer area.
  • Conversation length (rough token or message count).
  • Sample prompts that did and did not trigger thinking.
  • Browser, device, and any extensions that might intercept UI state.

Step-by-step fix

Step 1: Force-reset the toggle

In the composer, click Extended Thinking off. Send a quick test message. Then toggle it back on and send another. The reset re-syncs the server-side flag with the UI.

Step 2: Verify the model supports thinking

Click the model name at the top of the conversation. Confirm you are on a model that supports extended thinking (Claude 4 Opus, Sonnet variants). If you are on a model without it, switch.

Step 3: Send a deliberately hard prompt

Test with: “Explain the trade-offs between optimistic locking and pessimistic locking for a 10-tenant SaaS, with examples.” A clearly hard prompt should show a visible thinking trace within 5 seconds of starting.

Step 4: Expand the thinking accordion

Look above the reply for a small “Thinking” banner. Click to expand. If the trace is in there, thinking was on the whole time — you only missed the collapsed UI.

Step 5: Start a fresh conversation

If a long thread has exhausted the thinking budget, start a new conversation, toggle thinking on, and re-paste the prompt. New conversations get the full budget.

Step 6: Check plan and billing status

Profile, Settings, Plan. Confirm your subscription is active and the plan includes extended thinking on your chosen model. Lapsed Pro accounts silently fall back to Free-tier routing.

Step 7: Report a persistent failure

If thinking never triggers across fresh conversations, hard prompts, and confirmed plan support, file a ticket at support.anthropic.com with conversation IDs, the model name, and screenshots of the toggle state.

Verify

  • A “Thinking” banner appears above the reply on hard prompts.
  • Expanding the banner shows multi-paragraph reasoning, not just a one-liner.
  • Reply latency is noticeably higher (5-30 seconds) than non-thinking replies.
  • Repeating the prompt in a new conversation also shows thinking.

Long-term prevention

  • Get in the habit of phrasing hard prompts in a way that obviously rewards reasoning (“walk me through,” “compare and contrast,” “design and justify”).
  • For Projects that should always think, add a custom instruction: “Use extended thinking for any prompt requiring multi-step reasoning or trade-off analysis.”
  • Periodically check the toggle on long-running conversations; it can reset.
  • Pin the thinking-enabled model in Settings if you have a preferred one.
  • For automated workflows via the API, set the thinking parameter explicitly per request.

Common pitfalls

  • Trusting the toggle without verifying with a hard test prompt.
  • Asking simple questions and concluding thinking is broken — it just was not needed.
  • Missing the collapsed accordion, especially on mobile screens.
  • Forgetting that tool calls can replace the visible thinking step.
  • Assuming all models support thinking. Free-tier and some lightweight models do not.

FAQ

  • How do I tell if extended thinking actually ran? Look for the Thinking banner above the reply. Reply latency over 5 seconds with a clearly reasoned answer is another good signal.
  • Does extended thinking cost more? Yes, it consumes more tokens and is metered against your usage budget on Pro and Team.
  • Can I always force thinking on? No, the model decides per-turn whether to think, even with the toggle on. You can encourage it via prompt framing.
  • Is thinking the same as Constitutional AI safety reasoning? No, those are different layers. Extended thinking is exposed visible reasoning; safety review is not.
  • Why does the trace get truncated? Long traces get summarized in the UI. You can ask “show me your full reasoning” to get more.
  • Does thinking work with tools? Yes, but the visible trace may be replaced by tool calls. Reasoning still happens internally.

Tags: #Claude #Troubleshooting #thinking