How do I tell if thinking actually ran?

Look for the `Thinking` indicator above the reply (it shows a timer while running). A reply that took several seconds and reads like reasoned analysis is another signal.

Why does turning thinking up not always produce a trace?

Adaptive thinking is per-turn. At `High`/`Max` it almost always thinks, but a trivially easy prompt can still be answered directly. Give it a hard prompt to confirm.

What is the difference between Effort and the old Extended Thinking toggle?

Opus 4.7, Opus 4.6, and Sonnet 4.6 replaced the fixed thinking-token budget with an `Effort` level (`Low`/`Medium`/`High`/`Max`, plus `Xhigh` on Opus 4.7). `High` is the default and equals not setting it.

Does thinking cost more?

Yes. Higher Effort spends more tokens (thinking and response), so it counts against your usage limits on Free, Pro, and Max, and against billing on the API.

How do I force thinking on via the API?

On Opus 4.7/4.6 and Sonnet 4.6, set `thinking: {type: "adaptive"}` and `output_config: {effort: "high"}` (or `"max"`). On Opus 4.7, the old `thinking: {type: "enabled", budget_tokens: N}` is no longer supported; use adaptive thinking with effort instead.

Does thinking work with tools?

Yes, but the visible trace may be replaced by the tool call. The reasoning still happens; it just moves into the tool plan.

Troubleshooting

Claude Thinking Toggle On but No Reasoning Trace

You enabled extended thinking and Claude replies instantly with no visible reasoning. As of June 2026 it is usually a low effort level, an easy prompt under adaptive thinking, a collapsed trace, or plan limits. Diagnose and force thinking back on.

Published: May 24, 2026 Updated: Jun 15, 2026 Author: AI Productivity Guide Team 🌐 查看中文版本

You turn on thinking, ask a question that clearly needs reasoning, and Claude fires back an answer in a second or two with no Thinking trace above it. The toggle still looks on, and you wonder whether thinking is silently broken.

Fastest fix: click the model name next to the send button, hover Effort, and set it to High or Max, then resend a genuinely hard prompt. On the current Claude models (Opus 4.7, Opus 4.6, Sonnet 4.6) thinking is adaptive — the model decides per turn whether to think, and at Low/Medium effort it skips thinking on easy prompts. High (the default) and Max make it almost always think.

The big change to know: as of June 2026 there is no single token-budget “Extended Thinking” switch on these models anymore. Thinking depth is controlled by an Effort setting, and a stray Low effort level is now the number-one reason a hard prompt comes back with no trace.

Common causes

Ordered roughly by how often each is the actual reason on Claude.ai as of June 2026.

#	Cause	Fastest check
1	Effort set to `Low`/`Medium`, so adaptive thinking skipped the prompt	Model name → Effort → set `High` or `Max`, resend
2	Prompt was easy enough that the model chose not to think	Send a deliberately hard prompt (see Step 3)
3	Trace rendered but the `Thinking` accordion is collapsed	Look just above the reply for a `Thinking` pill; click it
4	A tool call (web search, code) replaced the visible trace	Did the reply run a tool? Reasoning moved inside the tool plan
5	On a non-effort/older model, the `Extended` toggle is off	Model name → switch the `Extended` toggle on
6	Plan usage limits throttled you to a lighter routing	Check plan/usage; lapsed Pro falls back to Free routing
7	Very long conversation squeezed out the thinking step	Start a fresh chat and resend

1. Effort level is too low (most common now)

Opus 4.7, Opus 4.6, and Sonnet 4.6 use adaptive thinking: instead of a fixed thinking budget, you pick an Effort level (Low, Medium, High, Max, plus Xhigh on Opus 4.7). Per Anthropic’s docs, at High (the default) and Max “Claude will almost always think,” but “at lower effort levels, it may skip thinking for simpler problems.” If your Effort is on Low or Medium, that alone explains a missing trace.

How to judge: Click the model name next to the send button, hover Effort, and read the current level. If it is Low or Medium, raise it to High or Max.

2. Prompt was too easy to trigger a thinking pass

Even at High effort, adaptive thinking can answer a short, direct prompt (What is 2+2?) without a visible reasoning pass. The model scopes its work to what was asked.

How to judge: Try a clearly hard prompt: Walk me through how you would design a rate limiter that handles 10k QPS with burst absorption. If a trace appears for this one, the prior prompt simply was not hard enough to warrant thinking.

3. Thinking trace rendered but the accordion is collapsed

The trace shows as a small Thinking indicator (with a timer while it runs) just above the reply. It is easy to miss, especially on mobile. Clicking expands the summarized reasoning.

How to judge: Look above the assistant’s reply for a small collapsed Thinking banner. Click to expand.

4. A tool call replaced the thinking pass

When Claude decides to run a tool (web search, code execution), the visible reasoning often moves inside the tool plan instead of a separate Thinking block.

How to judge: Did the reply run a tool? If yes, that is why no standalone thinking trace appeared.

5. On an older/non-effort model the toggle is a plain `Extended` switch

Models without effort controls expose a simple Extended toggle instead of an Effort submenu. If you switched models, the control you remember may not be where you expect.

How to judge: Click the model name next to the send button. If you see an Extended toggle (not an Effort submenu), make sure it is on.

6. Plan or usage limits throttled your routing

The Free plan now includes Sonnet 4.6 with adaptive thinking, but with limited usage; heavy use can throttle you. A lapsed Pro/Max subscription silently falls back to Free-tier routing and limits.

How to judge: Profile → Settings, confirm the subscription is active, and check whether you have hit a usage cap.

7. Conversation context is very long

In long threads, the model can keep responding while trimming the explicit thinking step to stay efficient.

How to judge: Roughly how long is the thread? Past ~50 turns or six figures of tokens, thinking is the first thing to get squeezed. Start a fresh chat to confirm.

Before you start

Confirm which model is selected for the conversation (Opus 4.7, Opus 4.6, or Sonnet 4.6). Routing affects which thinking control you see.
Decide whether you need thinking for this one prompt or want a higher Effort level by default.
Have one clearly hard test prompt ready to verify thinking is actually firing.

Information to collect

Account plan (Free, Pro, Max, Team, Enterprise).
Model selected at the top of the conversation.
Current Effort level (or whether the Extended toggle shows on).
Conversation length (rough token or message count).
One prompt that triggered thinking and one that did not.
Browser, device, and any extensions that might intercept UI state.

Step-by-step fix

Step 1: Raise the Effort level

Click the model name next to the send button. Hover Effort and select High or Max. On Opus 4.7 you can also pick Xhigh (it sits between High and Max, for long, exploratory tasks). Settings apply starting with Claude’s next response, so resend your prompt.

Step 2: Verify the model supports thinking

Still in the model menu, confirm you are on Opus 4.7, Opus 4.6, or Sonnet 4.6. If a model exposes a plain Extended toggle instead of an Effort submenu, switch that toggle on. If you are on a model with neither, switch to one of the three above.

Step 3: Send a deliberately hard prompt

Test with: Explain the trade-offs between optimistic locking and pessimistic locking for a 10-tenant SaaS, with examples. At High or Max effort, a clearly hard prompt should show a visible Thinking indicator within a few seconds of starting.

Step 4: Expand the thinking accordion

Look above the reply for a small Thinking banner. Click to expand. If the trace is in there, thinking was on the whole time and you only missed the collapsed UI.

Step 5: Start a fresh conversation

If a long thread has squeezed out the thinking step, start a new conversation, set Effort to High or Max, and re-paste the prompt. New conversations get the cleanest routing.

Step 6: Check plan and usage status

Profile → Settings. Confirm your subscription is active and you have not hit a usage cap. Lapsed Pro/Max accounts silently fall back to Free-tier routing and limits, which throttles thinking depth.

Step 7: Report a persistent failure

If thinking never triggers across fresh conversations, hard prompts at High/Max effort, and a confirmed active plan, file a ticket at support.claude.com with conversation IDs, the model name, the Effort level, and screenshots of the toggle state.

How to confirm it is fixed

A Thinking indicator (with a timer while it runs) appears above the reply on hard prompts.
Expanding it shows multi-step reasoning, not a one-liner.
Reply latency is noticeably higher (roughly 5-30 seconds) than a non-thinking reply.
Repeating the prompt in a new conversation at the same Effort level also shows thinking.

Long-term prevention

Set a higher default Effort (High or Max; Xhigh on Opus 4.7) if you routinely want thinking, since adaptive thinking only reliably fires at High and above.
Phrase hard prompts so they obviously reward reasoning (walk me through, compare and contrast, design and justify).
For Projects that should always reason, add a custom instruction: For any prompt requiring multi-step reasoning or trade-off analysis, think carefully before responding.
Periodically re-check the Effort level on long-running conversations; it can drift when you switch models.
For automated workflows via the API, set effort and thinking explicitly per request (see FAQ).

Common pitfalls

Leaving Effort on Low/Medium and concluding thinking is broken when adaptive thinking simply skipped an easy prompt.
Trusting the toggle without verifying with a genuinely hard test prompt.
Missing the collapsed Thinking accordion, especially on mobile.
Forgetting that a tool call can replace the visible thinking step.
Assuming a fixed token budget still exists; on Opus 4.7 and Sonnet 4.6 it is replaced by Effort.

FAQ

How do I tell if thinking actually ran? Look for the Thinking indicator above the reply (it shows a timer while running). A reply that took several seconds and reads like reasoned analysis is another signal.
Why does turning thinking up not always produce a trace? Adaptive thinking is per-turn. At High/Max it almost always thinks, but a trivially easy prompt can still be answered directly. Give it a hard prompt to confirm.
What is the difference between Effort and the old Extended Thinking toggle? Opus 4.7, Opus 4.6, and Sonnet 4.6 replaced the fixed thinking-token budget with an Effort level (Low/Medium/High/Max, plus Xhigh on Opus 4.7). High is the default and equals not setting it.
Does thinking cost more? Yes. Higher Effort spends more tokens (thinking and response), so it counts against your usage limits on Free, Pro, and Max, and against billing on the API.
How do I force thinking on via the API? On Opus 4.7/4.6 and Sonnet 4.6, set thinking: {type: "adaptive"} and output_config: {effort: "high"} (or "max"). On Opus 4.7, the old thinking: {type: "enabled", budget_tokens: N} is no longer supported; use adaptive thinking with effort instead.
Does thinking work with tools? Yes, but the visible trace may be replaced by the tool call. The reasoning still happens; it just moves into the tool plan.

Tags: #Claude #Troubleshooting #thinking