Claude Code Stops Mid-Task Due to Token / Usage Limits

Halfway through a refactor, Claude Code says you hit a limit. Identify which limit (5-hour / weekly / per-model), save state, pick the right resumption path.

You’re in the middle of a multi-file refactor. Claude Code stops generating output and says “You’ve reached your usage limit.” Your work is half-applied — three files changed, two more still needed. The session is dead. Tomorrow morning you’ll come back to the same project with no agent memory of where you were.

Claude Code limits are layered: a 5-hour rolling window per model, a weekly cap across models, and a Sonnet-specific weekly cap. Hitting any of them stops a task; rollover isn’t instant. The fix isn’t “wait it out” — it’s identifying which limit you hit, capturing state for resume, and structuring future work so a limit hit doesn’t cost you the whole session.

Common causes

Ordered by hit rate, highest first.

1. 5-hour rolling window exhausted on the current model

Each model has its own 5-hour bucket. Opus has its own window; Sonnet has its own. Burning through Opus’s window doesn’t affect Sonnet’s, and vice versa.

How to spot it: The error message names the model and gives a reset time in hours (e.g., “resets in 3h 24m”). That’s the 5-hour window.

2. Weekly cap (across models) exhausted

Pro and Max plans have a weekly token cap across all models. Even if your 5-hour Opus window is fresh, if the weekly cap is hit you can’t use any model until the weekly reset.

How to spot it: Error names “weekly” or gives a reset in days. /limits shows weekly bucket near 100%.

3. Sonnet-specific weekly cap exhausted while Opus has runway

Some plans have a separate Sonnet weekly cap that’s smaller. Sonnet stops first; switching to Opus may still work. Surprising if you weren’t aware of the per-model breakdown.

How to spot it: Sonnet errors out but /model opus still functions.

4. Long single session was counted as one big bucket

A 4-hour Claude Code session can burn through most of a weekly cap if tool calls are dense. Many Read / Grep / Bash calls multiply token use beyond what your chat turn count suggests.

How to spot it: Session was long with heavy tool use. Weekly bucket dropped much faster than expected.

5. Multiple parallel sessions sharing quota

You opened two Claude Code instances (one terminal, one IDE), or you’re running an automated agent in the background. All share the same account’s quota — they compete.

How to spot it: Limits hit faster than your active usage explains. Check if there’s a background agent or forgotten session.

6. You’re confusing a 429 / capacity error with a quota limit

“Claude is at capacity, try again later” is a server-side transient throttle, not your quota. Sometimes shown alongside genuine quota errors and gets misread.

How to spot it: Error message says “at capacity” or “try again” without a numeric reset time. That’s a 429 — wait 1-2 minutes and retry, don’t restructure your session.

Shortest path to fix

Ordered by urgency. Step 1 (capture state) is non-negotiable before you take any other action.

Step 1: Stop immediately and capture state to disk

Every retry burns more quota. Save state first:

Before doing anything else:
1. Write a `STATUS.md` with:
   - Task: <one-sentence>
   - Files changed so far (full paths)
   - Files still needed (full paths)
   - Decisions made (use REST not GraphQL, etc.)
   - Next step if resuming
2. `git add -A && git commit -m "wip: <task>"`

A checkpoint commit + STATUS.md is your resume point.

Step 2: Identify which limit you hit

In the chat or via /limits (if your client supports it):

- "resets in Xh" → 5-hour rolling window. Wait or switch model.
- "weekly" / "resets in X days" → weekly cap. Wait or switch model tier.
- "at capacity" → transient 429. Wait 1-2 minutes and retry.
- Model-specific (Sonnet/Opus) → only that model's bucket. Try the other.

Different limits = different recovery paths. Don’t blindly wait.

Step 3: Switch model tier for recovery, if appropriate

If only one model’s window is out:

/model haiku
# or
/model sonnet

Haiku is cheaper per token and often has a fatter quota tier. Use it for:

  • Continuing low-risk parts of the task (file moves, simple edits)
  • Reading and summarizing code
  • Status report generation

Reserve the more capable model for the next non-trivial reasoning step when its window resets.

Step 4: Resume from STATUS.md, not from memory

When ready to resume (new session), the prompt is:

Read STATUS.md. It has the full state of the prior session.
Resume from "next step." Do not redo completed work.
Verify against `git log` what's already committed.

The fresh session has full context for the snapshot — no fading-memory risks.

Step 5: Plan future tasks in smaller checkpoints

The bug isn’t really hitting the limit — it’s that one big task wasn’t checkpointed:

Break tasks into units that fit in ~30 minutes of agent time:
- Unit 1: implement the new schema migration
- Unit 2: write the migration tests
- Unit 3: update the consumer code
- Unit 4: update the tests

After each unit:

  1. Commit
  2. Write a 1-line update to STATUS.md
  3. Pause or proceed

A limit hit between units costs you nothing.

Step 6: Avoid parallel sessions on the same plan

Two Claude Code instances on the same account = quota competition. Either:

  • One instance, sequential work
  • Or different accounts (team plan), each with their own quota

If you’re using parallel sub-agents internally (Task tool), they share the parent’s quota — those are fine, but watch the aggregate.

Prevention

  • Plan tasks in ~30-minute checkpoints; commit + STATUS.md after each
  • Distinguish 5-hour vs weekly vs 429 limits before reacting — different paths
  • Use cheaper models (Haiku) for low-risk steps; reserve Opus for hard reasoning
  • Never run parallel Claude Code sessions on the same plan and account
  • Watch /limits proactively; restart at a clean boundary before hitting the cap
  • Don’t plan critical deadlines around assumed quota — Anthropic’s caps can shift; check the official limits doc

Tags: #Troubleshooting #Claude Code #Debug #Usage limit