Agent Rollback Was Incomplete: Roll Back by Domain, Not by git reset

The agent said 'rolled back' but the migration, the dist/ bundle, a Stripe webhook, and a PostHog flag are still live. Here's the six-domain checklist and the exact inverse commands that finish the job.

Published: May 17, 2026 Updated: Jun 17, 2026 Author: AI Productivity Guide Team

You asked the agent to undo the last hour’s work. It ran `git reset --hard HEAD~3` and reported “rolled back.” The code is back. But the database still has an extra column from the migration that ran 20 minutes ago, the `dist/` folder still holds the bundle built from the now-reverted code (which Vercel already deployed), your Stripe webhook URL was changed in the dashboard and now points at a path that no longer exists, and the `BILLING_V2_ENABLED` feature flag is still on in PostHog.

TL;DR (fastest correct rollback): Don’t trust a single git command. Run `git revert HEAD~3..HEAD --no-edit` (note: `HEAD~3..HEAD` reverts the last 3 commits; `HEAD~2..HEAD` would revert only 2; see Step 2), then walk the remaining domains of the six-domain checklist below in this order: database migration, built artifacts + deploy, third-party config, env vars / flags, caches. Verify each domain instead of assuming the revert covered it.

“Rollback” only undoes what’s tracked in git. Everything else (generated artifacts, database state, third-party configuration, environment variables, feature flags, caches) has its own lifecycle. A complete rollback is a multi-domain checklist, not one git command. Below are the six domains agents typically leave dirty, the exact inverse commands, and the prompts that catch them.

Which bucket are you in?

If the symptom is “code is reverted but the app still behaves like the new version,” the difference is in one of these layers. Diagnose before you act.

Symptom after rollback	Likely dirty domain	Fastest confirm
App reads/writes a column the old code doesn’t know about; queries error	Database migration still applied	`prisma migrate status` shows the migration as applied
Browser shows new behavior; hard-refresh or `curl` shows old	Stale build artifacts or CDN/edge cache	`ls -la dist/` files newer than the rolled-back commit
External integration (Stripe/OAuth/DNS) calls a 404 path	Third-party config not reverted	Webhook/redirect URL in the vendor dashboard ≠ rolled-back code
Old code crashes on startup or ignores a setting	Env var / feature flag still set	`vercel env ls` shows a var absent from rolled-back `.env.example`
Some users see new data, others old, with identical code	CDN / Redis / browser cache	`curl` (cold) vs normal browser differ

Common causes

Ordered by hit rate, highest first.

1. The agent only reverted git-tracked code

`git reset --hard` reverts the working tree and nothing else. The agent confidently says “rolled back” because its concept of state is the repo, not the system.

How to spot it: After the rollback, the code on disk matches the target commit, but the running app behaves like the new code. Some state outside the repo no longer matches the code.

2. A database migration was applied and never reversed

The new feature added a column. The agent ran the migration. Now the code is rolled back but the column still exists, and the old code may break trying to read/write a schema it doesn’t know about.

How to spot it: `prisma migrate status` (or your project’s own wrapper script, such as `pnpm db:status`) shows the migration as applied. Or the model definitions in the old code don’t match the live DB schema.

3. Built artifacts in `dist/`, `.next/`, `.svelte-kit/` weren’t rebuilt

Code is back to the old version, but the dev server is serving cached compiled output. Browser reload shows the new behavior. Worse, your deploy pipeline pushed dist/ to a CDN that’s still serving it.

How to spot it: Code says X, browser shows Y. ls -la dist/ shows files newer than the last code commit.
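The timestamp comparison above can be scripted. What follows is a minimal sketch, not part of any tool: `artifacts_newer_than` is a hypothetical helper name, it relies on the `-newermt` extension found in GNU and BSD `find`, and it assumes you pass in the commit time of the commit you rolled back to.

```shell
# List files under a build directory that were modified after a timestamp.
# Usage: artifacts_newer_than <dir> <timestamp>
# Hypothetical helper; -newermt is a GNU/BSD find extension, not POSIX.
artifacts_newer_than() {
  # A missing build directory simply means there is nothing stale to report.
  [ -d "$1" ] || return 0
  find "$1" -type f -newermt "$2" | sort
}

# Typical call from the repo root, where <target-commit> is the commit
# you rolled back to:
#   artifacts_newer_than dist "$(git log -1 --format=%ci <target-commit>)"
```

Any path it prints was built after the target commit, so it may contain the new behavior. Run it before rebuilding: once you rebuild cleanly, every file is newer than the target commit and the check no longer tells you anything.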

4. Third-party configuration was changed and not reverted

The new feature required updating webhook URLs in Stripe, redirect URIs in the Google OAuth console, and DNS records in Cloudflare. These live entirely outside your repo, so the agent had no way to see them.

How to spot it: External integrations break post-rollback because they’re calling URLs that no longer exist in the rolled-back code.

5. Environment variables / feature flags still set

`NEW_FEATURE_ENABLED=true` is still in your `.env.production` or in PostHog/LaunchDarkly. Best case, the new code is gone and nothing reads the flag anymore. Worst case, the old code does read it and crashes on a value it doesn’t expect.

How to spot it: vercel env ls (or your platform’s equivalent) shows variables that don’t appear in the rolled-back .env.example.
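That comparison is easy to get wrong by eye once there are more than a handful of variables. Here is a rough sketch of doing it mechanically. The helper name `extra_env_vars` is hypothetical, and it assumes you have already saved the deployed variable names to a file, one per line. It deliberately does not parse `vercel env ls` output, whose exact format is not assumed here.

```shell
# Print variable names that are deployed but absent from .env.example.
# Usage: extra_env_vars <env_example_file> <deployed_names_file>
# Hypothetical helper; <deployed_names_file> holds one NAME per line.
extra_env_vars() {
  example_names=$(mktemp)
  # Keep only NAME from NAME=value lines; comments and blanks do not match.
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | sort -u > "$example_names"
  # comm -13 prints lines that appear only in the second input.
  grep -v '^$' "$2" | sort -u | comm -13 "$example_names" -
  rm -f "$example_names"
}

# Example (file names are placeholders):
#   extra_env_vars .env.example deployed-names.txt
```

Every name it prints is a candidate for removal, but confirm each one by hand: some platform-managed variables legitimately never appear in `.env.example`.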

6. Cache (CDN, Redis, browser) still has new responses

Cloudflare cached the new API responses for 1 hour. Redis cached the new computed values for 24 hours. Even with code rolled back, users get the new data until the cache expires.

How to spot it: A hard refresh, a different browser, or `curl` shows the old behavior, but your normal browser still shows the new one. The cache layer is the difference.

Shortest path to fix

Ordered by ROI. Step 1 is the framework; steps 2-6 are domain-specific recoveries.

Step 1: Enumerate every domain the original change touched

Before any rollback, ask the agent (or yourself) for a “blast radius” inventory:

For the last 3 commits, list every domain affected:
1. Git-tracked code: which files
2. Built artifacts: which dist/build dirs
3. Database: which migrations applied, what schema changes
4. Environment variables: which added/changed in which env
5. Feature flags: which created/flipped in which tool
6. Third-party config: which Stripe/OAuth/DNS/etc. changes
7. Caches: which CDN/Redis/browser caches may hold new state

For each, list the original action and the exact inverse command that undoes it.

You’ll see immediately which domains need attention beyond git revert.
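Part of that inventory can be seeded from git itself. The sketch below labels each changed path with the domain it probably implies. The function name `classify_paths` and every path pattern in it are assumptions about a typical project layout, so adjust them to your repo.

```shell
# Read changed paths on stdin and label each with the domain it implies.
# Hypothetical helper; the path patterns are assumptions, not a standard.
classify_paths() {
  while IFS= read -r path; do
    case "$path" in
      prisma/migrations/*|migrations/*|db/migrate/*) domain="database" ;;
      dist/*|.next/*|.svelte-kit/*|build/*)          domain="artifacts" ;;
      .env*)                                         domain="env/flags" ;;
      *.tf|infra/*)                                  domain="third-party-config" ;;
      *)                                             domain="code" ;;
    esac
    printf '%s\t%s\n' "$domain" "$path"
  done
}

# Example: classify everything the last 3 commits touched.
#   git diff --name-only HEAD~3..HEAD | classify_paths | sort
```

This only covers domains that leave a footprint in the repo. Dashboard edits, flag flips, and cache contents leave none, so they still need the manual inventory.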

Step 2: Revert the code with `git revert`, not `git reset --hard`

git reset --hard is destructive: anyone who already pulled or branched loses work, and you have to force-push. Use git revert, which records new “undo” commits and pushes cleanly:

# Revert the last 3 commits, newest first.
# Range gotcha: A..B EXCLUDES A. So HEAD~3..HEAD reverts HEAD, HEAD~1, HEAD~2
# (three commits). HEAD~2..HEAD would revert only two. Count carefully.
git revert HEAD~3..HEAD --no-edit
git push origin HEAD

If any of those commits was a merge, add -m 1 to pick the mainline parent. To stage all reverts as one commit instead of three, use git revert --no-commit HEAD~3..HEAD then a single git commit. Then deploy the reverted code so production matches (Step 4).
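Because the range is easy to get off by one, it can help to build it from the commit count and preview it before reverting. This is a small sketch under that assumption; `last_n_range` is a hypothetical helper, not a git feature.

```shell
# Print the range that covers exactly the last N commits: HEAD~N..HEAD.
# Hypothetical helper; rejects anything that is not a positive integer.
last_n_range() {
  case "$1" in
    ''|*[!0-9]*|0)
      echo "usage: last_n_range <positive integer>" >&2
      return 1
      ;;
  esac
  printf 'HEAD~%s..HEAD\n' "$1"
}

# Preview before you revert; the count printed should equal N:
#   git log --oneline "$(last_n_range 3)"
#   git rev-list --count "$(last_n_range 3)"
```

If the preview lists a commit you did not mean to undo, or misses one you did, fix the count before running `git revert`.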

Step 3: Roll back the database migration

If the migration was non-destructive (only added columns/tables), the old code usually still works against the wider schema, so leave it alone and just stop writing to the new column. If it was destructive (renamed/dropped columns, changed types), you need an explicit down migration.
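A quick first pass at that decision is to scan the migration SQL for statements an additive-only migration would not contain. The sketch below is a heuristic, not a SQL parser: `migration_kind` is a hypothetical name, the patterns assume PostgreSQL-style DDL, and a match inside a comment will produce a false positive.

```shell
# Print "destructive" if the SQL file contains DROP, RENAME, or a column
# type change; otherwise print "additive".
# Hypothetical heuristic; always read the SQL yourself before trusting it.
migration_kind() {
  pattern='drop[[:space:]]+(table|column)|rename[[:space:]]+(column|to)|alter[[:space:]]+column[^;]*type'
  if grep -Eiq "$pattern" "$1"; then
    echo destructive
  else
    echo additive
  fi
}

# Example (the path is a placeholder):
#   migration_kind prisma/migrations/<migration_name>/migration.sql
```

Treat "additive" as a hint only. A new `NOT NULL` column without a default is additive by this test and can still break the old code’s inserts.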

Prisma has no prisma migrate down command. As of June 2026 the supported way to reverse a migration is to diff the current schema against migration history, run the generated SQL, then record the rollback so history stays consistent:

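# 1. Generate the reverse SQL: diff FROM the schema that includes the new
#    changes TO the migration history without the migration being undone.
#    Run it on a checkout where schema.prisma still has the new changes and
#    prisma/migrations does not contain that migration, or the diff is empty.
#    --to-migrations replays history, so it needs a shadow database.
npx prisma migrate diff \
  --from-schema-datamodel prisma/schema.prisma \
  --to-migrations prisma/migrations \
  --shadow-database-url "$SHADOW_DATABASE_URL" \
  --script > down.sql

# 2. Review down.sql, then apply it
npx prisma db execute --file ./down.sql --schema prisma/schema.prisma

# 3. Mark the migration rolled back so `migrate status` is clean again.
#    Prisma documents --rolled-back for migrations in a failed state, so
#    re-run `prisma migrate status` afterwards to confirm it took effect.
npx prisma migrate resolve --rolled-back <migration_name>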

For other stacks: Drizzle has no automatic down either (drizzle-kit only generates forward migrations), so apply a hand-written undo SQL file. Frameworks with built-in reversibility can step back directly: Rails (`rails db:rollback`) reverts the last migration, while Knex (`knex migrate:rollback`) and Laravel (`php artisan migrate:rollback`) revert the last batch.

Important: a down migration reverts schema, not data. Rows written by the new code while it was live are not undone. If the up migration backfilled or transformed data, restoring from a backup taken before the migration is often safer than a hand-rolled down.

Step 4: Rebuild and redeploy fresh artifacts

# Local
rm -rf dist .next .svelte-kit node_modules/.cache
pnpm install
pnpm build

# Push — `--force` (alias -f) skips Vercel's build cache so the
# revert can't be served from a stale cached build.
vercel deploy --prod --force
# or in CI, set the env var: VERCEL_FORCE_NO_BUILD_CACHE=1
# or re-promote a known-good prior deployment: vercel rollback

The build cache is the subtle trap here: a normal vercel deploy can reuse cached compiled output and quietly redeploy the new behavior even though the source reverted. --force (or VERCEL_FORCE_NO_BUILD_CACHE=1) is what guarantees a clean rebuild. If you just want production back on the last good release without rebuilding at all, vercel rollback re-promotes the previous deployment instantly.

Then purge the CDN so edge caches don’t keep serving the new responses (see Step 6).

Step 5: Revert third-party config and env vars

Walk each external system. Have a checklist (or build one):

- Stripe webhook URL: restore to https://app.example.com/api/webhooks/billing (the feature changed it to /api/v2/webhook)
- Google OAuth redirect: https://app.example.com/auth/callback
- DNS records modified: api.example.com CNAME
- PostHog flag BILLING_V2_ENABLED: turn off
- Vercel env vars added: BILLING_V2_API_KEY (remove)
- Cron schedules: hourly billing-sweep (deactivate)

These manual reverts are the ones agents miss most. If your team uses Terraform/Pulumi for any of this, git revert + terraform apply handles it; otherwise it’s clicking through dashboards.

Step 6: Force cache expiry where it matters

There is no wrangler cache purge command (as of June 2026 Wrangler has no CDN-purge subcommand). Purge Cloudflare’s edge cache via the REST API instead. Prefer a targeted purge over purge_everything so you don’t cold-start the whole zone:

# Targeted: purge only the affected URLs (best — minimal cold cache).
# Purge-by-prefix is capped at 30 prefixes per request.
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone_id>/purge_cache" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"prefixes":["app.example.com/api/v2/"]}'

# Blunt fallback: purge the entire zone (every asset goes cold).
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone_id>/purge_cache" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"purge_everything":true}'

# Redis: delete the specific keys the new code wrote (avoid FLUSHALL in prod)
redis-cli --scan --pattern "billing:v2:*" | xargs -r redis-cli del

# Browser: tell users to hard-refresh, or bump the asset hash/version in HTML

For user-facing caches with a short TTL (seconds to minutes), letting it expire naturally is usually fine. Force-invalidate only when the TTL is long or the stale data is harmful (wrong price, broken integration).
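If you have more prefixes than fit in one request, the targeted purge has to be split into batches. As a sketch of that batching, the helper below turns a list of prefixes into one JSON body per request, using the 30-per-request cap mentioned above. `purge_payloads` is a hypothetical name, and the code assumes prefixes contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Read cache prefixes on stdin (one per line) and print one JSON body per
# batch of at most 30, each on its own line.
# Hypothetical helper; does no JSON escaping, so keep prefixes plain.
purge_payloads() {
  awk -v max=30 '
    NF == 0 { next }                      # skip blank lines
    {
      if (count == 0) printf "{\"prefixes\":["
      else printf ","
      printf "\"%s\"", $0
      count++
      if (count == max) { print "]}"; count = 0 }
    }
    END { if (count > 0) print "]}" }     # flush the last partial batch
  '
}

# Example: one purge request per batch (prefixes.txt is a placeholder).
#   purge_payloads < prefixes.txt | while IFS= read -r body; do
#     curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone_id>/purge_cache" \
#       -H "Authorization: Bearer $CF_API_TOKEN" \
#       -H "Content-Type: application/json" \
#       --data "$body"
#   done
```

Printing the bodies first, without the `curl` loop, is a cheap way to review exactly what will be purged.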

How to confirm it’s fixed

Don’t declare victory on “the code looks right.” Re-check each domain that could still be dirty:

- Code: `git log --oneline -5` shows the revert commits at the top; `git status` is clean.
- Database: `prisma migrate status` (or your tool’s status) reports no pending or out-of-sync migration, and a quick query against the affected table returns the old schema.
- Build/deploy: a cold request that bypasses cache matches the old behavior: `curl -sI https://app.example.com/<changed-path>` (or add `-H "Cache-Control: no-cache"`); confirm in `vercel ls` or the dashboard that the live deployment is the one you just deployed or re-promoted, not the feature deployment.
- Third-party: open each vendor dashboard and confirm the webhook URL, OAuth redirect, and DNS record match the rolled-back code, not the feature you removed.
- Flags/env: `vercel env ls` and your flag tool (PostHog, LaunchDarkly) show the new flag off and the new env var removed.
- Cache: the same request in a clean browser profile (or incognito) now returns the old behavior, matching the cold `curl`.

If all six agree, the rollback is genuinely complete. If one still shows the new behavior, that’s your remaining dirty domain.
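The six checks can be wrapped in a small runner so that no domain is skipped. This is a minimal sketch: `check_domain` is a hypothetical helper, and the commands in the usage comments are placeholders for whatever check fits your stack. It treats exit status 0 as clean and anything else as dirty.

```shell
# Names of domains whose check failed, space separated.
dirty=""

# Run a check command for one domain and record whether it passed.
# Usage: check_domain <name> <command> [args...]
check_domain() {
  name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'clean  %s\n' "$name"
  else
    printf 'DIRTY  %s\n' "$name"
    dirty="$dirty $name"
  fi
}

# Example wiring (each command is a placeholder for your own check):
#   check_domain code      sh -c 'test -z "$(git status --porcelain)"'
#   check_domain database  npx prisma migrate status
#   check_domain cache     ./scripts/compare-cold-and-warm.sh
#   [ -z "$dirty" ] && echo "rollback complete" || echo "still dirty:$dirty"
```

The third-party domain usually has no command to run, so keep it on the list as a manual step instead of letting the script imply it passed.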

FAQ

Q: Why did the agent say “rolled back” when it clearly wasn’t?

Because to the agent, “state” means the git working tree. `git reset --hard` made the tree match the target commit, so from its point of view the job is done. It has no model of your database, CDN, vendor dashboards, or env store, so it can’t even see those domains, let alone revert them.

Q: `git reset --hard` vs `git revert`: which should I use?

Use `git revert` for anything already pushed or shared: it adds new commits and pushes cleanly, so nobody loses work. Reserve `git reset --hard` for local, un-pushed commits only. If the agent already ran `git reset --hard` and force-pushed, recover the lost commits from `git reflog` before doing anything else.

Q: My migration only added a column. Do I even need to roll it back?

Usually no. Additive (non-destructive) migrations leave the old code working against a wider schema. Leave the column in place and stop writing to it; reverse it later in a clean migration. Roll back immediately only if the new column has a NOT NULL constraint without a default, or a unique index, that the old code’s inserts now violate.

Q: Production still serves the new behavior after I reverted and redeployed. Why?

Almost always build cache or CDN cache. Redeploy with `vercel deploy --prod --force` (or `VERCEL_FORCE_NO_BUILD_CACHE=1`) to defeat the build cache, then purge the edge with the Cloudflare `purge_cache` API. Verify with a cold `curl`, not your normal browser, which has its own cache.

Q: How do I stop this happening again?

Before a risky change, have the agent produce a “blast radius” inventory (Step 1) listing every domain it will touch and the inverse command for each. Manage third-party config as code (Terraform/Pulumi) so reverting it is a git operation, and gate risky launches behind a feature flag you can flip off in seconds.

Prevention

- Before any risky multi-domain change, take a “rollback inventory”: a snapshot of every domain that may be touched.
- Make migrations idempotent and reversible: every up has a matching down that you’ve tested.
- Use feature flags for risky launches so you can disable them in seconds without rolling back code.
- Manage third-party config in code (Terraform, Pulumi) so a revert is also a git operation.
- Document a rollback runbook per feature so agents and humans follow the same steps.
- Don’t accept “rolled back” after a `git reset --hard` as a complete answer; always re-verify each domain.

Tags: #Troubleshooting #Claude Code #Debug #Rollback
