You asked the agent to undo the last hour’s work. It ran git reset --hard HEAD~3 and reported “rolled back.” The code is back. But: the database has an extra column from the migration that ran 20 minutes ago, the dist/ folder still holds the bundle built from the now-reverted code, which Vercel already deployed, your Stripe webhook URL was updated in the dashboard and now points at a path that no longer exists, and the BILLING_V2_ENABLED feature flag is still on in PostHog.
“Rollback” only undoes what’s in git. Everything else — generated artifacts, database state, third-party configuration, environment variables, feature flags, caches — has its own lifecycle. A complete rollback is a multi-domain checklist, not a git command. Below: the six domains agents typically leave dirty, plus the prompts that catch them.
Common causes
Ordered by hit rate, highest first.
1. The agent only reverted git-tracked code
git reset --hard reverts code. Doesn’t revert anything else. Agent confidently says “rolled back” because its concept of state is the repo, not the system.
How to spot it: After the rollback, the code on disk matches the target commit, but the running app behaves like the new code. Some output or state outside the repo no longer matches the code.
2. A database migration was applied and never reversed
The new feature added a column. The agent ran the migration. Now the code is rolled back but the column still exists, and the old code may break trying to read/write a schema it doesn’t know about.
How to spot it: pnpm db:status or prisma migrate status shows the migration as applied. Or your model definitions in the old code don’t match the live DB schema.
3. Built artifacts in dist/, .next/, .svelte-kit/ weren’t rebuilt
Code is back to the old version, but the dev server is serving cached compiled output. Browser reload shows the new behavior. Worse, your deploy pipeline pushed dist/ to a CDN that’s still serving it.
How to spot it: Code says X, browser shows Y. ls -la dist/ shows files newer than the last code commit.
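The timestamp comparison can be scripted. This is a minimal sketch, assuming a POSIX shell and a find that supports -newermt (GNU find does; check other platforms). The function name stale_artifacts is hypothetical.

```shell
# List files under a build directory that were modified after a timestamp.
# Usage: stale_artifacts <dir> <timestamp>
stale_artifacts() {
  # -newermt compares each file's mtime against a parsed date string.
  find "$1" -type f -newermt "$2" | sort
}

# Typical call from inside the repository (commit date of the current HEAD):
#   stale_artifacts dist "$(git log -1 --format=%ci)"
```

After a git reset --hard, any file this prints was built from code that is no longer in the tree. After a git revert the logic flips: the revert commit is the newest thing in the repo, so artifacts older than it are the suspect ones.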
4. Third-party configuration was changed and not reverted
The new feature required updating webhook URLs in Stripe, redirect URIs in Google OAuth console, DNS records in Cloudflare. These live outside your repo entirely. Agent had no concept of them.
How to spot it: External integrations break post-rollback because they’re calling URLs that no longer exist in the rolled-back code.
5. Environment variables / feature flags still set
NEW_FEATURE_ENABLED=true is still in your .env.production or PostHog/LaunchDarkly. The new code is gone, so reading the flag has no effect — or the old code reads it and crashes because it doesn’t expect that flag at all.
How to spot it: vercel env ls (or your platform’s equivalent) shows variables that don’t appear in the rolled-back .env.example.
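A rough way to automate that comparison, assuming you can export the live variables to a dotenv-style file (for example with your platform's env export or pull command). The helper names env_keys and extra_env_keys are hypothetical, and only variable names are compared, never values.

```shell
# Print the sorted, unique variable names from a dotenv-style file.
# Comments and blank lines are ignored.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Print names that exist in the live file but not in the example file.
# Usage: extra_env_keys <example-file> <live-file>
extra_env_keys() {
  example_keys=$(mktemp)
  live_keys=$(mktemp)
  env_keys "$1" > "$example_keys"
  env_keys "$2" > "$live_keys"
  # comm -13 keeps only lines unique to the second (live) list.
  comm -13 "$example_keys" "$live_keys"
  rm -f "$example_keys" "$live_keys"
}

# Example:
#   extra_env_keys .env.example .env.live
```

Anything it prints is a candidate for removal, though some variables may be intentionally absent from .env.example.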
6. Cache (CDN, Redis, browser) still has new responses
Cloudflare cached the new API responses for 1 hour. Redis cached the new computed values for 24 hours. Even with code rolled back, users get the new data until the cache expires.
How to spot it: Hard-refresh / different browser / curl shows old behavior, but normal browser still shows new. Cache layer is the difference.
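If Cloudflare sits in front of the app, the response headers usually say whether the edge cache answered. The sketch below assumes the cf-cache-status header is present on proxied responses; cache_status is a hypothetical helper and the URL path in the usage comment is a placeholder.

```shell
# Read HTTP response headers on stdin and print the value of the
# cf-cache-status header (e.g. HIT, MISS), or "none" if it is absent.
cache_status() {
  cache_hdr=$(tr -d '\r' |
    awk -F': *' 'tolower($1) == "cf-cache-status" { print $2; exit }')
  printf '%s\n' "${cache_hdr:-none}"
}

# Example (placeholder path):
#   curl -sI https://app.example.com/api/plans | cache_status
```

A HIT on a response that still shows new-feature data points at the CDN rather than at your origin.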
Shortest path to fix
Ordered by ROI. Step 1 is the framework; steps 2-6 are domain-specific recoveries.
Step 1: Enumerate every domain the original change touched
Before any rollback, ask the agent (or yourself) for a “blast radius” inventory:
For the last 3 commits, list every domain affected:
1. Git-tracked code: which files
2. Built artifacts: which dist/build dirs
3. Database: which migrations applied, what schema changes
4. Environment variables: which added/changed in which env
5. Feature flags: which created/flipped in which tool
6. Third-party config: which Stripe/OAuth/DNS/etc. changes
7. Caches: which CDN/Redis/browser caches may hold new state
For each, list the exact rollback action and its inverse command.
You’ll see immediately which domains need attention beyond git revert.
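For the git-visible part of that inventory, a first pass can be generated from the diff itself. This is only a sketch: classify_paths is a hypothetical helper, and the path patterns (prisma/migrations/, infra/, and so on) are assumptions about project layout that you would adapt.

```shell
# Read file paths on stdin and print "<domain><TAB><path>" for each one.
classify_paths() {
  while IFS= read -r path; do
    case "$path" in
      prisma/migrations/*|migrations/*|drizzle/*) domain="database" ;;
      .env*|*/.env*)                              domain="env" ;;
      dist/*|.next/*|.svelte-kit/*)               domain="artifacts" ;;
      *.tf|terraform/*|infra/*)                   domain="third-party-config" ;;
      *)                                          domain="code" ;;
    esac
    printf '%s\t%s\n' "$domain" "$path"
  done
}

# Example: inventory of the last 3 commits
#   git diff --name-only HEAD~3 HEAD | classify_paths | sort
```

This only sees what git tracks. Dashboard changes, feature flags, and caches never show up in a diff, so those rows of the inventory still have to be filled in by hand.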
Step 2: Revert the code with git revert, not git reset --hard
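git reset --hard rewrites history: it discards uncommitted local changes, and force-pushing the result disrupts anyone who already pulled the branch. Use git revert, which adds new commits that undo the old ones:
# Revert the last 3 commits; git applies the reverts newest-first
git revert --no-edit HEAD~3..HEAD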
git push origin HEAD
Then deploy the reverted code so production matches.
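One way to confirm the revert landed where you intended, sketched under the assumption that you recorded the target commit before reverting. tree_matches is a hypothetical helper name; it relies on git diff --quiet exiting non-zero when the two trees differ.

```shell
# Succeed only when the working branch's HEAD tree is identical
# to the tree at the given commit.
# Usage: tree_matches <commit>
tree_matches() {
  git diff --quiet "$1" HEAD
}

# Example:
#   target=$(git rev-parse HEAD~3)      # record before reverting
#   git revert --no-edit HEAD~3..HEAD
#   tree_matches "$target" && echo "code matches $target"
```

A pass here only proves the repository is back. It says nothing about the other domains covered in the steps that follow.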
Step 3: Roll back the database migration
If the migration was non-destructive (only added columns/tables), the old code may still work — leave the schema alone, just stop using it. If it was destructive (renamed/dropped columns), you need an explicit down migration:
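# Prisma (no built-in down migrations)
# If the migration is recorded as failed, mark it rolled back in the history table.
# This updates Prisma's bookkeeping only; it does not change the schema.
pnpm prisma migrate resolve --rolled-back <migration_name>
# With schema.prisma reverted, create a new migration that undoes the change
pnpm prisma migrate dev --name revert_<original_name>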
# Drizzle / raw SQL
pnpm db:rollback # if your stack has it
# else apply a hand-written undo migration
If you have backups, restoring from a backup taken before the migration is sometimes simpler than a hand-rolled down migration.
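To decide which branch of this step applies, a quick keyword scan of the migration file can help. This is a heuristic sketch, not a SQL parser: is_destructive_migration is a hypothetical name, the keyword list is an assumption, and a match means "read this migration carefully" rather than proof of data loss.

```shell
# Succeed if the SQL file contains statements that typically remove or
# rename schema objects (DROP TABLE/COLUMN/INDEX, RENAME, ALTER COLUMN).
# Usage: is_destructive_migration <file.sql>
is_destructive_migration() {
  grep -Eiq 'DROP[[:space:]]+(TABLE|COLUMN|INDEX)|RENAME[[:space:]]+(TO|COLUMN)|ALTER[[:space:]]+COLUMN' "$1"
}

# Example (path is a placeholder):
#   if is_destructive_migration prisma/migrations/<migration_name>/migration.sql; then
#     echo "needs an explicit reverse migration or a restore"
#   else
#     echo "additive only; old code can likely ignore it"
#   fi
```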
Step 4: Rebuild and redeploy fresh artifacts
# Local
rm -rf dist .next .svelte-kit node_modules/.cache
pnpm install
pnpm build
# Push
vercel deploy --prod
# or trigger your CI redeploy
# Invalidate CDN
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone>/purge_cache" \
-H "Authorization: Bearer $CF_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"purge_everything":true}'
Don’t trust “the deploy will pick up the revert” — force a clean rebuild + cache purge.
Step 5: Revert third-party config and env vars
Walk each external system. Have a checklist (or build one):
- Stripe webhook URL: https://app.example.com/api/webhooks/billing (was: /api/v2/webhook)
- Google OAuth redirect: https://app.example.com/auth/callback
- DNS records modified: api.example.com CNAME
- PostHog flag BILLING_V2_ENABLED: turn off
- Vercel env vars added: BILLING_V2_API_KEY (remove)
- Cron schedules: hourly billing-sweep (deactivate)
These manual reverts are the ones agents miss most. If your team uses Terraform/Pulumi for any of this, git revert + terraform apply handles it; otherwise it’s clicking through dashboards.
Step 6: Force cache expiry where it matters
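# Cloudflare full purge (same purge_cache API call as in Step 4)
curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone-id>/purge_cache" \
-H "Authorization: Bearer $CF_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"purge_everything":true}'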
# Redis specific keys (or full flush in dev)
redis-cli --scan --pattern "billing:v2:*" | xargs redis-cli del
# Browser: tell users to hard-refresh, or bump asset version in HTML
For user-facing caches, often the right move is “let it expire naturally” if TTL is short (minutes). For longer TTLs, force-invalidate.
Prevention
- Before any risky multi-domain change, take a “rollback inventory” — snapshot of every domain that may be touched
- Make migrations idempotent and reversible: every up has a matching down you’ve tested
- Use feature flags for risky launches so you can disable in seconds without rolling back code
- Manage third-party config in code (Terraform, Pulumi) so revert is also a git operation
- Document a “rollback runbook” per feature so agents and humans follow the same steps
- Don’t trust a git reset --hard “rolled back” as a complete answer; always re-verify each domain