How to Review AI Diffs Efficiently

A 200-line diff isn't safe just because it compiles. Read it the way a senior engineer would.

What this covers

A 200-line AI diff that compiles, passes tests, and looks plausible is the single highest-risk artifact in a modern codebase - because the surface signals all say “merge me.” This guide shows the reading order a senior engineer actually uses for AI patches, what to look at first, and the specific failure modes (silent renames, lost branches, fake test fixes) that catch reviewers off-guard.

Who this is for

Anyone reviewing AI-generated PRs - whether the person who prompted the agent was a teammate or you, an hour ago. Especially useful if your team has just turned on agentic coding and the PR queue suddenly looks 3x longer.

When to reach for it

Before merging any PR that an AI agent touched. Also before approving your own AI-assisted PRs - “self-review” is when most embarrassing AI changes ship, because the human assumes they remember what they asked for.

Before you start

  • Pull the branch locally; don’t review on the GitHub web UI alone. You need to be able to run git log, git blame, and the tests.
  • Confirm the PR has a description. “I asked Claude to fix the bug” is not a description. Push back if needed.
  • Have the failing test or repro the AI claims to fix in front of you. If there isn’t one, the first thing to ask is what the AI was actually solving.
  • Disable diff-folding in your editor. AI agents love to slip subtle deletions into collapsed regions.

Step by step

  1. Read deletions first. Use git diff --stat for a per-file overview, or git diff --numstat for separate added and deleted counts, and start with the file with the most red. Deletions are how AI agents quietly drop branches, error handling, or whole feature flags they didn’t understand.
  2. Read renames second. git diff -M reports the renames Git can detect by content similarity, and git log --follow <path> traces a file’s history across them. AI agents sometimes “rename” by deleting and recreating, which can defeat rename detection, break blame history, and hide a behavior change inside a “rename” commit.
  3. Read new logic last. New code is the most fun to read and the easiest to nod through. Save it for when you’re warmed up.
  4. Run the test the AI claims to have fixed. Check out main first and confirm the test fails there, then check out the branch and confirm it passes. AI agents sometimes “fix” a test by editing the assertion, so read the test file’s diff as well.
  5. Run git log <branch> --not main to see every commit on the branch. Look for commits like “fix typo” that touch business logic - those are panic-fix commits worth a second look.
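Step 1 can be sketched as a small shell helper. This is a minimal sketch, not a standard tool: rank_by_deletions is a hypothetical name, and the usage line assumes the base branch is origin/main. It reads git diff --numstat output (tab-separated added count, deleted count, path) on stdin and lists files by deleted lines, so the file with the most red comes first.

```shell
# Rank changed files by number of deleted lines, most deletions first.
# Input on stdin: `git diff --numstat` output, one line per file:
#   <added> TAB <deleted> TAB <path>
# Binary files show "-" instead of counts and are skipped here.
rank_by_deletions() {
  awk -F '\t' '$2 ~ /^[0-9]+$/ { print $2 "\t" $3 }' | sort -k1,1 -n -r
}

# Usage (assumes the base branch is origin/main):
#   git diff --numstat -M origin/main...HEAD | rank_by_deletions | head
```

With -M, a detected rename appears as a single entry with an "old => new" path, which makes a delete-and-recreate "rename" stand out as two separate files instead.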

What AI gets wrong that humans usually don’t

  • Silent removal of if (err) return blocks because “the happy path was cleaner.”
  • Loop conditions changed by one - < becomes <=, off-by-one shipped as a “clarity” commit.
  • Time-zone handling reverted to system default because the agent didn’t see the TZ test.
  • Auth checks moved one layer down where they look right but no longer cover the public route.
  • Network calls mocked out in tests that were deliberately written to hit real endpoints.
  • “Refactored” config files where ordering matters (Express middleware, Webpack loaders).

In short: deletions -> renames -> new code -> run the claimed-fixed test on main and on the branch -> read git log for surprise commits -> approve or push back. The whole thing should take 5-10 minutes for a 200-line patch. If it takes longer, the PR is too big - ask for it to be split.
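The "fails on main, passes on the branch" check can be scripted. The sketch below is one possible shape, under stated assumptions: fails_on_main_passes_on_branch is a hypothetical helper name, the two directories are checkouts you create yourself (for example with git worktree), and npm test in the usage line stands in for whatever test command your project uses.

```shell
# Run the same test command in two checkouts and confirm that it fails in
# the first (main) and passes in the second (the PR branch).
# Usage: fails_on_main_passes_on_branch <main_dir> <branch_dir> <command...>
fails_on_main_passes_on_branch() {
  main_dir=$1
  branch_dir=$2
  shift 2

  if [ ! -d "$main_dir" ] || [ ! -d "$branch_dir" ]; then
    echo "ERROR: both checkout directories must exist"
    return 2
  fi

  # Subshells keep the caller's working directory unchanged.
  if (cd "$main_dir" && "$@") >/dev/null 2>&1; then
    echo "SUSPICIOUS: test already passes on main"
    return 1
  fi
  if ! (cd "$branch_dir" && "$@") >/dev/null 2>&1; then
    echo "FAIL: test still fails on the branch"
    return 1
  fi
  echo "OK: fails on main, passes on the branch"
}

# Usage (paths and test command are placeholders):
#   git worktree add --detach ../review-main origin/main
#   fails_on_main_passes_on_branch ../review-main . npm test
```

If the test file exists only on the branch, copy it into the main checkout first; otherwise "fails on main" only tells you the test is missing, not that the bug was real.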

A 60-second sanity script

Drop this in your shell to skim AI patches fast:

git fetch origin
git diff --numstat origin/main...HEAD | sort -k2 -n -r | head   # files with the most deleted lines first
git log origin/main..HEAD --oneline                             # every commit on the branch
git diff origin/main...HEAD -- '*.test.*' '*spec*'              # see test changes only
git diff origin/main...HEAD | grep -E '^-' | grep -v '^--- ' | head -40   # see deletions first, minus file headers

If the output of the last command shows a deleted try, catch, if (err), or return - stop and read the surrounding context.
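To make that check less dependent on eyeballing, the deleted lines can be filtered for those keywords. This is a rough sketch: risky_deletions is a hypothetical name, and the keyword list is only the one mentioned above, so treat a clean result as "nothing obvious", not "nothing dropped".

```shell
# Print deleted lines from a unified diff (stdin) that look like error
# handling or early exits: try, catch, return, or "if (err".
# Heuristic only. File headers ("--- a/path") are dropped, which also
# hides any deleted line whose own text starts with "-- ".
risky_deletions() {
  grep -E '^-' \
    | grep -vE '^--- ' \
    | grep -E '(^|[^[:alnum:]_])(try|catch|return)([^[:alnum:]_]|$)|if[[:space:]]*\(err'
}

# Usage (assumes the base branch is origin/main):
#   git diff origin/main...HEAD | risky_deletions
```

The word-boundary pattern keeps identifiers such as retry or returnValue from matching, at the cost of a longer regular expression.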

FAQ

  • Should I trust green CI? - Treat it as necessary, not sufficient. CI catches regressions you wrote tests for; AI failure modes often live outside that.
  • How long should review take? - Roughly 1 minute per 30 lines of meaningful diff, so about 5-10 minutes for a 200-line patch; if a PR needs much more than that, ask for it to be split.
  • What if the AI wrote the tests too? - Read the test diff with extra suspicion. Confirm at least one test asserts a value the AI couldn’t have inferred from the prompt.
  • Should I ask the AI to review its own diff? - Useful as a second opinion, never as the only opinion. It’ll often miss the very thing it just did wrong.
  • Is it fine to approve with comments? - For human PRs yes; for AI PRs, send it back. The agent will gladly address the comments in one round.
  • What about giant generated files (lockfiles, schemas)? - Collapse them, but spot-check that the version numbers that actually changed match what the agent claimed.
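For the question about AI-written tests, one quick way to see whether assertions themselves were edited is to pull only the changed assertion lines out of the test diff. A minimal sketch, assuming assertions contain expect( or assert (adjust the pattern for your test framework); changed_assertions is a hypothetical name.

```shell
# From a unified diff on stdin, print added or removed lines that contain
# an assertion. "expect(" and "assert" are assumed markers, not a complete
# list for every test framework.
changed_assertions() {
  grep -E '^[-+]' \
    | grep -vE '^(---|\+\+\+) ' \
    | grep -E 'expect\(|assert'
}

# Usage (assumes the base branch is origin/main):
#   git diff origin/main...HEAD -- '*.test.*' '*spec*' | changed_assertions
```

A removed line followed by an added line for the same call, differing only in the expected value, is the pattern to question: the test may have been made to pass by changing what it expects.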

Common mistakes

  • Reading new code first - you get hooked on intent and miss the deletions.
  • Treating “compiles” as “correct” - many bugs typecheck fine.
  • Skipping the failing-test repro - you lose the only ground truth you had.
  • Reviewing on the web UI only - you can’t run anything, so you can’t disprove anything.
  • Merging an AI PR with no description - future-you has zero context when it breaks.
  • Approving in one round when you have nits - the agent rewrites infinitely for free; use it.
