“Why is the app slow?” is the perf question that gets the worst answers. A good perf-regression prompt names the metric (p50 / p99 / TTFB / LCP), the diff window, and forbids speculation — only file:line evidence and benchmarks.
Who this is for
On-call engineers debugging a perf alert, leads chasing a slow PR, indie devs trying to pass Core Web Vitals before launch.
When not to use these prompts
Don’t use these without a baseline metric: “slow” without numbers wastes everyone’s time. Don’t use them on numbers measured only in a dev environment; measure production.
Prompt anatomy / structure formula
Every perf-regression prompt should carry six elements:
- Role: who the AI plays (SRE / release captain / staff engineer / QA lead).
- Context: stack / branch / failing logs / diff / dashboard URL.
- Goal: one concrete deliverable — root cause, checklist, plan, ticket list, runbook.
- Constraints: what the AI MUST NOT do (don’t auto-fix, don’t hallucinate file paths).
- Output format: numbered findings, markdown table, JSON, unified diff, runnable code.
- Examples / signal: 1-2 “good” output examples, or counter-examples.
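One way to keep all six elements present is to hold them in a small structure that refuses to render when one is blank. This is a minimal sketch in Python; `PerfPrompt`, its field names, and `render` are hypothetical and exist only to mirror the list above.

```python
from dataclasses import dataclass, fields


@dataclass
class PerfPrompt:
    """Hypothetical container mirroring the six elements listed above."""
    role: str
    context: str
    goal: str
    constraints: str
    output_format: str
    examples: str

    def render(self) -> str:
        # Fail early if any element is blank, so a vague prompt never goes out.
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"missing prompt elements: {', '.join(missing)}")
        # One labelled line per element, in the order of the list above.
        return "\n".join(
            f"{f.name.replace('_', ' ').title()}: {getattr(self, f.name)}"
            for f in fields(self)
        )
```

Calling `render()` on a prompt with an empty `constraints` field raises a `ValueError` instead of producing a prompt that lets the model speculate.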
Best for
- Diff-window p99 regression triage
- Bundle-size regression on a PR
- N+1 hunting in a slow endpoint
- React render-storm investigation
- DB query plan change detection
12 copy-ready prompt templates
1. p99 diff triage
p99 latency on `{endpoint}` jumped {fromMs}ms → {toMs}ms between `{oldSha}` and `{newSha}`. List 5 likely causes in priority order. For each: (a) suspicion strength, (b) one file:line or query to inspect, (c) one cheap check. Don't propose fixes yet.
Variables to swap: endpoint, fromMs, toMs, oldSha, newSha
2. PR perf risk scan
Scan this PR diff for perf risks: (1) New synchronous I/O in hot path, (2) Loops calling DB / fetch inside, (3) New large dep imported eagerly, (4) React re-render expansion (new context, unstable deps), (5) Missing index for new query. file:line + severity.
3. N+1 hunter
In the function `{functionName}` at `{filePath}`, identify N+1 patterns: (a) Loops calling DB / fetch, (b) Promise.all over single-item fetches, (c) Recursive accessors hitting ORM lazy fields. For each: rewrite as a single batched call, with code.
Variables to swap: functionName, filePath
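To make pattern (a) and the requested batched rewrite concrete, here is a small Python sketch. `FakeDb` is a hypothetical in-memory stand-in that only counts round trips; a real ORM or HTTP client would differ in API, but the shape of the fix is the same.

```python
class FakeDb:
    """Hypothetical in-memory stand-in for a database; counts round trips."""

    def __init__(self, authors):
        self.authors = authors
        self.queries = 0

    def get_author(self, author_id):
        self.queries += 1  # one round trip per call
        return self.authors[author_id]

    def get_authors(self, author_ids):
        self.queries += 1  # one round trip for the whole batch
        return {i: self.authors[i] for i in author_ids}


def names_n_plus_one(db, posts):
    # N+1 shape: one lookup per post inside the loop.
    return [db.get_author(p["author_id"])["name"] for p in posts]


def names_batched(db, posts):
    # Batched shape: collect the ids, fetch once, then join in memory.
    ids = {p["author_id"] for p in posts}
    authors = db.get_authors(ids)
    return [authors[p["author_id"]]["name"] for p in posts]
```

Both functions return the same names; the difference shows up in `db.queries`, which grows with the number of posts in the first version and stays at one in the second.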
4. Bundle size regression
Bundle grew from {oldKb} → {newKb} KB. Identify the top 3 contributors: (1) New direct deps and their size, (2) Tree-shake failures (default imports from a library that ships ESM), (3) Polyfill bloat (target browser change?). Output: a fix per item.
Variables to swap: oldKb, newKb
5. React render-storm diagnosis
Component `{component}` re-renders {nRenders} times per interaction. Diagnose: (1) Unstable prop identity (objects / arrays created in render), (2) Context provider value not memoized, (3) Parent state too coarse, (4) useEffect dep that changes each render. Output: cause + minimal fix.
Variables to swap: component, nRenders
6. DB query plan regression
Query plan for `{query}` changed: was index scan + nested loop, now sequential scan + hash join. Diagnose: (1) Statistics stale (ANALYZE run recently?), (2) Cardinality estimate off, (3) New column / index hint mismatch, (4) Parameter sniffing. Output: most likely cause + an ANALYZE command or pg_stat_user_indexes query to confirm.
Variables to swap: query
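If both plans were captured with PostgreSQL's `EXPLAIN (FORMAT JSON)`, the change can be summarized before it goes into the prompt. The sketch below assumes that output shape (a one-element list holding a `Plan` object with `Node Type` and optional child `Plans`); the function names are hypothetical.

```python
def node_types(explain_json):
    """Collect node types, depth-first, from EXPLAIN (FORMAT JSON) output."""
    found = []

    def walk(node):
        found.append(node["Node Type"])
        for child in node.get("Plans", []):
            walk(child)

    # The JSON output is a list with one entry whose "Plan" key is the root node.
    walk(explain_json[0]["Plan"])
    return found


def plan_changes(before, after):
    """Return (node types that disappeared, node types that appeared)."""
    old, new = set(node_types(before)), set(node_types(after))
    return sorted(old - new), sorted(new - old)
```

Pasting the two lists into the prompt gives the model a concrete before/after to reason about. It does not replace reading the full plans, since row estimates and timings are ignored here.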
7. Cold start regression
Serverless function `{fnName}` cold start went {fromMs} → {toMs} ms. Diagnose: (1) Bundle size grew, (2) New top-level imports, (3) New connection at boot, (4) New env var fetch. Output: top 3 by likelihood + a 5-min experiment.
Variables to swap: fnName, fromMs, toMs
8. TTFB / LCP regression
LCP on `{pagePath}` went {fromMs} → {toMs} ms. Walk the waterfall: (1) Server response time, (2) Critical CSS / JS blocking, (3) Image / font payload, (4) Layout shift forcing re-render. Pick the dominant cause.
Variables to swap: pagePath, fromMs, toMs
9. Memory growth regression
Service RSS grew from {oldMb} → {newMb} MB. Diagnose: (1) New cache without eviction, (2) Closures retaining large objects, (3) Listener leaks (no removeListener on unmount / restart), (4) Buffer pools sized too large. file:line.
Variables to swap: oldMb, newMb
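Cause (1), a cache without eviction, is usually the cheapest to confirm and fix. As an illustration only, here is a bounded least-recently-used cache in Python; `BoundedCache` is a hypothetical name, and the right size limit depends on the service.

```python
from collections import OrderedDict


class BoundedCache:
    """Small LRU cache: evicts the least recently used entry once full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the least recently used entry

    def __len__(self):
        return len(self._data)
```

An unbounded dict used as a cache grows with every distinct key it sees; with a cap, memory use is limited by `max_entries` times the entry size.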
10. Slow-test regression
Test suite went from {fromMin} → {toMin} min. Identify: (1) Specific test files that grew, (2) Setup / teardown bloat, (3) Real timer / sleep introduced, (4) Parallelism reduction. Output: 3 specific cleanups.
Variables to swap: fromMin, toMin
11. Perf-fix benchmark plan
Before fixing, design a benchmark: (1) Minimal reproducible scenario, (2) Metric (median + p99), (3) Sample size, (4) Baseline run command. After fix, re-run same benchmark. Don't fix without baseline numbers.
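A baseline run can be as small as the sketch below, which times a scenario and reports the median and p99 together with the sample size. It is a minimal Python illustration, not a replacement for a load-testing tool; `run_benchmark`, `percentile`, and the default sample and warm-up counts are assumptions.

```python
import math
import statistics
import time


def percentile(samples, pct):
    """Nearest-rank percentile; pct is in the range 0-100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def run_benchmark(scenario, sample_size=200, warmup=10):
    """Time `scenario` repeatedly and summarize the timings in milliseconds."""
    for _ in range(warmup):
        scenario()  # warm caches so the first calls do not skew the tail
    timings_ms = []
    for _ in range(sample_size):
        start = time.perf_counter()
        scenario()
        timings_ms.append((time.perf_counter() - start) * 1000)
    return {
        "n": sample_size,
        "median_ms": statistics.median(timings_ms),
        "p99_ms": percentile(timings_ms, 99),
    }
```

Run it once before the fix and keep the numbers, then run the identical call after the fix so the two results are comparable.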
12. “Slow but acceptable” decision
A regression is real but small ({deltaMs}ms). Decide: (1) Is the absolute number above target? (2) Is the user impact measurable (conversion / bounce)? (3) Is the fix more expensive than the regression? Output: SHIP / FIX / REVERT + one-line rationale.
Variables to swap: deltaMs
Common mistakes
- Optimising without baseline numbers.
- Confusing p50 and p99 — they have different fixes.
- Trusting dev-only profiles — prod hot paths differ.
- Adding caches before fixing the actual N+1.
- Bundle splitting without measuring what was preloaded vs lazy.
- Memoising everything in React — adds overhead.
- Investigating before reading the deploy diff — perf regressions are usually code, not infra.
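The p50 / p99 confusion is easy to see with numbers. In the Python sketch below the latency samples are made up: after the change, 2% of requests become slow, which leaves p50 untouched and moves only p99.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile; pct is in the range 0-100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    return {"p50": percentile(latencies_ms, 50), "p99": percentile(latencies_ms, 99)}


# Made-up samples: every request takes 20 ms before the change;
# afterwards 2 requests in 100 take 900 ms.
before = [20] * 100
after = [20] * 98 + [900] * 2

print(summarize(before))  # {'p50': 20, 'p99': 20}
print(summarize(after))   # {'p50': 20, 'p99': 900}
```

A dashboard that shows only the median would report no change here, which is why the prompt should name the metric explicitly.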
How to push results further
- Always anchor to a metric + sample size + diff window.
- p99 fixes are different from p50 fixes — separate them.
- For React, capture the Profiler trace, don’t guess from logs.
- For DB, get the query plan before and after with EXPLAIN ANALYZE.
- Run benchmarks 3 times before declaring victory; variance hides regressions.
- Cache as a last resort, not first.
- Document the regression and fix in the post-mortem so the next person doesn’t re-introduce it.
Practical depth notes
Use these prompts as starting points, not final answers. The useful extra work is to replace every placeholder with a real value: the metric, the endpoint or page, the commit range, the sample size, one example of good output, and one pattern to avoid. Run at least two versions with different constraints, then compare the outputs side by side instead of accepting the first polished response.
A good result should pass three checks: it is specific enough that another person could reuse it, it avoids vague praise or filler, and it gives you an editable artifact rather than a broad suggestion. If the output feels generic, add one concrete reference, one forbidden pattern, and one measurable success criterion before rerunning the prompt.
FAQ
- How big a regression matters?: Anything that pushes p99 above your SLO target. Below SLO, evaluate against fix cost.
- Should I optimise before launch?: Hit your Core Web Vitals targets, then ship. Premature optimisation past targets wastes time.
- Is React.memo always safe?: No. With unstable props (objects / arrays / callbacks recreated on every render), memo never skips a render and only adds the cost of the prop comparison.
- How do I find DB index gaps?: Use pg_stat_user_indexes for unused indexes and pg_stat_user_tables for seq-scan-heavy tables.
- Can AI read flamegraphs?: It can interpret text traces and profiler JSON. Visual flamegraphs need a vision model.
- When to invest in a perf budget CI gate?: Once a regression has reached prod twice. Before that, manual checks are fine.