Full Repository Audit Prompts: 15 Templates for Whole-Project Review

Whole-repo audit prompts for Claude Code / Codex — architecture smells, dead code, security risks, dependency drift, test coverage gaps, in one structured pass.

A whole-repo audit isn’t “tell me what’s wrong” — that prompt yields generic advice. A good audit prompt names the dimensions (architecture, security, test coverage, deps, perf, docs), forces evidence (file:line), and constrains output (markdown table or numbered list with severity). These 15 templates cover the audit angles your repo actually needs.

Who this is for

Tech leads doing onboarding audits, founders preparing for due diligence, indie devs before launch, senior engineers inheriting a codebase.

When not to use these prompts

Don’t use these for tiny scripts (< 500 LOC); the overhead outweighs the payoff. Also skip them for closed-source repos when you can’t share the code the audit needs with the tool you are using.

Prompt anatomy / structure formula

A whole-repo audit prompt should always carry six elements:

  • Role: who the AI plays (senior reviewer / SRE / staff engineer).
  • Context: repo / framework / runtime versions / files in scope.
  • Goal: one concrete deliverable — review notes, diff, plan, checklist.
  • Constraints: what the AI must not do (don’t touch X, don’t silently rename, don’t auto-format).
  • Output format: numbered findings, markdown table, JSON, or unified diff.
  • Examples / signal: 1-2 examples of “good” output, or what bad output looks like.
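
As a rough illustration of the six elements above, the sketch below assembles them into one prompt string. The `AuditPrompt` class, its field names, and the sample values are hypothetical, chosen only for this example; the point is that each element gets its own slot so none is forgotten.

```python
from dataclasses import dataclass, field


@dataclass
class AuditPrompt:
    """Hypothetical container for the six prompt elements."""
    role: str
    context: str
    goal: str
    constraints: list[str]
    output_format: str
    examples: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the six elements into one prompt string."""
        parts = [
            f"You are {self.role}.",
            f"Context: {self.context}",
            f"Goal: {self.goal}",
            "Constraints:",
            *[f"- Do not {c}" for c in self.constraints],
            f"Output format: {self.output_format}",
        ]
        if self.examples:
            parts.append("Example of a good finding:")
            parts.extend(f"- {e}" for e in self.examples)
        return "\n".join(parts)


# Sample values are made up for illustration.
prompt = AuditPrompt(
    role="a staff engineer reviewing this repository",
    context="Node 20 service, files under src/ only",
    goal="a numbered list of architecture findings",
    constraints=["propose rewrites", "rename files", "auto-format code"],
    output_format="numbered findings, each with file:line and severity",
    examples=["src/api/users.ts:42 | High | handler queries the DB directly"],
)
print(prompt.render())
```

Keeping the elements as separate fields also makes it easy to swap only the goal and output format between audit dimensions while reusing the same role and constraints.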

Best for

  • Pre-launch hardening sweep
  • Onboarding audit when inheriting a codebase
  • Quarterly tech-debt review
  • Due-diligence preparation
  • Pre-refactor baseline assessment

15 copy-ready prompt templates

1. Whole-repo health snapshot

Best as the first pass when you’ve just opened a strange repo.

You are a staff engineer doing a 30-minute audit of this repository. Produce a 1-page report with these sections: (1) Stack & framework summary in 3 sentences, (2) Three architecture smells you can spot with evidence (file:line), (3) Three security risks (auth / data / secret handling), (4) Test coverage signal (yes/no per top-level dir), (5) Top 5 follow-ups ranked by impact / effort. Do not propose rewrites — only diagnose.

Variables to swap: none when the agent reads the repo itself (Claude Code does this automatically); in a chat tool, paste the relevant files.

Optimization: If the model wants to dive too deep, add: “Skip implementations. We are mapping the territory, not refactoring.”

2. Architecture-only audit

Audit this repo for ARCHITECTURE only. Ignore style / naming. Report: (1) Top 3 layering violations with file:line, (2) Modules with > 5 incoming deps (god-objects), (3) Any "data flow surprise" where state mutates across module boundaries, (4) One paragraph: "If I had to redraw the boxes-and-arrows, here is the cleaner version."

3. Dependency drift audit

Read package.json (and lockfile if present). Report: (1) Direct deps that are > 2 major versions behind latest, (2) Any deprecated / abandoned packages, (3) Duplicate logical deps (e.g., axios + fetch wrapper + got), (4) Native bindings or post-install scripts that warrant attention, (5) Upgrade roadmap: which to bump now / next sprint / never.

Optimization: Pair with: “Mark any of these that have known CVEs against the version we pin.”
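
To cross-check item (1) of this template by hand, a minimal sketch like the one below compares pinned majors in package.json against the latest majors. The `latest_majors` mapping is an assumption: in practice it would come from your registry or a dependency tool, and the package names here are invented. The version parsing is deliberately naive (first integer in the range) and will not handle git URLs or aliases.

```python
import json
import re
from typing import Optional


def major_of(version_range: str) -> Optional[int]:
    """Return the first integer in a range like '^4.17.1', or None if there is none."""
    match = re.search(r"\d+", version_range)
    return int(match.group()) if match else None


def majors_behind(package_json: str, latest_majors: dict, threshold: int = 2) -> list:
    """List (name, pinned_major, latest_major) for deps more than `threshold` majors behind."""
    manifest = json.loads(package_json)
    findings = []
    for name, version_range in manifest.get("dependencies", {}).items():
        pinned = major_of(version_range)
        latest = latest_majors.get(name)
        if pinned is not None and latest is not None and latest - pinned > threshold:
            findings.append((name, pinned, latest))
    return findings


# Hypothetical manifest and registry data, for illustration only.
manifest_text = '{"dependencies": {"lib-a": "^1.4.0", "lib-b": "~3.2.1", "lib-c": "latest"}}'
latest = {"lib-a": 4, "lib-b": 5, "lib-c": 9}
drift = majors_behind(manifest_text, latest)
print(drift)
```

Only `lib-a` is reported here: it is three majors behind, `lib-b` is exactly two (not more than two), and `lib-c` has no numeric version to compare.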

4. Dead-code & orphan audit

Find dead and orphaned code: (1) exported functions / components that are never imported, (2) routes / pages that are unreachable from the main router, (3) env vars referenced in code but never set, (4) feature flags that have been "on" for > 6 months. Return a table: kind | path | evidence | safe to delete? (yes/no/maybe).
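
Item (3) is one of the easier findings to verify yourself. The sketch below assumes a Node-style codebase where env vars are read as `process.env.NAME`; the `unset_env_vars` helper, the file path, and the variable names are hypothetical. It returns path, line number, and variable name, which maps onto the path and evidence columns of the table the prompt asks for.

```python
import re

# Assumes Node-style access; other runtimes need a different pattern.
ENV_REF = re.compile(r"process\.env\.([A-Z][A-Z0-9_]*)")


def unset_env_vars(sources: dict, defined: set) -> list:
    """Return (path, line_no, name) for env vars referenced in code but not defined."""
    findings = []
    for path, text in sorted(sources.items()):
        for line_no, line in enumerate(text.splitlines(), start=1):
            for name in ENV_REF.findall(line):
                if name not in defined:
                    findings.append((path, line_no, name))
    return findings


# Hypothetical file contents and defined variables.
sources = {
    "src/config.ts": (
        "const db = process.env.DATABASE_URL;\n"
        "const key = process.env.STRIPE_KEY;\n"
    ),
}
defined = {"DATABASE_URL"}
missing = unset_env_vars(sources, defined)
print(missing)
```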

5. Test coverage qualitative audit

Don't run coverage tools. Instead, do a qualitative test audit: (1) Which critical paths have ZERO tests? Name them, (2) Which existing tests are tautological (testing mocks of mocks)? File:line, (3) Where is the test pyramid inverted (too many e2e, too few unit)? (4) Suggest 5 highest-ROI tests to write next.

6. Security risk audit

Audit only for SECURITY: (1) Unvalidated user input reaching DB / shell / template / fetch, (2) Secrets in code or in .env.example, (3) AuthN/AuthZ gaps — any route lacking auth middleware? Any role check missing? (4) Logging that leaks PII / tokens, (5) CORS / CSRF posture. For each finding: file:line, severity (Critical / High / Med), one-line fix sketch.

7. Performance hot-spot audit

Audit for PERFORMANCE without running benchmarks. Find: (1) N+1 patterns in DB calls (file:line), (2) Synchronous I/O in hot paths, (3) Missing caches where the same fetch repeats across requests, (4) Bundle bloat suspects (large deps imported eagerly), (5) Re-render storms (React only): components missing memo / unstable deps in useEffect / context churn.

8. Documentation audit

Audit project docs: (1) Does README explain "what + why" or just commands? (2) Is the run-locally path actually current? (3) Are public functions / exported types missing TSDoc / docstrings? List 10 worst offenders, (4) Are env vars documented? (5) Suggest 5 doc sections that would make onboarding 50% faster.

9. Type-safety audit (TS / Python typing)

Audit for type-safety: (1) Count `any` / `as unknown as` / `// @ts-ignore` (or Python `# type: ignore`), (2) List API boundary types that are typed as `any`, (3) Functions with > 4 args missing a typed args-object, (4) Type definitions duplicated across the repo. Return file:line evidence.
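
The counts in item (1) can be sanity-checked with a plain text scan. The following is a minimal sketch, not a type checker: the patterns are assumptions and intentionally narrow (for example, `: any` annotations only, so `as any` or `Array<any>` would be missed), and the file name and sample source are invented.

```python
import re
from collections import Counter

# Narrow, illustrative patterns for common type-safety escape hatches.
ESCAPES = {
    "any": re.compile(r":\s*any\b"),
    "double-cast": re.compile(r"\bas unknown as\b"),
    "ts-ignore": re.compile(r"//\s*@ts-ignore"),
    "type-ignore": re.compile(r"#\s*type:\s*ignore"),
}


def find_escapes(path: str, text: str) -> list:
    """Return ('path:line', kind) for every escape hatch found in `text`."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in ESCAPES.items():
            if pattern.search(line):
                hits.append((f"{path}:{line_no}", kind))
    return hits


# Hypothetical TypeScript source.
sample = (
    "function parse(input: any) {\n"
    "  // @ts-ignore\n"
    "  return input as unknown as User;\n"
    "}\n"
)
hits = find_escapes("src/parse.ts", sample)
counts = Counter(kind for _, kind in hits)
print(hits)
print(dict(counts))
```

If the model's reported counts differ widely from a scan like this, that is a signal to ask it for the file:line list behind each number.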

10. Error-handling audit

Audit ERROR HANDLING only: (1) try/catch blocks that swallow errors silently, (2) `catch (e) {}` empty handlers, (3) Promise chains missing `.catch`, (4) API routes that 500 instead of returning typed errors, (5) Background jobs lacking retry / dead-letter. Each finding: file:line + one-line fix sketch.
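
Item (2) is simple enough to confirm mechanically. The sketch below uses a regular expression to find `catch` blocks with an empty body in JavaScript or TypeScript text; the pattern, helper name, and sample file are assumptions, and a regex will miss handlers that contain only a comment.

```python
import re

# Matches `catch (e) {}` and `catch {}` with only whitespace inside the braces.
EMPTY_CATCH = re.compile(r"catch\s*(?:\([^)]*\))?\s*\{\s*\}")


def empty_catches(path: str, text: str) -> list:
    """Return 'path:line' for each empty catch handler in `text`."""
    found = []
    for match in EMPTY_CATCH.finditer(text):
        line_no = text.count("\n", 0, match.start()) + 1
        found.append(f"{path}:{line_no}")
    return found


# Hypothetical source: the first handler is empty, the second logs the error.
sample = (
    "try {\n"
    "  save();\n"
    "} catch (e) {}\n"
    "try {\n"
    "  load();\n"
    "} catch (err) {\n"
    "  log(err);\n"
    "}\n"
)
silent = empty_catches("src/jobs.ts", sample)
print(silent)
```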

11. Database & schema audit

Audit DB code: (1) Tables without explicit indexes on FKs, (2) Migrations that drop / rename columns without backfill, (3) ORM `.findAll()` without limits, (4) Transactions missing for multi-row writes, (5) Soft-delete columns referenced unevenly. Return findings with file:line and severity.

12. Logging & observability audit

Audit OBSERVABILITY: (1) Any service-critical path with zero logs? Name it, (2) Logs that include PII or secrets, (3) Inconsistent log shape (some JSON, some `console.log`), (4) Metrics / counters missing on auth-fail / payment-fail / external-API calls, (5) Trace propagation gaps. Suggest the 5 highest-ROI log/metric additions.

13. Build & tooling audit

Audit BUILD / TOOLING: (1) Steps that take > 60s and can be cached, (2) Lint config inconsistencies between root and packages, (3) CI jobs that never fail (warning-only that should be error), (4) Pre-commit hooks that are skipped via `--no-verify` in scripts, (5) Node / Python / Go version mismatch between local / CI / Docker.

14. Cross-language repo audit

This repo mixes {languages}. Audit cross-language boundaries: (1) Where do TS / Python (or whichever pair) types diverge? List schemas, (2) Are message contracts versioned? (3) JSON key casing inconsistencies (camelCase ↔ snake_case)? (4) Build-order dependencies between sub-packages.

Variables to swap: languages — e.g., “Next.js + Python FastAPI + Go workers”
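
For item (3), a sample payload captured at the boundary can be checked directly. The sketch below walks a parsed JSON document and groups keys by casing style; the helper names, the regular expressions, and the payload are illustrative assumptions. Single-word keys such as `id` are treated as neutral.

```python
import json
import re

CAMEL = re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]*)+$")
SNAKE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)+$")


def casing_of(key: str) -> str:
    """Classify a key as camelCase, snake_case, or neutral (single word or other)."""
    if CAMEL.match(key):
        return "camelCase"
    if SNAKE.match(key):
        return "snake_case"
    return "neutral"


def key_styles(payload, path: str = "$") -> dict:
    """Map each casing style to the JSON paths of the keys that use it."""
    styles = {"camelCase": [], "snake_case": []}

    def walk(node, where: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                style = casing_of(key)
                if style in styles:
                    styles[style].append(f"{where}.{key}")
                walk(value, f"{where}.{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{where}[{index}]")

    walk(payload, path)
    return styles


# Hypothetical payload mixing both styles.
payload = json.loads(
    '{"userId": 1, "created_at": "2024-01-01", "items": [{"unit_price": 5}]}'
)
styles = key_styles(payload)
print(styles)
```

A payload is inconsistent when both lists are non-empty, as in this example.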

15. Repo audit → action ticket list

Run last; converts findings into tickets.

Take all audit findings above and turn them into a prioritized ticket list. For each: (1) Title, (2) One-paragraph description, (3) Acceptance criteria (3 bullets), (4) Estimated effort (S / M / L), (5) Risk if not done. Group by: Now (this sprint) / Next (next quarter) / Later. Output as markdown table.
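
If you later want to post-process the ticket list, it helps to know the shape you are asking for. This is a minimal sketch of that shape: the `Ticket` class and sample tickets are hypothetical, and the description and acceptance-criteria fields are omitted to keep the table short.

```python
from dataclasses import dataclass

BUCKETS = ("Now", "Next", "Later")


@dataclass
class Ticket:
    """Hypothetical, trimmed-down ticket record."""
    title: str
    effort: str  # "S", "M", or "L"
    risk: str    # risk if not done
    bucket: str  # one of BUCKETS


def to_markdown(tickets: list) -> str:
    """Render tickets as a markdown table, grouped in Now / Next / Later order."""
    lines = [
        "| Bucket | Title | Effort | Risk if not done |",
        "| --- | --- | --- | --- |",
    ]
    for bucket in BUCKETS:
        for ticket in tickets:
            if ticket.bucket == bucket:
                lines.append(
                    f"| {bucket} | {ticket.title} | {ticket.effort} | {ticket.risk} |"
                )
    return "\n".join(lines)


# Invented tickets for illustration.
tickets = [
    Ticket("Add index on orders.user_id", "S", "slow order lookups", "Next"),
    Ticket("Require auth on /admin routes", "M", "unauthenticated admin access", "Now"),
]
table = to_markdown(tickets)
print(table)
```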

Common mistakes

  • Asking “what’s wrong with this repo?” without naming dimensions — output is generic.
  • No output format constraint — you get prose, not a triage list.
  • Letting AI propose rewrites in the same pass as the audit — you lose the diagnostic clarity.
  • No severity scale — every finding looks equally urgent.
  • Forgetting to ask for file:line evidence — you can’t verify the claims.
  • Doing all dimensions in one prompt — output dilutes; better to run dimension-by-dimension.
  • Re-running the audit after each fix — instead, fix in batches, then re-audit.

How to push results further

  • Run audits in separate threads by dimension. Don’t merge architecture + security + perf into one prompt — context dilutes findings.
  • Always demand file:line evidence. If the model can’t provide it, treat the finding as unverified and likely hallucinated; even when it does, spot-check that the cited line exists.
  • Add a severity enum (Critical / High / Med / Low) in your prompt — forces ranking.
  • For large repos, ask the AI to first list the directories it considers risky, then audit those. Cuts noise.
  • Let Claude Code read files itself with its Read tool rather than pasting in file dumps; the agent picks files as needed.
  • Save the audit as a markdown file in /docs/audits/ and date it — next audit can diff against it.
  • Pair an architecture audit with a non-coder explainer prompt: “Summarize for a PM in 8 sentences.” Spot-checks clarity.
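
The file:line rule above can be checked mechanically before you spend time triaging. The sketch below extracts `path:line` citations from an audit report and confirms that each file exists and has at least that many lines. The regular expression, the `check_citations` helper, and the sample report are assumptions for illustration; a passing check shows only that the location exists, not that the finding is correct.

```python
import re
import tempfile
from pathlib import Path

# Matches citations such as src/auth.ts:42 (a path with an extension, then a line number).
CITATION = re.compile(r"([\w./-]+\.\w+):(\d+)")


def check_citations(report: str, repo_root: Path) -> list:
    """Return (path, line_no, status) for each file:line citation in `report`."""
    results = []
    for rel_path, line_str in CITATION.findall(report):
        line_no = int(line_str)
        target = repo_root / rel_path
        if not target.is_file():
            status = "missing file"
        else:
            text = target.read_text(encoding="utf-8", errors="replace")
            total = len(text.splitlines())
            status = "ok" if 1 <= line_no <= total else "line out of range"
        results.append((rel_path, line_no, status))
    return results


# Demo against a throwaway directory standing in for a repo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "auth.ts").write_text("line one\nline two\n", encoding="utf-8")
    report = (
        "1. src/auth.ts:2 High: token logged. "
        "2. src/auth.ts:40 Med. "
        "3. src/gone.ts:1 Low."
    )
    results = check_citations(report, root)

print(results)
```

Any citation that comes back as "missing file" or "line out of range" is a good candidate to send back to the model with a request for corrected evidence.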

FAQ

  • Should I run this on every PR?: No — too noisy. Run quarterly, before launches, on inheritance, or before major refactors.
  • Will AI miss things?: Yes. Treat the audit as an 80% pass and pair it with a dependency CVE scanner, lint rules, and at least one human review pass.
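  • Can I use these on closed-source repos?: Only through a private deployment your organization has approved for proprietary code (for example, Claude Code routed through Amazon Bedrock or Google Vertex AI in your own cloud account). Never paste closed-source code into a public chat.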
  • How long should an audit take?: For a 50K-LOC repo: 30 min to run, 1-2 hours to triage findings. Budget less if you keep dimensions separate.
  • What if findings conflict with my architecture decisions?: Flag them as out-of-scope rather than ignoring — note the rationale so next audit doesn’t re-flag.
  • How do I keep audits actionable?: Run template 15 (action-ticket conversion) — without it, audits become read-only documents.
