You ran codex audit (or pasted “audit my project”) and got back 50+ bullets: “consider using const instead of let”, “API key should be in env var”, “this function is 200 lines long”, “consider adding tests for edge cases”. By the time you finish reading, you have no idea which 3 things to fix this afternoon. The report becomes a guilt artifact you never act on.
This is not a model problem; it’s an audit-prompt problem. A useful audit has one dimension, one scope, capped output, and file-anchored items. Below: how to spot why your audit went broad, and a prompt template that returns a prioritized list of at most 10 actionable items.
Common causes
Ordered by hit rate, highest first.
1. The prompt was “audit the project”
Open-ended audit prompts produce open-ended reports. “Audit” with no dimension forces Codex to guess what you care about — and it hedges by listing everything.
You: audit my project
Codex: [returns 50 bullets across security, style, perf, types, tests, docs, accessibility]
How to spot it: Look at your original prompt. If it doesn’t name one dimension (security OR perf OR types), the report will mix all of them.
2. No scope — Codex audited every directory
“Audit” without a path means Codex walks the whole tree. A monorepo audit returns items from apps/marketing/, packages/ui/, scripts/, docs/ all jumbled together — even though you only care about the API layer.
How to spot it: Group the bullets by file path. If they span 4+ top-level folders, scope was missing.
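The grouping check can be scripted. This is a minimal sketch, assuming each bullet mentions a slash-separated path somewhere in its text; the regex and the function names are illustrative, not part of any Codex tooling.

```python
import re
from collections import Counter

# Matches path-like tokens such as "apps/marketing/hero.tsx" (assumed bullet format).
PATH_RE = re.compile(r"[\w.-]+(?:/[\w.-]+)+")


def top_level_folders(bullets: list[str]) -> Counter:
    """Count bullets per top-level folder of the first path each bullet mentions."""
    counts: Counter = Counter()
    for bullet in bullets:
        match = PATH_RE.search(bullet)
        if match:
            counts[match.group(0).split("/", 1)[0]] += 1
    return counts


def scope_was_missing(bullets: list[str], threshold: int = 4) -> bool:
    """Rule of thumb from the text: 4+ top-level folders suggests no scope was given."""
    return len(top_level_folders(bullets)) >= threshold


bullets = [
    "apps/marketing/hero.tsx: image has no alt text",
    "packages/ui/Button.tsx: prop types are too loose",
    "scripts/deploy.sh: unquoted variable",
    "docs/setup.md: outdated install step",
    "apps/marketing/footer.tsx: unused import",
]
print(top_level_folders(bullets))
print(scope_was_missing(bullets))
```

Bullets with no path at all are ignored here, so a report full of unanchored items can pass this check while still being too broad.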
3. No severity rating — every item looks equal
Without a severity instruction such as “rate each item P0–P3,” Codex tends to return items in discovery order, not impact order. A missing semicolon sits next to a SQL injection risk.
How to spot it: Read the first 5 items. If a critical-looking item appears alongside a cosmetic one with no flag, severity wasn’t requested.
4. No “already fixed” filter on repeat audits
Second-round audits re-report items you fixed in round 1, because Codex doesn’t know what changed. You wade through the same list, marking 40% as “done” by hand.
How to spot it: Diff this audit against the previous one. If 30% or more of the items repeat with identical wording, you didn’t pass the previous list as a “skip” set.
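If both audits are saved as lists of bullets, the overlap is easy to measure. A rough sketch, assuming "identical wording" means equal text after lowercasing and collapsing whitespace; the sample bullets are made up for illustration.

```python
def normalize(item: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting changes don't hide a repeat."""
    return " ".join(item.lower().split())


def repeat_ratio(previous: list[str], current: list[str]) -> float:
    """Fraction of current audit items whose wording also appears in the previous audit."""
    if not current:
        return 0.0
    seen = {normalize(item) for item in previous}
    repeats = sum(1 for item in current if normalize(item) in seen)
    return repeats / len(current)


previous = [
    "login.ts:42 password compared with ==",
    "session.ts:118 JWT secret is never rotated",
    "reset.ts:23 reset token TTL is 24h",
]
current = [
    "login.ts:42  Password compared with ==",   # same wording, different spacing and case
    "reset.ts:23 reset token TTL is 24h",
    "middleware.ts:67 rate limit is per-IP only",
    "logout.ts:12 session cookie not cleared",
]
ratio = repeat_ratio(previous, current)
print(f"{ratio:.0%} of items repeat the previous audit")
if ratio >= 0.30:
    print("Pass the previous findings as a skip set next time.")
```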
5. No file:line anchor
Bullets like “consider improving error handling in the auth flow” can’t be acted on without 20 minutes of file-hunting. The report looks long because it’s spread across many vague paragraphs instead of pointing at concrete lines.
How to spot it: Count items with file.ts:42 style anchors. If under 50% of items have line numbers, you can’t triage without re-reading the codebase.
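Counting anchors by hand gets tedious on a long report. One possible sketch, assuming anchors look like `file.ts:42` or `file.ts:42-60`; the regex is a guess at that format and will miss other styles.

```python
import re

# A file name with an extension, then ":<line>" or ":<start>-<end>", e.g. "login.ts:42".
ANCHOR_RE = re.compile(r"[\w./-]+\.\w+:\d+(?:-\d+)?")


def anchor_coverage(items: list[str]) -> float:
    """Fraction of audit items that contain at least one file:line anchor."""
    if not items:
        return 0.0
    anchored = sum(1 for item in items if ANCHOR_RE.search(item))
    return anchored / len(items)


items = [
    "src/api/auth/login.ts:42 compares passwords with ==",
    "consider improving error handling in the auth flow",
    "the billing module could use more tests",
    "naming is inconsistent across helpers",
]
coverage = anchor_coverage(items)
print(f"{coverage:.0%} of items are anchored")
if coverage < 0.5:
    print("Re-run the audit and require file:line on every item.")
```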
6. Mixed dimensions — style + perf + security in one pass
When you ask Codex to find “everything wrong,” it returns the union. Each dimension uses different judgment heuristics, so the report reads as inconsistent (a missing comment given the same weight as a missing CSRF check).
How to spot it: Tag each bullet with one of security, perf, style, types, tests, docs. If the histogram is flat across 4+ tags, dimensions were mixed.
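A crude keyword tagger is enough for a first look at the histogram. This sketch makes two assumptions worth flagging: the keyword lists are hypothetical and need tuning to your reports, and "flat across 4+ tags" is simplified to "at least 4 dimensions appear at all".

```python
from collections import Counter

# Hypothetical keyword lists; tune them to the wording of your own reports.
KEYWORDS = {
    "security": ("injection", "csrf", "xss", "api key", "secret"),
    "perf": ("latency", "bundle size", "n+1", "slow"),
    "style": ("naming", "formatting", "const instead of let", "semicolon"),
    "types": ("implicit any", "type annotation", "untyped"),
    "tests": ("test", "coverage"),
    "docs": ("readme", "docstring", "comment"),
}


def tag(bullet: str) -> str:
    """Return the first dimension whose keywords appear in the bullet, else 'untagged'."""
    text = bullet.lower()
    for dimension, words in KEYWORDS.items():
        if any(word in text for word in words):
            return dimension
    return "untagged"


def dimensions_were_mixed(bullets: list[str], min_tags: int = 4) -> bool:
    """Simplified check: did the report touch min_tags or more dimensions?"""
    tags = {tag(bullet) for bullet in bullets} - {"untagged"}
    return len(tags) >= min_tags


bullets = [
    "consider using const instead of let",
    "API key should be in env var",
    "this function is 200 lines long",
    "consider adding tests for edge cases",
    "p95 latency regressed on /search",
    "README is missing setup steps",
]
print(Counter(tag(bullet) for bullet in bullets))
print(dimensions_were_mixed(bullets))
```

Substring matching will mislabel some bullets, so treat the counts as a hint about the prompt, not as a classification of the findings.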
Shortest path to fix
Ordered by ROI. Steps 1–3 turn a broad audit into an actionable 10-item list.
Step 1: Pick exactly one dimension
Choose from this list, one at a time:
| Dimension | When to run |
|---|---|
| security | Before a release; after touching auth, payments, or user input |
| perf | When p95 latency or bundle size regresses |
| types | After a TypeScript upgrade or large refactor |
| tests | Quarterly, or after a hotfix that lacked test coverage |
| style | Once per quarter; lowest priority |
| docs | Before onboarding a new contributor |
Run security audits as their own pass — don’t bundle them with style.
Step 2: Pick one narrow scope
A useful scope is one directory or one feature, not “the project.” Examples:
- src/api/auth/ (one feature module)
- src/components/billing/ (one user-facing flow)
- migrations/*.sql (one file class)
If your codebase is large, run audits in slices and merge findings into your tracker, not into one mega-report.
Step 3: Use the constrained audit prompt
Paste this template, filling in the bracketed slots:
Audit only [SCOPE] for [DIMENSION] issues.
Constraints:
- Return at most 10 items, sorted by severity (P0 → P3).
- Each item must include:
  - severity (P0 = ship-blocker, P1 = before release, P2 = next sprint, P3 = nice-to-have)
  - file:line (or file range)
  - one-sentence problem statement
  - one-sentence fix
- Skip cosmetic items (formatting, naming) unless they hide a bug.
- Skip items already addressed in: [PASTE PREVIOUS AUDIT, or "none"].
- Do not propose architectural rewrites.
Output as a markdown table with columns: severity | file:line | problem | fix.
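If you fill the template often, a small helper keeps the slots from drifting. A minimal sketch, assuming the six dimensions from the Step 1 table; `build_audit_prompt` is a hypothetical name, and the function only builds the text, it does not call Codex.

```python
AUDIT_TEMPLATE = """\
Audit only {scope} for {dimension} issues.
Constraints:
- Return at most 10 items, sorted by severity (P0 → P3).
- Each item must include:
  - severity (P0 = ship-blocker, P1 = before release, P2 = next sprint, P3 = nice-to-have)
  - file:line (or file range)
  - one-sentence problem statement
  - one-sentence fix
- Skip cosmetic items (formatting, naming) unless they hide a bug.
- Skip items already addressed in: {previous}.
- Do not propose architectural rewrites.
Output as a markdown table with columns: severity | file:line | problem | fix."""

DIMENSIONS = {"security", "perf", "types", "tests", "style", "docs"}


def build_audit_prompt(scope: str, dimension: str, previous: str = "none") -> str:
    """Fill the template, refusing anything that is not exactly one known dimension."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"pick one of {sorted(DIMENSIONS)}, got {dimension!r}")
    return AUDIT_TEMPLATE.format(scope=scope, dimension=dimension, previous=previous or "none")


prompt = build_audit_prompt("src/api/auth/", "security")
print(prompt)
```

Rejecting unknown dimensions is a deliberate choice: it makes "audit everything" impossible to express through the helper.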
Example output you should expect:
| Sev | File:Line | Problem | Fix |
|---|---|---|---|
| P0 | src/api/auth/login.ts:42 | Password compared with `==` not constant-time | Use `crypto.timingSafeEqual` |
| P0 | src/api/auth/session.ts:118 | JWT signed with HS256 + secret in env, no rotation | Add `kid` header, rotate quarterly |
| P1 | src/api/auth/reset.ts:23 | Reset token TTL is 24h, far longer than a reset flow needs | Lower `TOKEN_TTL` to 3600 |
| P2 | src/api/auth/middleware.ts:67 | Rate limit per-IP, not per-account | Add `accountId` to the limit key |
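Because the output is a fixed four-column table, you can check a report against the constraints before reading it. A sketch under stated assumptions: rows are split on `|` (so a cell containing a literal pipe would break it), rows whose first cell is not P0 to P3 are ignored, and the function names are illustrative.

```python
import re

SEVERITIES = ("P0", "P1", "P2", "P3")
ANCHOR_RE = re.compile(r"^\S+:\d+(?:-\d+)?$")  # e.g. "src/api/auth/login.ts:42"


def parse_audit_table(markdown: str) -> list[dict]:
    """Turn table rows into dicts; header, separator, and malformed rows are skipped."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 4 or cells[0] not in SEVERITIES:
            continue
        rows.append(dict(zip(("severity", "anchor", "problem", "fix"), cells)))
    return rows


def validate(rows: list[dict], max_items: int = 10) -> list[str]:
    """Return a list of constraint violations; an empty list means the report passes."""
    problems = []
    if len(rows) > max_items:
        problems.append(f"{len(rows)} items, expected at most {max_items}")
    ranks = [SEVERITIES.index(row["severity"]) for row in rows]
    if ranks != sorted(ranks):
        problems.append("items are not sorted by severity")
    for row in rows:
        if not ANCHOR_RE.match(row["anchor"]):
            problems.append(f"missing file:line anchor: {row['anchor']!r}")
        if not row["problem"] or not row["fix"]:
            problems.append(f"missing problem or fix at {row['anchor']!r}")
    return problems


REPORT = """\
| Sev | File:Line | Problem | Fix |
|---|---|---|---|
| P0 | src/api/auth/login.ts:42 | Password compared with `==` not constant-time | Use `crypto.timingSafeEqual` |
| P2 | src/api/auth/middleware.ts:67 | Rate limit per-IP, not per-account | Add `accountId` to the limit key |
"""
rows = parse_audit_table(REPORT)
print(validate(rows))
```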
Step 4: Triage in the issue tracker, not the audit file
Open each P0 and P1 as a real issue with the file:line in the title. P2/P3 go into a single “audit backlog” ticket. The markdown audit file is read-once and deleted.
# Create one issue per P0/P1 finding from a Codex audit (gh CLI); repeat for each row
gh issue create -t "P0: timing attack in login.ts:42" -b "Codex audit 2026-05-22"
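For more than a couple of findings, you can generate the `gh issue create` commands instead of typing them. This sketch assumes the findings are already parsed into dicts with `severity`, `anchor`, and `problem` keys (a made-up shape); it only prints the commands so you can review them before running anything.

```python
import shlex


def issue_commands(findings: list[dict], audit_label: str) -> list[list[str]]:
    """One gh command per P0/P1 finding, plus a single backlog issue for P2/P3."""
    commands = []
    backlog = []
    for finding in findings:
        title = f"{finding['severity']}: {finding['problem']} ({finding['anchor']})"
        if finding["severity"] in ("P0", "P1"):
            commands.append(["gh", "issue", "create", "--title", title, "--body", audit_label])
        else:
            backlog.append(f"- {title}")
    if backlog:
        body = audit_label + "\n\n" + "\n".join(backlog)
        commands.append(["gh", "issue", "create", "--title", "Audit backlog (P2/P3)", "--body", body])
    return commands


findings = [
    {"severity": "P0", "anchor": "login.ts:42", "problem": "timing attack in password compare"},
    {"severity": "P2", "anchor": "middleware.ts:67", "problem": "rate limit is per-IP only"},
    {"severity": "P3", "anchor": "logout.ts:12", "problem": "session cookie not cleared"},
]
commands = issue_commands(findings, "Codex audit 2026-05-22")
for command in commands:
    # Review the output, then run each one, e.g. with subprocess.run(command, check=True).
    print(shlex.join(command))
```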
Step 5: For repeat audits, pass the previous list as “skip”
Round 2 prompt:
Audit src/api/auth/ for security issues.
SKIP items already fixed:
- timing-safe password compare (login.ts:42)
- JWT rotation (session.ts:118)
- reset token TTL (reset.ts:23)
Same constraints as round 1.
You’ll get a shorter, fresher list instead of re-reading the same 50 bullets.
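The round 2 prompt can be assembled from whatever you closed in the tracker. A small sketch, assuming fixed items are kept as (summary, anchor) pairs; `round_two_prompt` is a hypothetical helper, not a Codex feature.

```python
def round_two_prompt(scope: str, dimension: str,
                     fixed: list[tuple[str, str]], constraints: str) -> str:
    """Build a repeat-audit prompt that lists already-fixed items as a skip set."""
    lines = [f"Audit {scope} for {dimension} issues.", "SKIP items already fixed:"]
    if fixed:
        lines += [f"- {summary} ({anchor})" for summary, anchor in fixed]
    else:
        lines.append("- none")
    lines.append(constraints)
    return "\n".join(lines)


fixed = [
    ("timing-safe password compare", "login.ts:42"),
    ("JWT rotation", "session.ts:118"),
    ("reset token TTL", "reset.ts:23"),
]
prompt = round_two_prompt("src/api/auth/", "security", fixed, "Same constraints as round 1.")
print(prompt)
```

If round 2 runs in a new session that may not have the round 1 prompt in context, it is probably safer to pass the full constraint text as `constraints` instead of the one-line reference.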
Prevention
- Keep one prompt template per dimension in prompts/audit-security.md, prompts/audit-perf.md, etc.; never compose ad-hoc audit prompts
- Cap audit output at 10 items per run; if more issues exist, schedule a follow-up run for the next slice
- Always require severity, file:line, and one-line fix — reject reports that lack any of the three
- Track audits in the issue tracker, not in markdown files — markdown rots, tickets get closed
- Re-audit security per release, perf per regression, style per quarter — match cadence to dimension
- Pass the previous round’s findings as a “skip” set so round N stays short