“Review my architecture” yields textbook patterns that don’t map to your code. A good architecture review prompt names the dimension (layering, deps, boundaries, data flow), demands file:line evidence, and forbids rewrite suggestions in the same pass. These 15 templates each interrogate a different structural angle.
Who this is for
Tech leads doing architecture audits before a refactor, staff engineers reviewing a junior team’s design, founders preparing diligence, anyone inheriting an unfamiliar codebase.
When not to use these prompts
Skip these for greenfield design — use design-doc prompts instead. Also skip for sub-200-LOC scripts where “architecture” is just one file.
Prompt anatomy / structure formula
Every architecture review prompt should carry six elements:
- Role: who the AI plays (architect / SRE / QA lead / release captain).
- Context: repo / framework / runtime versions / files or diff in scope.
- Goal: one concrete deliverable — review notes, plan, checklist, test file, handoff doc.
- Constraints: what the AI MUST NOT do (don’t rewrite, don’t auto-format, don’t guess versions).
- Output format: numbered findings, markdown table, JSON, unified diff, or runnable code.
- Examples / signal: 1-2 examples of “good” output, or what bad output looks like.
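One way to keep all six elements present is to assemble prompts from a small structure instead of free-typing them. The sketch below is illustrative only: `ReviewPrompt`, its field names, and the sample values (Python 3.12, `src/`) are hypothetical and not part of any tool.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewPrompt:
    """Hypothetical container for the six elements listed above."""
    role: str
    context: str
    goal: str
    constraints: list[str]
    output_format: str
    examples: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the elements into one prompt, skipping empty sections."""
        sections = [
            ("Role", self.role),
            ("Context", self.context),
            ("Goal", self.goal),
            ("Constraints", "\n".join(f"- {c}" for c in self.constraints)),
            ("Output format", self.output_format),
            ("Examples", "\n".join(f"- {e}" for e in self.examples)),
        ]
        return "\n\n".join(f"{name}:\n{body}" for name, body in sections if body)


# Sample values are made up for illustration.
prompt = ReviewPrompt(
    role="Staff engineer reviewing for layering violations only",
    context="Python 3.12 service; files under src/ are in scope",
    goal="Review notes listing each violating import",
    constraints=["Do not propose rewrites", "Do not guess versions"],
    output_format="Markdown table with file:line and severity",
)
print(prompt.render())
```

Leaving `examples` empty drops that section from the rendered prompt, which makes a missing element visible at a glance.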
Best for
- Pre-refactor structural assessment
- Inheritance audit for unfamiliar codebases
- Architecture decision record (ADR) backfill
- Quarterly tech-debt review
- Diligence and acquisition reviews
15 copy-ready prompt templates
1. Layering violation hunt
Run first — most architecture rot starts here.
You are a staff engineer reviewing this codebase for LAYERING violations only. Identify imports that cross layers in the wrong direction (UI importing infrastructure, domain importing framework, etc.). For each: file:line, the violating import, the layer rule it breaks, and a one-line refactor sketch. Do not propose rewrites. Return a markdown table with severity (Critical / High / Med).
Variables to swap: repo files (Claude Code reads them automatically); for chat tools, paste the package/module tree.
Optimization: Append: “If layers are not explicit, first infer the intended layer model from folder names, then judge violations against that inferred model.”
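To cross-check the violations the model reports, the layer rule can be written down and applied mechanically to a list of import edges. This is a minimal sketch, assuming the top-level folder names the layer and assuming a four-layer model; `ALLOWED`, `ImportEdge`, and the sample paths are all hypothetical.

```python
from typing import NamedTuple

# Assumed layer model: each layer lists the layers it may import from.
ALLOWED = {
    "domain": set(),
    "application": {"domain"},
    "infrastructure": {"application", "domain"},
    "ui": {"application", "domain"},
}


class ImportEdge(NamedTuple):
    file: str    # importing file, e.g. "ui/cart_view.py"
    line: int    # line of the import statement
    target: str  # imported module path, e.g. "infrastructure/db.py"


def layer_of(path: str) -> str:
    """Assumption: the first path segment names the layer."""
    return path.split("/", 1)[0]


def find_violations(edges):
    """Return 'file:line src -> dst' for each import the model forbids."""
    violations = []
    for edge in edges:
        src, dst = layer_of(edge.file), layer_of(edge.target)
        if src == dst:
            continue  # imports within one layer are always allowed
        # Layers missing from ALLOWED may not import across layers at all.
        if dst not in ALLOWED.get(src, set()):
            violations.append(f"{edge.file}:{edge.line} {src} -> {dst}")
    return violations


edges = [
    ImportEdge("ui/cart_view.py", 3, "infrastructure/db.py"),
    ImportEdge("application/checkout.py", 1, "domain/order.py"),
    ImportEdge("domain/order.py", 7, "infrastructure/orm.py"),
]
for violation in find_violations(edges):
    print(violation)
```

A finding from the model that this check cannot reproduce from the real import list deserves a second look before it enters the report.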
2. Dependency cycle detection
Scan this repo for circular dependencies between modules / packages. Output: (1) cycle as `A -> B -> C -> A`, (2) the import that creates each edge (file:line), (3) which edge is the cheapest to break, (4) one-line break strategy (interface, event, move type). Ignore intra-file cycles.
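Reported cycles are cheap to verify once the import edges are in a dictionary. The sketch below is a plain depth-first search, assuming a small module graph (it enumerates paths exhaustively, so it is not meant for very large graphs); `find_cycles` and the module names are hypothetical.

```python
def find_cycles(graph):
    """Return each distinct cycle once, as [A, B, C, A].

    `graph` maps a module name to the list of modules it imports.
    """
    cycles, seen = [], set()

    def visit(node, path):
        if node in path:
            cycle = path[path.index(node):]
            # Rotate to start at the smallest name so that
            # A -> B -> A and B -> A -> B count as the same cycle.
            start = cycle.index(min(cycle))
            canonical = tuple(cycle[start:] + cycle[:start])
            if canonical not in seen:
                seen.add(canonical)
                cycles.append(list(canonical) + [canonical[0]])
            return
        for nxt in graph.get(node, []):
            visit(nxt, path + [node])

    for node in graph:
        visit(node, [])
    return cycles


graph = {
    "orders": ["billing"],
    "billing": ["customers"],
    "customers": ["orders"],
    "reports": ["orders"],
}
for cycle in find_cycles(graph):
    print(" -> ".join(cycle))  # prints: billing -> customers -> orders -> billing
```

In practice each edge would also carry the file:line of the import, so that a cycle the model reports can be matched edge by edge.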
3. Module boundary leak audit
Audit MODULE BOUNDARIES. For each top-level module, list: (1) what types it exports (public API), (2) types that leak through but are internal (e.g., DB rows surfacing in HTTP handlers), (3) internal types reached via deep imports `module/internal/...`. Flag every boundary leak with file:line.
4. God-object / hub-module detection
Find HUB modules: any module with > 5 incoming imports OR > 10 outgoing imports. For each: list the fan-in/out count, the responsibilities tangled inside, and propose a split into 2-3 cohesive submodules. Do not write the refactor — only the split plan.
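The fan-in and fan-out thresholds in this template can be checked directly against an import edge list, which helps confirm the counts the model reports. A minimal sketch, assuming edges are `(importer, imported)` pairs at module level; `find_hubs` and the sample module names are hypothetical.

```python
from collections import Counter


def find_hubs(edges, max_fan_in=5, max_fan_out=10):
    """Return {module: (fan_in, fan_out)} for modules over either threshold."""
    unique = set(edges)  # count each importer/imported pair once
    fan_out = Counter(src for src, _ in unique)
    fan_in = Counter(dst for _, dst in unique)
    hubs = {}
    for module in set(fan_in) | set(fan_out):
        # Counter returns 0 for modules it has not seen.
        if fan_in[module] > max_fan_in or fan_out[module] > max_fan_out:
            hubs[module] = (fan_in[module], fan_out[module])
    return hubs


# Six feature modules import "utils", which itself imports "logging".
edges = [(f"feature_{i}", "utils") for i in range(6)] + [("utils", "logging")]
print(find_hubs(edges))  # prints: {'utils': (6, 1)}
```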
5. Data flow surprise audit
Trace DATA FLOW for the 3 most important entities (infer from naming if not told). For each: (1) where it is created, (2) where it is mutated, (3) where it is read across module boundaries. Flag any "surprise mutation" — state changing in a module that the entity doesn't belong to.
Variables to swap: entity names (optional — model can infer)
6. Hexagonal / ports-and-adapters check
Evaluate this codebase against ports-and-adapters: (1) Is domain logic isolated from frameworks? Cite evidence, (2) Are external systems (DB, queue, HTTP) reached through interfaces or directly? List each direct call (file:line), (3) Where would mocking be hard right now? Rate adherence 1-5 with rationale.
7. Bounded-context drift audit
Identify implicit bounded contexts in this codebase (group modules by entities they share). Then: (1) Which contexts share the same entity but disagree on its shape? (file:line for each), (2) Which contexts secretly depend on each other through shared mutable state? (3) Suggest one context boundary to make explicit first.
8. Cross-cutting concern leak
Audit cross-cutting concerns: logging, auth, telemetry, feature flags, error handling. For each: (1) Is it implemented centrally or sprinkled? (2) List 5 sites where the concern is reimplemented inline, (3) Suggest one extraction strategy (decorator, middleware, hook). Do not perform the extraction.
9. Shared kernel risk audit
Identify the "shared kernel" — code imported by > 3 modules. For each shared item: (1) Why does it need to be shared? (2) Is the shape stable, or does it change every sprint? (3) Score coupling risk (Low / Med / High). Flag shared kernel items that are actually leaky abstractions.
10. Async / sync boundary audit
Map ASYNC vs SYNC boundaries. Find: (1) sync code that blocks on async (sync over async) — file:line, (2) async code that swallows promise rejection, (3) "fire and forget" calls without retry / dead-letter, (4) mixed paradigms in the same call chain. Output: severity-ranked table.
11. Configuration architecture audit
Audit how CONFIG flows: (1) Where are env vars read? Centralized or scattered? List read sites, (2) Are defaults / fallbacks documented? (3) Is there a single typed config object, or is `process.env.X` reached directly? (4) Suggest a config-loading pattern for this stack.
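For reference, a "single typed config object" might look like the sketch below. The prompt mentions `process.env.X` (Node); this example uses Python's `os.environ` as the analogous read. The variable names (`DATABASE_URL`, `REQUEST_TIMEOUT_S`, `DEBUG`) and defaults are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical typed config; the only place env vars are read."""
    database_url: str
    request_timeout_s: float
    debug: bool


def load_config(env=None) -> AppConfig:
    """Build the config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [key for key in ("DATABASE_URL",) if key not in env]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return AppConfig(
        database_url=env["DATABASE_URL"],
        # Defaults live here, in one documented place.
        request_timeout_s=float(env.get("REQUEST_TIMEOUT_S", "5")),
        debug=env.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    )


# Passing a dict instead of the real environment keeps the example testable.
config = load_config({"DATABASE_URL": "postgres://localhost/app", "DEBUG": "true"})
print(config)
```

With this shape, the audit question "where are env vars read?" has one answer, and every other module takes an `AppConfig` instead of touching the environment.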
12. Plugin / extension surface audit
If this codebase exposes a plugin or extension surface, audit: (1) What contract do extensions implement? (2) What internals are accidentally reachable? (3) How is versioning handled? If no extension surface exists, say so — do not invent one.
13. Read / write asymmetry audit (CQRS lite)
For the 3 most-used entities, separate READ paths from WRITE paths. Find: (1) Reads that pull through write models unnecessarily, (2) Writes that bypass invariant enforcement, (3) Queries that join 4+ tables (candidates for read models). Suggest one read/write split worth doing first.
14. Multi-service boundary audit
Use when the repo is a monorepo with multiple deployables.
This monorepo contains {services}. Audit the SERVICE boundaries: (1) Which packages are imported across service lines? (2) Where do contract types diverge between services? (3) Is there shared DB access (anti-pattern)? (4) Suggest one boundary to harden first.
Variables to swap: services, e.g., “web (Next.js), api (FastAPI), worker (Go)”
15. Architecture findings → ADR backfill
Run last — converts findings into decision records.
Take the architecture findings from previous prompts. For each significant decision implied (or contradicted) by the code, draft a short ADR: Title, Status (Accepted / Proposed / Deprecated), Context (2 sentences), Decision (1 sentence), Consequences (3 bullets). Output 5 ADRs maximum, ranked by impact.
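If the ADRs are requested as structured output, a small renderer keeps them uniform when they are saved. This is a sketch under assumptions: the `Adr` class and the markdown layout are hypothetical, and only the field names and status values come from the prompt above.

```python
from dataclasses import dataclass

VALID_STATUSES = {"Accepted", "Proposed", "Deprecated"}


@dataclass
class Adr:
    title: str
    status: str
    context: str
    decision: str
    consequences: list[str]

    def to_markdown(self) -> str:
        """Render the ADR; reject statuses the prompt does not allow."""
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        bullets = "\n".join(f"- {item}" for item in self.consequences)
        return (
            f"# {self.title}\n\n"
            f"Status: {self.status}\n\n"
            f"Context: {self.context}\n\n"
            f"Decision: {self.decision}\n\n"
            f"Consequences:\n{bullets}\n"
        )


# Sample content is invented for illustration.
adr = Adr(
    title="Domain layer must not import the ORM",
    status="Proposed",
    context="The layering audit found domain modules importing ORM types. "
            "This blocks testing the domain without a database.",
    decision="Domain code depends on repository interfaces only.",
    consequences=["ORM rows are mapped at the boundary", "Domain tests run without a DB", "One extra mapping layer to maintain"],
)
print(adr.to_markdown())
```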
Common mistakes
- Asking “review my architecture” without naming a dimension — you get textbook patterns, not your bugs.
- Letting AI propose a rewrite in the same pass as the review — diagnostic clarity collapses.
- No file:line evidence required — every finding is unverifiable.
- Inferring layers from folder names alone instead of imports: confirm the inferred model against the actual dependency graph.
- Reviewing the whole repo in one prompt — context dilutes and findings blur together.
- Treating AI output as final — pair with one human walk-through of the top 3 findings.
- Skipping ADR backfill — the findings become read-only documents nobody acts on.
How to push results further
- Run each architecture dimension in a separate thread — layering, cycles, boundaries, data flow.
- Demand file:line evidence on every finding. Hallucinations are far easier to catch when evidence is required.
- Add a severity enum (Critical / High / Med / Low) in your prompt to force ranking.
- Ask AI to first infer the intended layer model from folders, then judge violations against it.
- For monorepos, run boundary audits per service-pair, not globally.
- Save architecture audits as dated markdown in /docs/architecture/audits/ so you can diff over time.
- Pair every audit with template 15 (ADR backfill); without it, findings rot.
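The dated-file tip above can be sketched with the standard library alone. The file-naming scheme (`YYYY-MM-DD-dimension.md`) and the helper names are assumptions; only the audits folder comes from the text, written here as a repo-relative path.

```python
import difflib
from datetime import date
from pathlib import Path


def audit_path(dimension: str, day: date, root: str = "docs/architecture/audits") -> Path:
    """Build a dated file name such as 2025-01-15-layering.md (assumed scheme)."""
    return Path(root) / f"{day.isoformat()}-{dimension}.md"


def diff_audits(old: str, new: str) -> list[str]:
    """Unified diff between two audit texts, one list entry per diff line."""
    return list(
        difflib.unified_diff(
            old.splitlines(), new.splitlines(),
            fromfile="previous", tofile="current", lineterm="",
        )
    )


# Sample audit contents are invented for illustration.
previous = "- ui/cart_view.py:3 High"
current = "- ui/cart_view.py:3 High\n- domain/order.py:7 Critical"

print(audit_path("layering", date(2025, 1, 15)).as_posix())
for line in diff_audits(previous, current):
    print(line)
```

Lines prefixed with `+` in the diff are findings that appeared since the previous audit; lines prefixed with `-` are findings that were resolved or dropped.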
FAQ
- Should I run architecture review on every PR? No, it is too noisy. Run before refactors, on inheritance, and quarterly. Use PR review prompts for diffs.
- What if my codebase has no explicit layers? Have AI infer the intended layer model from folder names first, then judge violations. Document the inferred model as an ADR.
- Will AI hallucinate cycles? Sometimes. Always require the exact import (file:line) for each edge; if the model can’t produce it, the cycle is fake.
- How is this different from a full repo audit? Full repo audits scan many dimensions shallowly (security, deps, tests). Architecture review goes deep on structural questions only.
- Can I run this on microservices spread across repos? Yes: paste service interface definitions and message contracts. The boundary audit (template 14) works best when you can supply both sides.
- How long should this take? For a 30K-LOC repo: 20-40 minutes per dimension, then 1-2 hours to triage and backfill ADRs.