“Why is CI so slow?” usually has an answer in the YAML — but no one wants to read 600 lines of it. A good pipeline-audit prompt names dimensions (cache, parallelism, secrets, gates), demands evidence, and produces ranked actions.
Who this is for
Platform engineers, tech leads tired of 25-minute CI, founders trying to ship faster, anyone who has paid a CI bill that grew faster than the team.
When not to use these prompts
Don’t use these on pipelines without any tests — the audit can’t fix a missing concept. Don’t apply pipeline tweaks without a perf baseline.
Prompt anatomy / structure formula
Every pipeline audit prompt should carry six elements:
- Role: who the AI plays (SRE / release captain / staff engineer / QA lead).
- Context: stack / branch / failing logs / diff / dashboard URL.
- Goal: one concrete deliverable — root cause, checklist, plan, ticket list, runbook.
- Constraints: what the AI MUST NOT do (don’t auto-fix, don’t hallucinate file paths).
- Output format: numbered findings, markdown table, JSON, unified diff, runnable code.
- Examples / signal: 1-2 “good” output examples, or counter-examples.
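One way to keep the six elements from being dropped is to assemble the prompt from named fields instead of free text. The sketch below is only an illustration: the `AuditPrompt` class, its field names, and the rendered wording are assumptions, not a fixed format.

```python
from dataclasses import dataclass, field


@dataclass
class AuditPrompt:
    """The six elements from the list above; class and field names are illustrative."""
    role: str
    context: str
    goal: str
    constraints: list[str]
    output_format: str
    examples: list[str] = field(default_factory=list)  # optional good / counter examples

    def render(self) -> str:
        # Fail early on an empty element instead of sending a vague prompt.
        for name in ("role", "context", "goal", "output_format"):
            if not getattr(self, name).strip():
                raise ValueError(f"missing prompt element: {name}")
        if not self.constraints:
            raise ValueError("missing prompt element: constraints")
        lines = [
            f"You are a {self.role}.",
            f"Context: {self.context}",
            f"Goal: {self.goal}",
            "Constraints:",
            *[f"- {c}" for c in self.constraints],
            f"Output format: {self.output_format}",
        ]
        if self.examples:
            lines.append("Examples of good output:")
            lines.extend(f"- {e}" for e in self.examples)
        return "\n".join(lines)


prompt = AuditPrompt(
    role="platform engineer",
    context="GitHub Actions workflow at .github/workflows/ci.yml",
    goal="ranked action list for cutting CI runtime",
    constraints=["do not auto-fix", "do not invent file paths"],
    output_format="numbered findings",
)
print(prompt.render())
```

Raising on an empty element is a deliberate choice here: a prompt with no constraints or no output format tends to produce the generic answers this structure is meant to prevent.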
Best for
- Cutting CI runtime in half
- Verifying gates actually fail when they should
- Secrets / OIDC migration
- Self-hosted vs managed runner decisions
- Bill-of-materials audit
12 copy-ready prompt templates
1. End-to-end pipeline audit
You are a platform engineer. Audit this `{filename}` for: (1) Total runtime + biggest single step, (2) Cache hit rate signals (missing keys, stale paths), (3) Parallelism opportunities, (4) Gates that warn but don't fail, (5) Secrets exposure. Output a ranked action list.
Variables to swap: filename — .github/workflows/ci.yml etc.
2. Cache audit
Audit caching for this pipeline. For each cache step: (1) Cache key — content-hash based or static? (2) Path — does it actually cover the heavy install? (3) Restore-keys correctly listed for partial hits? (4) TTL / invalidation strategy. Output a fix per cache.
3. Parallelism audit
Find parallelism opportunities: (1) Jobs that needlessly `needs:` another, (2) Tests that could shard, (3) Build + lint + typecheck running sequential when they could be parallel, (4) Matrix entries that don't need full coverage. Output a YAML diff.
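To sanity-check a proposed `needs:` change before applying it, you can compare the longest dependency chain before and after. The following is a minimal sketch, assuming you already have per-job durations; the job names and minutes are hypothetical, and real runs also include queue and setup time that this ignores.

```python
def critical_path(durations, needs):
    """Return (total_minutes, job_chain) for the longest dependency chain.

    durations: job name -> minutes; needs: job name -> list of prerequisite jobs.
    """
    best = {}  # job -> (finish_time, chain ending at job)

    def finish(job, visiting=()):
        if job in best:
            return best[job]
        if job in visiting:
            raise ValueError(f"dependency cycle at {job}")
        prereqs = [finish(p, visiting + (job,)) for p in needs.get(job, [])]
        # A job starts when its slowest prerequisite finishes.
        start, chain = max(prereqs, key=lambda r: r[0], default=(0, []))
        best[job] = (start + durations[job], chain + [job])
        return best[job]

    return max((finish(j) for j in durations), key=lambda r: r[0])


# Hypothetical numbers: build waits for lint and typecheck without using their output.
durations = {"lint": 2, "typecheck": 3, "build": 6, "test": 8}
serial = {"typecheck": ["lint"], "build": ["typecheck"], "test": ["build"]}
parallel = {"test": ["build"]}  # lint and typecheck run alongside build

print(critical_path(durations, serial))    # (19, ['lint', 'typecheck', 'build', 'test'])
print(critical_path(durations, parallel))  # (14, ['build', 'test'])
```

The difference between the two totals is the most a `needs:` cleanup can save; if the chain barely changes, the audit should look at sharding or caching instead.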
4. Gate honesty audit
Audit gates: which steps are `continue-on-error: true`, set `if: always()`, or report success while the underlying tool failed? Output a table: step | currently | should be | severity. Flag any gate that has masked a real failure.
5. Secrets / OIDC audit
Audit secret handling: (1) Long-lived secrets that could move to OIDC, (2) Secrets used in `echo` / step output, (3) PRs from forks with access to secrets, (4) Secret names that look like they're leaking purpose. Output remediation list.
6. Self-hosted vs managed decision
We currently use {provider}. Decide whether to add self-hosted runners for: (1) Heavy CPU steps (build, e2e), (2) Steps needing custom OS, (3) Steps that hit private network. For each: cost / maintenance estimate + recommendation.
Variables to swap: provider — GitHub Actions / GitLab / CircleCI
7. Build matrix audit
Audit the build matrix: (1) Are all combinations necessary, or do some only catch known issues? (2) Could we run only `node-lts` on PR and full matrix on main? (3) Are deprecated versions still tested? Output a trimmed matrix.
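The cost of a matrix is the product of its axes, which is easy to underestimate. As a rough sketch of the "reduced matrix on PR, full matrix on main" idea, the helper below expands both; the axis values are hypothetical labels, not real version numbers.

```python
from itertools import product


def expand_matrix(axes):
    """Expand axis name -> values into the list of job combinations."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


def matrix_for(event, full_axes, pr_axes):
    """Reduced matrix on pull requests; full matrix on every other event."""
    return expand_matrix(pr_axes if event == "pull_request" else full_axes)


# Hypothetical axes for illustration only.
full_axes = {"node": ["previous-lts", "lts", "current"], "os": ["ubuntu", "windows", "macos"]}
pr_axes = {"node": ["lts"], "os": ["ubuntu"]}

print(len(matrix_for("push", full_axes, pr_axes)))          # 9
print(matrix_for("pull_request", full_axes, pr_axes))       # [{'node': 'lts', 'os': 'ubuntu'}]
```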
8. Required vs optional checks
List which checks are currently REQUIRED by branch protection. Decide for each: keep / move to optional / remove. Criteria: false-positive rate, runtime, redundancy with another check. Output a table.
9. Bill audit
Our CI costs {monthlyCost} per month. Audit for spend: (1) Top 3 jobs by minutes, (2) % of runs cancelled mid-way, (3) PR-triggered runs that could be path-filtered, (4) Cron jobs running too often. Output top 3 savings.
Variables to swap: monthlyCost
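The prompt works better when you paste in numbers rather than ask the model to guess them. A minimal sketch for computing the first two figures from exported run records follows; the record shape (`job`, `minutes`, `status`) is an assumption, so adapt it to whatever your CI provider exports.

```python
from collections import defaultdict


def spend_summary(runs, top_n=3):
    """Return (top jobs by total minutes, percent of runs cancelled).

    runs: iterable of dicts with 'job', 'minutes', and 'status' keys (assumed shape).
    """
    minutes_by_job = defaultdict(float)
    cancelled = 0
    total = 0
    for run in runs:
        minutes_by_job[run["job"]] += run["minutes"]
        total += 1
        if run["status"] == "cancelled":
            cancelled += 1
    top = sorted(minutes_by_job.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    cancelled_pct = 100 * cancelled / total if total else 0.0
    return top, cancelled_pct


# Hypothetical records for illustration.
runs = [
    {"job": "e2e", "minutes": 18, "status": "success"},
    {"job": "e2e", "minutes": 7, "status": "cancelled"},
    {"job": "build", "minutes": 6, "status": "success"},
    {"job": "lint", "minutes": 2, "status": "success"},
    {"job": "unit", "minutes": 5, "status": "failure"},
]
top, cancelled_pct = spend_summary(runs)
print(top)            # [('e2e', 25.0), ('build', 6.0), ('unit', 5.0)]
print(cancelled_pct)  # 20.0
```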
10. Path-filter opportunities
Find path-filter opportunities: (1) Frontend-only PRs that don't need backend tests, (2) Docs-only PRs running full e2e, (3) Mobile changes triggering web pipeline. Output `paths:` blocks per workflow.
11. Reusable workflow extraction
Identify steps repeated across 3+ workflows that could be a reusable workflow / composite action: (1) Setup (node + pnpm + cache), (2) Lint, (3) Test reporters, (4) Deploy. Output the refactor plan + the reusable workflow stub.
12. PR-impact heatmap
For the last 50 PRs, count how often each pipeline job ran AND whether it actually exercised the changed files. Identify the top 3 jobs that ran on PRs they couldn't fail. Output a path-filter or conditional to skip them.
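If you can export changed files and executed jobs per PR, the count this prompt asks for can be computed directly and pasted in as evidence. The sketch below assumes a hypothetical record shape and a hand-written map from job to relevant path patterns. It uses Python's `fnmatchcase`, where `*` also matches `/`, which is looser than typical CI path-filter globbing, so treat the result as an estimate.

```python
from collections import Counter
from fnmatch import fnmatchcase


def wasted_runs(prs, job_paths, top_n=3):
    """Count, per job, the PRs where the job ran but no changed file matched its patterns.

    prs: list of dicts with 'changed' (file paths) and 'jobs_ran' (job names).
    job_paths: job name -> glob patterns the job can be affected by (supplied by you).
    """
    wasted = Counter()
    for pr in prs:
        for job in pr["jobs_ran"]:
            patterns = job_paths.get(job)
            if patterns is None:
                continue  # no known patterns: cannot judge this job
            relevant = any(fnmatchcase(f, p) for f in pr["changed"] for p in patterns)
            if not relevant:
                wasted[job] += 1
    return wasted.most_common(top_n)


# Hypothetical data for illustration.
prs = [
    {"changed": ["docs/intro.md"], "jobs_ran": ["backend-tests", "e2e", "docs-build"]},
    {"changed": ["web/src/app.ts"], "jobs_ran": ["backend-tests", "e2e"]},
    {"changed": ["api/server.py"], "jobs_ran": ["backend-tests", "e2e"]},
]
job_paths = {
    "backend-tests": ["api/*"],
    "e2e": ["api/*", "web/*"],
    "docs-build": ["docs/*"],
}
print(wasted_runs(prs, job_paths))  # [('backend-tests', 2), ('e2e', 1)]
```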
Common mistakes
- Caching `node_modules` directly: slow restore, breaks across OS.
- Running the full e2e suite on every PR.
- `continue-on-error: true` on tests “so they don’t block”.
- PRs from forks with secret access: a common credential leak path.
- No path filters: every PR runs every job.
- Self-hosted runners without a managed lifecycle: a security risk.
- Big matrix on every PR; full matrix should only run on main / release.
How to push results further
- Content-hash your cache keys (lockfile, package.json).
- Use OIDC instead of long-lived cloud secrets where possible.
- Split required vs optional checks deliberately — required = trust signal.
- Path filters cut spend more than runner upgrades.
- Reusable workflows remove duplication and centralise security patches.
- Track CI minutes per PR — it surfaces flake-fix candidates.
- Fail fast: order cheap-fast checks (lint, typecheck) before expensive ones (e2e).
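The first tip, content-hashed cache keys, is easiest to see in code. The sketch below illustrates the idea only and is not how any particular CI provider computes its keys; the function names and key layout are assumptions.

```python
import hashlib


def cache_key(os_name, lockfile_bytes, prefix="deps"):
    """Deterministic key: same lockfile content gives the same key, any change gives a new one."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return f"{os_name}-{prefix}-{digest}"


def restore_prefixes(key):
    """Progressively shorter prefixes, usable as fallbacks for partial cache hits."""
    parts = key.split("-")
    return ["-".join(parts[:i]) + "-" for i in range(len(parts) - 1, 0, -1)]


# Hypothetical lockfile contents for illustration.
before = cache_key("linux", b"lodash@4.17.20\n")
after = cache_key("linux", b"lodash@4.17.21\n")
print(before != after)            # True: a dependency bump invalidates the cache
print(restore_prefixes("linux-deps-abc123"))  # ['linux-deps-', 'linux-']
```

A date-based or static key fails in one of two ways: it either never invalidates when dependencies change, or it invalidates on a schedule unrelated to them.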
Practical depth notes
Use these prompts as starting points, not final answers. For CI/CD Pipeline Audit Prompts for Fast, Trustworthy Builds, the useful extra work is to replace every generic placeholder with a real constraint: audience, channel, length, brand voice, examples to imitate, and examples to avoid. Run at least two versions with different constraints, then compare the outputs side by side instead of accepting the first polished response.
A good result should pass three checks: it is specific enough that another person could reuse it, it avoids vague praise or filler, and it gives you an editable artifact rather than a broad suggestion. If the output feels generic, add one concrete reference, one forbidden pattern, and one measurable success criterion before rerunning the prompt.
FAQ
- When should I move to self-hosted runners?: When managed-runner costs exceed the cost of maintaining self-hosted runners and you need custom hardware or private-network access.
- How fast should CI be?: < 10 min on a typical PR. Beyond that, developer behaviour degrades.
- Is GitHub Actions cache reliable?: Yes, but cache keys must be deterministic. Avoid date-based keys.
- Can AI write the pipeline?: Draft yes, ship-as-is no. Always review security boundaries and secrets.
- What about Docker layer caching?: Useful for image builds. Make sure to push cache to a registry, not just local Docker.
- How do I detect false-green CI?: Audit gate honesty (template 4) and grep for `continue-on-error` / `|| true`.
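The grep for false-green patterns can be scripted so it runs against every workflow file. This is a heuristic sketch using plain regular expressions rather than a YAML parser; the pattern set and labels are assumptions, and a match is a reason to look closer, not proof of a masked failure (an `if: always()` on a report-upload step is often legitimate).

```python
import re

# Heuristic patterns only; extend or trim them for your own pipelines.
SUSPECT_PATTERNS = {
    "continue-on-error": re.compile(r"\bcontinue-on-error:\s*true\b"),
    "always-run": re.compile(r"\bif:\s*always\(\)"),
    "swallowed-exit": re.compile(r"\|\|\s*(true|:)\s*$"),
}


def find_suspect_gates(workflow_text):
    """Return (line_number, label, stripped_line) for each suspicious line."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings


# Hypothetical workflow excerpt for illustration.
sample = """\
jobs:
  test:
    steps:
      - run: npm test || true
      - run: npm run lint
        continue-on-error: true
      - run: ./upload-report.sh
        if: always()
"""
for finding in find_suspect_gates(sample):
    print(finding)
```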