Unit Test Generation Prompts: 14 Templates That Actually Catch Bugs

Stop asking AI to "write tests for this." 14 unit-test prompt templates for boundary cases, error paths, mocks, fakes, parameterized suites, and regression locks.

“Write unit tests for this file” is the laziest test-gen prompt, and it shows: you get a handful of happy-path assertions, no error branches, and mocks for things that should never be mocked. A good unit-test prompt names the boundaries (null / empty / max / negative), forbids over-mocking, and demands at least one regression test that locks behaviour, not just shape.

Who this is for

Developers shipping under deadline who need real coverage on new modules, tech leads enforcing test discipline, devs adding tests retroactively to legacy code, indie devs running Claude Code in agentic mode.

When not to use these prompts

Don’t use these for code you don’t plan to maintain (throwaway scripts) — the test burden outweighs the value. Also don’t use AI to write tests for code you don’t yet understand; you’ll cement bugs as “intended behaviour”.

Prompt anatomy / structure formula

Every unit-test generation prompt should carry six elements:

  • Role: who the AI plays (release captain / QA lead / SRE / staff engineer).
  • Context: repo / framework / runtime / branch / diff / failing logs.
  • Goal: one concrete deliverable — checklist, plan, test file, review notes, root cause, ticket list.
  • Constraints: what the AI MUST NOT do (don’t auto-fix, don’t silently rewrite, don’t guess versions).
  • Output format: numbered findings, markdown table, JSON schema, unified diff, or runnable code.
  • Examples / signal: 1-2 examples of “good” output, or what bad output looks like.
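
One way to keep all six elements honest is to assemble the prompt from named fields and refuse to build it when any is blank. The sketch below is only an illustration: `PromptParts`, `buildPrompt`, and the sample values are hypothetical names, not part of any tool.

```typescript
// Hypothetical helper: the six fields mirror the six elements listed above.
interface PromptParts {
  role: string;
  context: string;
  goal: string;
  constraints: string[];
  outputFormat: string;
  examples: string;
}

function buildPrompt(parts: PromptParts): string {
  const sections: Array<[string, string]> = [
    ["Role", parts.role],
    ["Context", parts.context],
    ["Goal", parts.goal],
    ["Constraints", parts.constraints.map((c) => `- ${c}`).join("\n")],
    ["Output format", parts.outputFormat],
    ["Examples", parts.examples],
  ];
  // Refuse to build a prompt that silently drops one of the six elements.
  for (const [label, body] of sections) {
    if (body.trim() === "") {
      throw new Error(`Prompt element "${label}" is empty`);
    }
  }
  return sections.map(([label, body]) => `${label}:\n${body}`).join("\n\n");
}

// Sample values are made up for illustration.
const examplePrompt = buildPrompt({
  role: "You are a senior test engineer.",
  context: "TypeScript repo, Jest, function parseQuantity in src/cart/quantity.ts.",
  goal: "One Jest test file covering boundary and error inputs only.",
  constraints: [
    "Do not mock same-package pure functions.",
    "Do not write happy-path tests in this pass.",
  ],
  outputFormat: "A single runnable test file, no prose.",
  examples: 'Good test name: it("throws RangeError when quantity is 0").',
});
```

An empty constraints list is rejected too, which is deliberate: a prompt with no "must not" section is where over-mocking usually creeps in.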

Best for

  • New module written by AI that needs tests before merge
  • Adding coverage to a legacy function before refactor
  • Locking a bug-fix with a regression test
  • Parameterized table tests across many inputs
  • Replacing brittle expect(fn).toHaveBeenCalled() assertions with behaviour tests

14 copy-ready prompt templates

1. Boundary-first unit tests

Run this BEFORE asking for any happy-path tests.

You are a senior test engineer. For the function `{functionName}` in `{filePath}`, write Jest tests that ONLY cover boundary and error inputs: null, undefined, empty string, empty array, max int, negative, NaN, very long string, malformed input. Do not write any happy-path test in this pass. Each test must have a clear "given / when / then" comment. Use `describe.each` for tables when inputs share a shape.

Variables to swap: functionName, filePath

Optimization: Add: “If a boundary cannot be reached because the function signature forbids it, list it as // unreachable: <reason> instead of inventing a test.”
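
For a feel of what a good answer to this template looks like, here is a framework-free sketch. Everything in it is an assumption made for illustration: `parseQuantity`, its 1 to 10000 range, and the tiny `check` / `expect…` helpers that stand in for Jest's `it` and `expect` so the block runs on its own.

```typescript
// Hypothetical function under test: parses an order quantity (integer, 1..MAX_QUANTITY).
const MAX_QUANTITY = 10000;

function parseQuantity(input: string): number {
  const trimmed = input.trim();
  if (!/^-?\d+$/.test(trimmed)) {
    throw new RangeError("quantity is not an integer");
  }
  const value = Number(trimmed);
  if (value < 1 || value > MAX_QUANTITY) {
    throw new RangeError("quantity is out of range");
  }
  return value;
}

// Minimal stand-ins for a test runner: record the name of every failing check.
const boundaryFailures: string[] = [];
function check(name: string, body: () => void): void {
  try { body(); } catch (err) { boundaryFailures.push(name); }
}
function expectThrowsRangeError(fn: () => unknown): void {
  try { fn(); } catch (err) { if (err instanceof RangeError) return; throw err; }
  throw new Error("expected RangeError, but nothing was thrown");
}
function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
}

check("throws on empty string", () => {
  // given an empty input / when parsed / then RangeError
  expectThrowsRangeError(() => parseQuantity(""));
});
check("throws on negative quantity", () => {
  // given "-1" / when parsed / then RangeError (below the minimum of 1)
  expectThrowsRangeError(() => parseQuantity("-1"));
});
check("throws on malformed decimal", () => {
  // given "1.5" / when parsed / then RangeError (not an integer)
  expectThrowsRangeError(() => parseQuantity("1.5"));
});
check("throws on a very long digit string", () => {
  // given 500 nines / when parsed / then RangeError (overflows past the maximum)
  expectThrowsRangeError(() => parseQuantity(new Array(501).join("9")));
});
check("accepts the maximum and rejects maximum + 1", () => {
  // given the upper edge / when parsed / then 10000 passes and 10001 throws
  expectEqual(parseQuantity("10000"), 10000);
  expectThrowsRangeError(() => parseQuantity("10001"));
});
// unreachable: null / undefined (the signature is `input: string`; assumes strictNullChecks is on)
```

Note the last line: the unreachable boundary is listed with a reason instead of being forced into a test with a cast.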

2. Happy-path unit tests (second pass)

Now write happy-path Jest tests for `{functionName}`. For each public behaviour described in its docstring or surrounding comments, write one test. Use real values, not generic strings like "test". Use `it("returns X when Y")` naming. Do not duplicate any boundary case already covered.

Variables to swap: functionName

3. Mock discipline rules

Write tests for `{functionName}`. Mock ONLY: (a) network / fs / time / random, (b) external services. Do NOT mock: (a) pure functions in the same package, (b) data classes, (c) anything you could pass as a real value. If you reach for `jest.mock` on a same-package import, stop and use the real implementation instead. Note in a comment which dependencies you chose to mock and why.

Variables to swap: functionName

Optimization: For Python, replace “Jest” with “pytest”, and jest.mock with the monkeypatch fixture or pytest-mock.
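
The rule is easier to see in code. In this sketch the only replaced dependency is time, passed in as a parameter, while the same-package pure helper runs for real. `Clock`, `sumCents`, `createInvoice`, and the 30-day due date are hypothetical, and the sketch uses plain dependency injection rather than `jest.mock`.

```typescript
// Boundary dependency: time. This is the only thing the test replaces.
interface Clock {
  now(): Date;
}

interface LineItem {
  priceCents: number;
  qty: number;
}

interface Invoice {
  totalCents: number;
  issuedAt: string;
  dueAt: string;
}

// Pure same-package helper: cheap and deterministic, so the test uses the real one.
function sumCents(items: LineItem[]): number {
  return items.reduce((total, item) => total + item.priceCents * item.qty, 0);
}

// Hypothetical code under test: invoices are due 30 days after issue.
function createInvoice(items: LineItem[], clock: Clock): Invoice {
  const issued = clock.now();
  const due = new Date(issued.getTime() + 30 * 24 * 60 * 60 * 1000);
  return {
    totalCents: sumCents(items),
    issuedAt: issued.toISOString(),
    dueAt: due.toISOString(),
  };
}

// Replaced: Clock (time is non-deterministic). Not replaced: sumCents (pure).
const fixedClock: Clock = { now: () => new Date("2024-01-15T00:00:00.000Z") };

const invoice = createInvoice(
  [
    { priceCents: 1999, qty: 2 },
    { priceCents: 500, qty: 1 },
  ],
  fixedClock
);
// invoice.totalCents is 4498 and invoice.dueAt is "2024-02-14T00:00:00.000Z"
```

Because `sumCents` is real, a bug in the total breaks this test. If it were mocked, the test would only prove the mock returns what it was told to return.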

4. Regression test from a bug

I just fixed this bug: {bugDescription}. The failing input was: {failingInput}. Write a single regression test named `regression: <bug-title>` that fails on the old code (before fix) and passes on the new code. Add a comment with the bug ticket link and the exact behaviour the test locks.

Variables to swap: bugDescription, failingInput
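
A sketch of what “fails on the old code, passes on the new code” means in practice. The bug, both implementations, and the function names are invented for illustration; in a real repo the pre-fix version lives in git history, not next to the fix.

```typescript
// Pre-fix implementation, kept here only to show the regression test failing on it.
function formatCentsBeforeFix(cents: number): string {
  return `${Math.floor(cents / 100)}.${cents % 100}`;
}

// Fixed implementation: single-digit cents keep their leading zero.
function formatCents(cents: number): string {
  const remainder = cents % 100;
  const padded = remainder < 10 ? `0${remainder}` : String(remainder);
  return `${Math.floor(cents / 100)}.${padded}`;
}

// regression: single-digit cents lose their leading zero
// Ticket: <bug ticket link goes here>
// Locks: 1005 cents formats as "10.05" (the pre-fix code returned "10.5").
function regressionSingleDigitCents(format: (cents: number) => string): boolean {
  // The failing input is pasted verbatim from the bug report, not invented.
  return format(1005) === "10.05";
}

const passesOnFix = regressionSingleDigitCents(formatCents);
const passesOnOldCode = regressionSingleDigitCents(formatCentsBeforeFix);
// passesOnFix is true and passesOnOldCode is false: the test reproduces the bug.
```

If `passesOnOldCode` came back true, the test would be asserting something the bug never touched.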

5. Parameterized table tests

Convert these example-based {language} tests into a table-driven suite: `describe.each` (Jest), `pytest.mark.parametrize` (pytest), or a slice of cases run through `t.Run` subtests (Go). Group by behaviour. Keep one row per distinct equivalence class; don't list 10 rows that all hit the same branch. Name the rows so a failure tells you which case broke.

Variables to swap: language (TS / Python / Go)
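
Here is a small framework-free sketch of the table shape this template asks for. `shippingTier`, its thresholds, and `runTable` are hypothetical; in Jest the same rows could feed `it.each`, with the row name used as the test title.

```typescript
type Tier = "letter" | "parcel" | "freight";

// Hypothetical function under test.
function shippingTier(weightKg: number): Tier {
  if (!(weightKg > 0)) throw new RangeError("weight must be positive");
  if (weightKg <= 0.5) return "letter";
  if (weightKg <= 30) return "parcel";
  return "freight";
}

interface Row {
  name: string; // appears in the failure message, so it must identify the case
  weightKg: number;
  expected: Tier;
}

// One row per equivalence class, plus one per threshold edge.
// No filler rows that hit a branch already covered.
const rows: Row[] = [
  { name: "letter: well below 0.5 kg", weightKg: 0.1, expected: "letter" },
  { name: "letter: exactly 0.5 kg (upper edge)", weightKg: 0.5, expected: "letter" },
  { name: "parcel: just above 0.5 kg", weightKg: 0.51, expected: "parcel" },
  { name: "parcel: exactly 30 kg (upper edge)", weightKg: 30, expected: "parcel" },
  { name: "freight: above 30 kg", weightKg: 30.01, expected: "freight" },
];

// Runs every row and returns one message per failing row.
function runTable(fn: (weightKg: number) => Tier, table: Row[]): string[] {
  const failures: string[] = [];
  for (const row of table) {
    const actual = fn(row.weightKg);
    if (actual !== row.expected) {
      failures.push(`${row.name}: expected ${row.expected}, got ${actual}`);
    }
  }
  return failures;
}

const tableFailures = runTable(shippingTier, rows);
```

With named rows, an off-by-one at the 0.5 kg threshold shows up as a failure on the “exactly 0.5 kg” row rather than as “case 2 failed”.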

6. Behaviour, not implementation

Best after a refactor where internal call sites changed.

Refactor these tests so they assert OUTPUTS / SIDE EFFECTS, not internal calls. Remove any `expect(internalHelper).toHaveBeenCalledWith(…)` assertions. Replace them with assertions on the function's return value or the externally observable side effect (DB row, emitted event, file written).

7. Async + timer testing

For the async function `{functionName}` that uses `setTimeout` / `setInterval` / `Promise.race`, write Jest tests using `jest.useFakeTimers()`. Cover: (1) resolves before timeout, (2) rejects on timeout, (3) cleanup on abort, (4) no dangling timers after `await` (check that `jest.getTimerCount()` is 0). Do not wait on real time with `await new Promise(r => setTimeout(r, …))`; real waits make tests slow and flaky.

Variables to swap: functionName

8. Error-shape contract tests

Write tests that pin the ERROR SHAPE of `{functionName}`. For each documented error path, assert: (a) the thrown class / error code, (b) the message contains a stable phrase callers can match, (c) `cause` is preserved if wrapping. Errors are part of the API — these tests prevent silent breaking changes.

Variables to swap: functionName
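
A sketch of what “pinning the error shape” can look like, under assumptions: `ConfigError`, its error codes, and `loadConfig` are hypothetical, and `captureError` stands in for Jest's `expect(...).toThrow` so the block runs standalone.

```typescript
// Hypothetical error type: `code` is the stable contract callers match on.
class ConfigError extends Error {
  code: string;
  // `declare` so this type-checks whether or not the configured lib already defines Error.cause.
  declare cause?: Error;

  constructor(code: string, message: string, cause?: Error) {
    super(message);
    this.name = "ConfigError";
    this.code = code;
    this.cause = cause;
  }
}

// Hypothetical function under test.
function loadConfig(json: string): { port: number } {
  let data: { port?: unknown } | null;
  try {
    data = JSON.parse(json);
  } catch (err) {
    // Wrap the parser error but keep it reachable through `cause`.
    throw new ConfigError(
      "CONFIG_INVALID_JSON",
      "config is not valid JSON",
      err instanceof Error ? err : undefined
    );
  }
  const port = data === null ? undefined : data.port;
  if (typeof port !== "number") {
    throw new ConfigError("CONFIG_MISSING_PORT", "config is missing a numeric port");
  }
  return { port };
}

// Runs fn and returns whatever it threw (assumed to be a ConfigError), or undefined.
function captureError(fn: () => unknown): ConfigError | undefined {
  try { fn(); } catch (err) { return err as ConfigError; }
  return undefined;
}

const invalidJsonError = captureError(() => loadConfig("{not json"));
const missingPortError = captureError(() => loadConfig("{}"));
// (a) code:    invalidJsonError.code is "CONFIG_INVALID_JSON"
// (b) message: contains the stable phrase "not valid JSON"
// (c) cause:   the original SyntaxError from JSON.parse is preserved
```

The three comments at the bottom map one-to-one onto assertions (a), (b), and (c) in the template. Renaming a code or dropping the cause then breaks a test instead of breaking a caller.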

9. Coverage-gap identifier

Pair with template 1 or 2 once gaps are agreed.

Read this test file and the function under test. Without running coverage, identify uncovered branches by inspection. Return a markdown table with columns: branch (file:line) | reason it's missed | suggested test name. Don't write the tests yet; just list the gaps so I can prioritise.

10. Fakes over mocks

Replace these mocks of `{interface}` with an in-memory FAKE. The fake should implement the same interface as the real dependency, store state, and let tests assert on that state. Mocks should remain only for I/O at the system boundary (network / DB driver). Show the fake class and 2 example tests that use it.

Variables to swap: interface (e.g., UserRepository, EmailSender)
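
To make the mock/fake difference concrete, here is a minimal sketch of a fake for the `EmailSender` example. The interface shape, `FakeEmailSender`, and `sendWelcomeEmail` are all hypothetical, and the interface is synchronous only to keep the sketch short; a real sender would most likely return a Promise.

```typescript
// Hypothetical interface that the real dependency would also implement.
interface EmailSender {
  send(to: string, subject: string, body: string): void;
}

interface SentEmail {
  to: string;
  subject: string;
  body: string;
}

// In-memory fake: behaves like a tiny real sender and keeps state tests can inspect.
class FakeEmailSender implements EmailSender {
  readonly outbox: SentEmail[] = [];

  send(to: string, subject: string, body: string): void {
    if (to.indexOf("@") === -1) {
      throw new Error(`invalid recipient: ${to}`);
    }
    this.outbox.push({ to, subject, body });
  }
}

// Hypothetical code under test.
function sendWelcomeEmail(sender: EmailSender, user: { name: string; email: string }): void {
  sender.send(user.email, `Welcome, ${user.name}!`, "Thanks for signing up.");
}

// Example test 1: assert on the fake's state, not on how `send` was called.
const sender = new FakeEmailSender();
sendWelcomeEmail(sender, { name: "Ada", email: "ada@example.com" });
const firstEmail = sender.outbox[0];

// Example test 2: a bad recipient is rejected and nothing lands in the outbox.
const rejectingSender = new FakeEmailSender();
let invalidRecipientRejected = false;
try {
  sendWelcomeEmail(rejectingSender, { name: "Bob", email: "not-an-address" });
} catch (err) {
  invalidRecipientRejected = true;
}
```

The assertions read `sender.outbox`, so the tests keep passing if `sendWelcomeEmail` is refactored to call `send` through a helper or a queue, as long as the same email ends up sent.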

11. Property-based test scaffolding

For the pure function `{functionName}`, write property-based tests using {framework}. Identify 3 invariants (e.g., idempotence, commutativity, round-trip), and for each, write one property. Use shrinking-friendly generators. Don't test concrete examples — those belong in unit tests.

Variables to swap: functionName, framework: fast-check / hypothesis / proptest

12. Test naming + organisation

Reorganise these tests so each `describe` block represents one BEHAVIOUR of `{functionName}`, and each `it` reads as a sentence: `it("returns 0 when input is empty")`. Group setup with `beforeEach` only when ≥ 2 tests share it. Flag tests where the name doesn't match what the body actually asserts.

Variables to swap: functionName

13. Snapshot strategy

Audit these snapshot tests: (1) classify each snapshot as locking behaviour (good) or capturing volatile output such as random values, dates, or IDs (bad); (2) replace bad snapshots with targeted assertions; (3) for React components, snapshot the rendered text + ARIA roles, not the full HTML tree. Output a diff plan.

14. Test-first scaffold from a spec

Here is the spec for `{functionName}`:

{specText}

Write the tests FIRST (TDD style), as a single Jest file, before any implementation. Each test corresponds to one bullet in the spec. Mark unimplemented behaviour with `it.todo`. Do not write any implementation.

Variables to swap: functionName, specText

Common mistakes

  • Asking “write tests for this file” — no boundaries named, you get happy-path only.
  • Letting AI mock everything — you end up testing the mocks, not the code.
  • Snapshot-everything — flaky tests and zero diagnostic signal when they fail.
  • Letting AI derive the expected values from the current implementation instead of from a spec or known-good output: it bakes the bug in as expected behaviour.
  • Naming tests “test1, test2” — when one fails the report tells you nothing.
  • Generating 50 tests at once — many will be duplicates; ask for boundaries-first then happy-path.
  • Skipping error-path tests — those are exactly the paths AI-written code most often gets wrong.

How to push results further

  • Run boundary-first, then happy-path, as two separate prompts: each pass stays focused and you get fewer duplicate tests.
  • For a regression test, paste the failing input verbatim — AI fabricates inputs that don’t actually trigger the bug.
  • Demand // given / when / then comments — they force AI to articulate intent, surfacing mismatches.
  • After AI writes a regression test, run it against the OLD code first. If it passes there too, the test does not reproduce the bug.
  • Use real-world fixture data, not "foo" strings — exposes more bugs.
  • For React: ask for ARIA-role / text assertions, not className / DOM structure.
  • Pair test generation with git diff paste — AI tests the changed code, not the whole file.

FAQ

  • Can I trust AI-generated tests as my coverage? No, treat them as a draft. Always run bug-fix tests once against the pre-fix code; if they pass there, they’re not testing what you think.
  • How many tests is “enough”? Cover boundaries, error paths, and one happy-path test per public behaviour. Stop when new tests no longer add coverage on the changed lines.
  • Should I let AI generate tests on uncovered legacy code? Only after you read the code. Otherwise AI will treat existing buggy behaviour as “intended”.
  • Why are AI tests so heavy on mocks? It is the model’s default habit, most likely picked up from training data. Override it with template 3 and template 10 to force fakes / real values.
  • How do I avoid flakiness from time / random / network? Inject these as dependencies and fake them in tests. Template 7 covers timers explicitly.
  • Does this work for Python / Go too? Yes. Swap Jest for pytest / go test; the structure is the same.

Tags: #Prompt #Coding #Testing #Unit test