Unit Test Generation Prompts: 14 Templates That Catch Real Bugs

Stop asking AI to 'write tests for this.' 14 copy-ready unit-test prompts for boundaries, error paths, mocks vs fakes, parameterized suites, and regression locks — tuned for Vitest 4, Jest 30, pytest 9.

Published: May 19, 2026 Updated: Jun 14, 2026 Author: AI Productivity Guide Team

“Write unit tests for this file” is the laziest test-gen prompt, and it shows: you get five happy-path assertions, no error branches, and mocks for everything that should not be mocked. A good unit-test prompt names the boundaries (null / empty / max / negative), forbids over-mocking, and demands at least one regression test that locks behaviour, not just shape.

TL;DR

Run boundaries FIRST as a separate prompt, then happy-path second. Splitting the passes reliably surfaces more real bugs than one “write all the tests” request.
Force mock discipline: mock only I/O at the system boundary (network, filesystem, clock, randomness, DB driver); use real values or in-memory fakes for everything inside your package.
Anchor every regression test to a verbatim failing input. AI fabricates inputs that don’t actually reproduce the bug.
Always run an AI-written test against the OLD code once. If it passes there too, the test is wrong.
These templates are framework-agnostic. Defaults below target Vitest 4 / Jest 30 (TS/JS) and pytest 9 (Python); swap the runner name and the rest holds.

Who this is for

Developers shipping under deadline who need real coverage on new modules, tech leads enforcing test discipline, engineers backfilling tests on legacy code, and indie devs running Claude Code or Cursor in agentic mode. In the June 2026 coding evals from independent reviewers, Claude Code scored highest for test generation, with Cursor the strongest AI-native IDE pick — but both still default to over-mocking, which is exactly what these prompts correct.

When not to use these prompts

Don’t write tests for code you don’t plan to maintain (throwaway scripts) — the upkeep outweighs the value. And don’t ask AI to test code you don’t yet understand; you will cement existing bugs as “intended behaviour” the moment the test goes green.

Pick your runner first (June 2026)

The prompts work with any runner, but the version notes change a few flags. Quick reality check before you generate:

| Runner | Current version | Best for | Note for prompting |
| --- | --- | --- | --- |
| Vitest | 4.1 (Mar 2026) | New TS/JS + ESM projects, Vite apps | Browser Mode is stable in v4; watch re-runs are typically much faster than Jest on the same suite |
| Jest | 30 (Jun 2025) | Legacy CommonJS, large monorepos, React Native | CJS-first; ESM still behind experimental flags. Tell AI your module system |
| pytest | 9.1 (Jun 2026) | Python | Past the 8.x line; use `monkeypatch` / `pytest-mock`, not raw `unittest.mock` everywhere |

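Vitest's matchers and mock API are largely Jest-compatible, so a Jest prompt usually works for Vitest once jest.fn / jest.mock are replaced with vi.fn / vi.mock. Name the runner explicitly in the prompt so the AI emits the right imports (Vitest needs describe, it, expect, and vi imported from "vitest" unless globals are enabled).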

Prompt anatomy

Every unit-test generation prompt should carry six elements:

Role: who the AI plays (senior test engineer / QA lead / staff engineer).
Context: file, framework, runtime, the function signature, any docstring.
Goal: one concrete deliverable — boundary suite, happy-path suite, regression test, gap list.
Constraints: what AI MUST NOT do (don’t over-mock, don’t invent inputs, don’t write the implementation).
Output format: runnable test file, markdown gap table, or unified diff.
Example / signal: one well-written test from your own codebase, so output matches your naming and assertion style.

That last element does the heavy lifting. Pasting one real test from your repo anchors the AI to your conventions far better than any description.
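One way to keep all six elements present every time is to assemble the prompt from a typed object. This is only a sketch: `TestPromptParts`, `buildTestPrompt`, the file path, and the `cartTotal` function are hypothetical names made up for illustration, not part of any library.

```typescript
// Hypothetical helper: the six fields mirror the prompt anatomy list above.
interface TestPromptParts {
  role: string;
  context: string;
  goal: string;
  constraints: string[]; // things the AI must NOT do
  outputFormat: string;
  exampleTest: string; // one real test pasted from your own repo
}

function buildTestPrompt(parts: TestPromptParts): string {
  const constraintLines = parts.constraints.map((c) => `- ${c}`).join("\n");
  return [
    `Role: ${parts.role}`,
    `Context: ${parts.context}`,
    `Goal: ${parts.goal}`,
    `Constraints (the AI MUST NOT):\n${constraintLines}`,
    `Output format: ${parts.outputFormat}`,
    `Example test to match in naming and assertion style:\n${parts.exampleTest}`,
  ].join("\n\n");
}

// Illustrative values only; swap in your real file, signature, and test.
const boundaryPrompt = buildTestPrompt({
  role: "senior test engineer",
  context: "src/cart/total.ts, Vitest 4, Node ESM, cartTotal(items: Item[]): number",
  goal: "boundary suite only",
  constraints: ["over-mock", "invent inputs", "write the implementation"],
  outputFormat: "one runnable test file",
  exampleTest: 'it("returns 0 when the cart is empty", () => { expect(cartTotal([])).toBe(0); });',
});

console.log(boundaryPrompt);
```

Because every field is required by the type, a prompt with a missing constraint list or no example test fails at compile time instead of quietly producing weaker output.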

Best for

New module written by AI that needs tests before merge
Adding coverage to a legacy function before a refactor
Locking a bug fix with a regression test
Parameterized table tests across many inputs
Replacing brittle expect(fn).toHaveBeenCalled() assertions with behaviour tests

14 copy-ready prompt templates

Variables in [brackets] are placeholders — replace them with your real values before sending.

1. Boundary-first unit tests

Run this BEFORE asking for any happy-path tests.

You are a senior test engineer. For the function [functionName] in [filePath], write Vitest tests that ONLY cover boundary and error inputs: null, undefined, empty string, empty array, max int, negative, NaN, very long string, malformed input. Do not write any happy-path test in this pass. Each test must have a clear "given / when / then" comment. Use describe.each for tables when inputs share a shape.

Variables to swap: functionName, filePath

Optimization: Add: “If a boundary cannot be reached because the function signature forbids it, list it as an // unreachable: <reason> comment instead of inventing a test.”
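To make "boundary-only pass" concrete, here is a runner-free sketch of what a good result looks like as data: one named row per boundary, and a check that none of them slips through. `parseQuantity` and its 1 to 999 range are hypothetical; in a real Vitest or Jest file the same rows would feed `describe.each` / `it.each` rather than a hand-written loop.

```typescript
// Hypothetical function under test: accepts a whole quantity between 1 and 999.
function parseQuantity(input: unknown): number {
  if (typeof input !== "string" || input.trim() === "") {
    throw new TypeError("quantity must be a non-empty string");
  }
  const value = Number(input);
  if (!Number.isInteger(value)) {
    throw new RangeError("quantity must be a whole number");
  }
  if (value < 1 || value > 999) {
    throw new RangeError("quantity must be between 1 and 999");
  }
  return value;
}

// One named row per boundary, so a failure says which case broke.
const boundaryCases: Array<{ name: string; input: unknown }> = [
  { name: "null", input: null },
  { name: "undefined", input: undefined },
  { name: "empty string", input: "" },
  { name: "whitespace only", input: "   " },
  { name: "not a number", input: "abc" },
  { name: "negative", input: "-1" },
  { name: "just below min", input: "0" },
  { name: "just above max", input: "1000" },
  { name: "fraction", input: "1.5" },
  { name: "very long string", input: "9".repeat(10000) },
];

// given each boundary input / when parsed / then it must throw, never return
const leaked = boundaryCases.filter(({ input }) => {
  try {
    parseQuantity(input);
    return true; // a value came back: the boundary leaked through
  } catch {
    return false;
  }
});

console.log("boundaries that leaked through:", leaked.map((c) => c.name));
```

Note there is no happy-path row in the table. The valid edges ("1" and "999") belong to the second pass.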

2. Happy-path unit tests (second pass)

Now write happy-path Vitest tests for [functionName]. For each public behaviour described in its docstring or surrounding comments, write one test. Use real values, not generic strings like "test". Use it("returns X when Y") naming. Do not duplicate any boundary case already covered.

Variables to swap: functionName

3. Mock discipline rules

Write tests for [functionName]. Mock ONLY: (a) network / fs / time / random, (b) external services. Do NOT mock: (a) pure functions in the same package, (b) data classes, (c) anything you could pass as a real value. If you reach for vi.mock (or jest.mock) on a same-package import, stop and use the real implementation instead. Note in a comment which dependencies you chose to mock and why.

Variables to swap: functionName

Optimization: For Python (pytest 9): replace the mock helpers with monkeypatch / pytest-mock.

4. Regression test from a bug

I just fixed this bug: [bugDescription]. The failing input was: [failingInput]. Write a single regression test named "regression: <bug-title>" that fails on the old code (before fix) and passes on the new code. Add a comment with the bug ticket link and the exact behaviour the test locks.

Variables to swap: bugDescription, failingInput
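The "fails on the old code, passes on the new code" requirement is easy to check mechanically if the regression assertion takes the implementation as a parameter. The sketch below assumes a made-up bug (a price parser that stopped at the thousands separator); `parsePriceOld`, `parsePriceNew`, and the failing input are hypothetical stand-ins for your pre-fix and post-fix code.

```typescript
// Hypothetical pre-fix code: parseFloat stops at the first comma, so "1,299.50" becomes 1.
function parsePriceOld(input: string): number {
  return parseFloat(input);
}

// Hypothetical fix: strip thousands separators before parsing.
function parsePriceNew(input: string): number {
  return parseFloat(input.replace(/,/g, ""));
}

type PriceParser = (input: string) => number;

// regression: thousands separator truncates price
// Uses the failing input verbatim from the (imaginary) bug report.
function regressionThousandsSeparator(parsePrice: PriceParser): boolean {
  return parsePrice("1,299.50") === 1299.5;
}

// A weak test with a fabricated input that never triggered the bug.
function weakCheck(parsePrice: PriceParser): boolean {
  return parsePrice("12.50") === 12.5;
}

console.log("regression on old code:", regressionThousandsSeparator(parsePriceOld)); // false
console.log("regression on new code:", regressionThousandsSeparator(parsePriceNew)); // true
console.log("weak check on old code:", weakCheck(parsePriceOld)); // true, so it locks nothing
```

The weak check is the failure mode to watch for: it is green before and after the fix, which means it would not have caught the bug and will not catch its return.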

5. Parameterized table tests

Convert these example-based [language] tests into a table: describe.each (Vitest / Jest), pytest.mark.parametrize (pytest), or table-driven subtests with t.Run (Go). Group by behaviour. Keep one row per distinct equivalence class; don't list 10 rows that all hit the same branch. Name the rows so a failure tells you which case broke.

Variables to swap: language: TS / Python / Go

6. Behaviour, not implementation

Best after a refactor where internal call sites changed.

Refactor these tests so they assert OUTPUTS / SIDE EFFECTS, not internal calls. Remove any expect(internalHelper).toHaveBeenCalledWith(...) assertions. Replace them with assertions on the function's return value or the externally observable side effect (DB row, emitted event, file written).

7. Async + timer testing

For the async function [functionName] that uses setTimeout / setInterval / Promise.race, write Vitest tests using vi.useFakeTimers() (or jest.useFakeTimers()). Cover: (1) resolves before timeout, (2) rejects on timeout, (3) cleanup on abort, (4) no dangling timers after await. Avoid real sleeps such as await new Promise(r => setTimeout(r, ...)); they make tests slow and flaky.

Variables to swap: functionName

8. Error-shape contract tests

Write tests that pin the ERROR SHAPE of [functionName]. For each documented error path, assert: (a) the thrown class / error code, (b) the message contains a stable phrase callers can match, (c) cause is preserved if wrapping. Errors are part of the API; these tests prevent silent breaking changes.

Variables to swap: functionName
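As a rough picture of what "pinning the error shape" means, the sketch below checks the three things the prompt asks for: a stable code, a stable message phrase, and a preserved underlying error. `ConfigError`, `loadPort`, the error codes, and the `causedBy` field are all hypothetical. The field is named `causedBy` only to keep the sketch independent of compiler target; with an ES2022 target you would more likely pass `{ cause }` to the `Error` constructor.

```typescript
// Hypothetical error type; the name, codes, and messages are illustrative.
class ConfigError extends Error {
  readonly code: string;
  readonly causedBy: unknown;

  constructor(code: string, message: string, causedBy?: unknown) {
    super(message);
    this.name = "ConfigError";
    this.code = code;
    this.causedBy = causedBy;
  }
}

// Hypothetical function under test: reads a numeric port from a JSON config string.
function loadPort(raw: string): number {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    // Wrap the low-level error but keep it reachable for callers.
    throw new ConfigError("CONFIG_INVALID_JSON", "config is not valid JSON", err);
  }
  const port =
    typeof parsed === "object" && parsed !== null
      ? (parsed as { port?: unknown }).port
      : undefined;
  if (typeof port !== "number") {
    throw new ConfigError("CONFIG_MISSING_PORT", "config is missing a numeric port");
  }
  return port;
}

// Small helper so the thrown value can be inspected without a test runner.
function captureError(fn: () => unknown): unknown {
  try {
    fn();
  } catch (err) {
    return err;
  }
  return undefined;
}

const jsonError = captureError(() => loadPort("{not json")) as ConfigError;
console.log(jsonError.code); // (a) stable code callers can switch on
console.log(jsonError.message.includes("not valid JSON")); // (b) stable phrase
console.log(jsonError.causedBy instanceof SyntaxError); // (c) underlying error preserved
```

In a runner, the same three checks become three assertions per documented error path, so renaming a code or dropping the wrapped error shows up as a red test instead of a silent breaking change.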

9. Coverage-gap identifier

Pair with template 1 or 2 once gaps are agreed.

Read this test file and the function under test. Without running coverage, identify uncovered branches by inspection. Return a markdown table: branch (file:line) | reason it's missed | suggested test name. Don't write the tests yet; just list the gaps so I can prioritise.

10. Fakes over mocks

Replace these mocks with an in-memory FAKE of [interface]. The fake should implement the same interface as the real dependency, store state, and let tests assert on that state. Mocks should remain only for I/O at the system boundary (network / DB driver). Show the fake class and 2 example tests that use it.

Variables to swap: interface, e.g. UserRepository, EmailSender
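For reference, this is roughly the shape of output the prompt should produce. It is a minimal sketch: `UserRepository`, `InMemoryUserRepository`, and `registerUser` are hypothetical, and the repository is synchronous for brevity where a real one would usually return promises.

```typescript
interface User {
  id: number;
  email: string;
}

// The interface the real, database-backed repository would also implement.
interface UserRepository {
  findByEmail(email: string): User | undefined;
  save(email: string): User;
}

// In-memory fake: same interface, real state, no mock library involved.
class InMemoryUserRepository implements UserRepository {
  private users: User[] = [];
  private nextId = 1;

  findByEmail(email: string): User | undefined {
    return this.users.find((u) => u.email === email);
  }

  save(email: string): User {
    const user: User = { id: this.nextId++, email };
    this.users.push(user);
    return user;
  }

  // Test-only helper for asserting on stored state.
  count(): number {
    return this.users.length;
  }
}

// Hypothetical function under test.
function registerUser(repo: UserRepository, email: string): User {
  const normalised = email.trim().toLowerCase();
  if (repo.findByEmail(normalised)) {
    throw new Error(`email already registered: ${normalised}`);
  }
  return repo.save(normalised);
}

// Example 1: assert on stored state, not on whether save() was called.
const repo = new InMemoryUserRepository();
const created = registerUser(repo, "  Ada@Example.com ");
console.log(repo.findByEmail("ada@example.com")?.id === created.id);

// Example 2: a duplicate is rejected and the stored state is unchanged.
let rejected = false;
try {
  registerUser(repo, "ada@example.com");
} catch {
  rejected = true;
}
console.log(rejected, repo.count());
```

Neither example mentions how `registerUser` talks to the repository, so both survive a refactor that changes the internal calls but keeps the behaviour.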

11. Property-based test scaffolding

For the pure function [functionName], write property-based tests using [framework]. Identify 3 invariants (e.g. idempotence, commutativity, round-trip), and for each, write one property. Use shrinking-friendly generators. Don't test concrete examples; those belong in unit tests.

Variables to swap: functionName, framework: fast-check / hypothesis / proptest
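If "three invariants, one property each" feels abstract, the hand-rolled sketch below shows the idea without any library. It is deliberately not fast-check: there is no shrinking and the generator is a toy, so treat it as an illustration of what to ask for, and use the framework named in the prompt for real suites. `normalizeWhitespace`, `makeRng`, `randomText`, and `findCounterexample` are hypothetical names.

```typescript
// Tiny deterministic generator so any failure reproduces with the same seed.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

// Random strings of length 0 to 11, biased towards whitespace.
function randomText(rng: () => number): string {
  const alphabet = "ab \t\n";
  const length = Math.floor(rng() * 12);
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(rng() * alphabet.length));
  }
  return out;
}

// Hypothetical pure function under test.
function normalizeWhitespace(s: string): string {
  return s.replace(/\s+/g, " ").trim();
}

// Returns the first counterexample, or undefined if the property held on every run.
function findCounterexample(
  property: (s: string) => boolean,
  runs: number = 500,
): string | undefined {
  const rng = makeRng(42);
  for (let i = 0; i < runs; i++) {
    const input = randomText(rng);
    if (!property(input)) return input;
  }
  return undefined;
}

const invariants: Array<{ label: string; holds: (s: string) => boolean }> = [
  {
    label: "idempotent",
    holds: (s) => normalizeWhitespace(normalizeWhitespace(s)) === normalizeWhitespace(s),
  },
  {
    label: "no leading, trailing, or doubled whitespace",
    holds: (s) => !/^\s|\s$|\s\s/.test(normalizeWhitespace(s)),
  },
  {
    label: "non-whitespace characters preserved in order",
    holds: (s) => normalizeWhitespace(s).replace(/\s/g, "") === s.replace(/\s/g, ""),
  },
];

for (const { label, holds } of invariants) {
  const counterexample = findCounterexample(holds);
  console.log(label, counterexample === undefined ? "held" : JSON.stringify(counterexample));
}
```

Each property states a rule that must hold for any input, which is why none of them mentions a concrete expected string.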

12. Test naming + organisation

Reorganise these tests so each describe block represents one BEHAVIOUR of [functionName], and each it() reads as a sentence: it("returns 0 when input is empty"). Group setup with beforeEach only when 2 or more tests share it. Flag tests where the name doesn't match what the body actually asserts.

13. Snapshot strategy

Audit these snapshot tests: (1) Which snapshots lock behaviour (good) vs random output / dates / IDs (bad)? (2) Replace bad snapshots with targeted assertions. (3) For React components, snapshot the rendered text + ARIA roles, not the full HTML tree. Output a diff plan.

14. Test-first scaffold from a spec

Here is the spec for [functionName]:

[specText]

Write the tests FIRST (TDD style), as a single Vitest file, before any implementation. Each test corresponds to one bullet in the spec. Mark unimplemented behaviour with it.todo. Do not write any implementation.

Variables to swap: functionName, specText

Common mistakes

Asking “write tests for this file” with no boundaries named — you get happy-path only.
Letting AI mock everything — you end up testing the mocks, not the code.
Snapshotting everything — flaky tests and zero diagnostic signal when they fail.
Letting AI write a test AND its assertion in one pass without ground truth — it bakes in the bug as expected.
Naming tests “test1, test2” — when one fails, the report tells you nothing.
Generating 50 tests at once — many will be duplicates; ask for boundaries first, then happy-path.
Skipping error-path tests — those are exactly the paths AI-written code most often gets wrong.

How to push results further

Run boundary-first, then happy-path as two separate prompts; output quality jumps noticeably.
For a regression test, paste the failing input verbatim. AI fabricates inputs that don’t actually trigger the bug.
Demand // given / when / then comments; they force AI to articulate intent, surfacing mismatches.
After AI writes a test, run it against the OLD code first. If it passes there too, the test is wrong.
Use real fixture data, not "foo" strings; it exposes more bugs.
For React, ask for ARIA-role and text assertions, not className or DOM structure.
Pair test generation with a git diff paste so AI tests the changed code, not the whole file.
In agentic tools (Claude Code, Cursor), let the agent run the suite and iterate — but review the assertions yourself, since a passing AI test only proves it matches whatever the code currently does.

FAQ

Can I trust AI-generated tests as my coverage?

No. Treat them as a draft. Independent 2026 reviews rate the strongest test-gen models highly, but they still produce assertions coupled to implementation details. Always run a new test against the old code once; if it passes there too, it isn’t testing what you think.

How many tests is “enough”?

Cover boundaries, error paths, and one happy path per branch. Stop when new tests no longer add coverage on the changed lines (diff coverage).

Should I let AI generate tests on uncovered legacy code?

Only after you have read the code. Otherwise AI treats existing buggy behaviour as “intended”.

Why are AI tests so heavy on mocks?

It’s a training default. Override it with template 3 and template 10 to force fakes and real values.

How do I avoid flakiness from time / random / network?

Inject them as dependencies and fake them in tests. Template 7 covers timers with `vi.useFakeTimers()` / `jest.useFakeTimers()`.

Vitest or Jest for new tests in 2026?

For new TS/JS projects, Vitest 4 is the common default, with native ESM and a much faster watch mode. Keep Jest 30 for legacy CommonJS, big monorepos, or React Native. Either way the prompts above apply; just name the runner.

Does this work for Python and Go?

Yes. Swap the runner for pytest 9 or `go test`; the boundary-first, fakes-over-mocks, regression-lock structure is identical.
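On the flakiness question, "inject them as dependencies" can be as small as passing a clock object instead of calling `Date.now()` inside the function. The sketch below assumes a made-up session rule (30-minute expiry); `Clock`, `FakeClock`, and `isSessionExpired` are hypothetical names, and the same pattern applies to a random source or a network client.

```typescript
// The only thing the code under test needs from "time".
interface Clock {
  now(): number; // milliseconds since the epoch
}

// Test fake: time moves only when the test says so.
class FakeClock implements Clock {
  private current: number;

  constructor(start: number) {
    this.current = start;
  }

  now(): number {
    return this.current;
  }

  advance(ms: number): void {
    this.current += ms;
  }
}

// Hypothetical rule: a session expires 30 minutes after it was issued.
const SESSION_TTL_MS = 30 * 60 * 1000;

function isSessionExpired(issuedAt: number, clock: Clock): boolean {
  return clock.now() - issuedAt >= SESSION_TTL_MS;
}

// Production code would pass a real clock, e.g. { now: () => Date.now() }.
const clock = new FakeClock(1000000);
const issuedAt = clock.now();

clock.advance(SESSION_TTL_MS - 1);
console.log(isSessionExpired(issuedAt, clock)); // false: one millisecond before expiry

clock.advance(1);
console.log(isSessionExpired(issuedAt, clock)); // true: exactly at the boundary
```

No real waiting happens, so the boundary at exactly 30 minutes can be tested to the millisecond and the result is the same on every run.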

Tags: #Prompt #Coding #Testing #Unit test
