E2E Test Plan Prompts: 13 Templates for Playwright / Cypress

Turn flaky, screenshot-heavy e2e suites into a small, fast, deterministic plan. 13 prompt templates for selectors, fixtures, auth, flakes, and PR coverage.

Most e2e suites die from rot, not bugs — flaky selectors, brittle waits, login that times out under CI load. A good e2e-plan prompt picks the right user journeys (NOT every page), names selector strategy (role / test-id), and forbids the things that cause flake (sleep, networkidle on dynamic pages).

Who this is for

Frontend leads choosing a Playwright / Cypress strategy, QA engineers writing test plans before implementation, and indie devs running Claude Code over an existing flaky suite.

When not to use these prompts

Don’t use these to test internal component logic — that’s unit / component-test territory. Don’t use them on flows that change weekly — the test cost outweighs the value.

Prompt anatomy / structure formula

Every e2e plan prompt should carry six elements:

  • Role: who the AI plays (release captain / QA lead / SRE / staff engineer).
  • Context: repo / framework / runtime / branch / diff / failing logs.
  • Goal: one concrete deliverable — checklist, plan, test file, review notes, root cause, ticket list.
  • Constraints: what the AI MUST NOT do (don’t auto-fix, don’t silently rewrite, don’t guess versions).
  • Output format: numbered findings, markdown table, JSON schema, unified diff, or runnable code.
  • Examples / signal: 1-2 examples of “good” output, or what bad output looks like.
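The six elements can be treated as fields of a small data structure, which makes it harder to forget one. The following is a minimal sketch, not a prescribed format: the `PromptParts` interface, the `buildPrompt` function, and all sample values are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape: one field per element in the list above.
interface PromptParts {
  role: string;
  context: string;
  goal: string;
  constraints: string[];
  outputFormat: string;
  examples: string[];
}

// Join the six elements into one prompt, always in the same order.
function buildPrompt(parts: PromptParts): string {
  const lines: string[] = [
    `You are a ${parts.role}.`,
    `Context: ${parts.context}`,
    `Goal: ${parts.goal}`,
    "Constraints:",
    ...parts.constraints.map((c) => `- ${c}`),
    `Output format: ${parts.outputFormat}`,
  ];
  // The examples section is optional, so skip it when the list is empty.
  if (parts.examples.length > 0) {
    lines.push("Examples:", ...parts.examples.map((e) => `- ${e}`));
  }
  return lines.join("\n");
}

// Sample values are made up for illustration.
const flakePrompt: string = buildPrompt({
  role: "QA lead",
  context: "Playwright suite, failing logs from the last CI run",
  goal: "a root-cause list, one line per flaky test",
  constraints: ["Don't auto-fix", "Don't guess versions"],
  outputFormat: "markdown table",
  examples: [],
});
```

Keeping constraints as a list rather than free text makes it easy to reuse the same "MUST NOT" rules across several templates.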

Best for

  • Choosing the 5-8 user journeys worth e2e-testing
  • Setting selector / fixture / auth conventions before writing tests
  • Stabilising a flaky suite without throwing it away
  • Adding e2e coverage on a single PR (not full suite)
  • Migrating Cypress → Playwright (or vice versa)

13 copy-ready prompt templates

1. Journey selection

You are a QA lead. Given this app description: {appDescription}, list the 5-8 user journeys worth e2e-testing. For each: (1) one-line user goal, (2) the failure mode that would lose us revenue / users, (3) entry and exit URLs, (4) data setup needed. Stop at 8. Anything else belongs in unit / component tests.

Variables to swap: appDescription
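If the templates are filled by a script rather than by hand, a small helper can swap the `{variable}` placeholders and fail loudly when one is missing. This is a sketch under that assumption; `fillTemplate` is a hypothetical helper, not part of any library.

```typescript
// Replace each {name} placeholder with its value; throw if a value is missing.
// Only brace pairs with no inner spaces are treated as placeholders, so code
// snippets such as `{ name }` or `{ timeout }` inside a template are left alone.
function fillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string): string => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return vars[name] as string;
  });
}

// Sample app description is made up for illustration.
const journeyPrompt: string = fillTemplate(
  "You are a QA lead. Given this app description: {appDescription}, list the 5-8 user journeys worth e2e-testing.",
  { appDescription: "a B2B invoicing app with sign-up, billing, and PDF export" },
);
```

Throwing on a missing variable avoids sending a prompt that still contains a literal `{diff}` or `{flakeLog}`.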

2. Selector strategy

Audit these existing e2e tests for selector strategy. For each test, mark: ROLE (good — `getByRole("button", { name })`), TEST-ID (acceptable — `[data-testid]`), TEXT (acceptable for unique copy), or CSS / XPATH (bad). Replace CSS / XPATH selectors with role / text. Output a diff plan.
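The four audit labels can be approximated with a few pattern checks before handing the harder cases to the model. The sketch below is a rough heuristic, not a parser: `classifySelector` is a hypothetical helper, and it only recognises the call shapes named in its regexes (`getByRole`, `getByTestId`, `getByText`, Cypress `cy.contains`, and `[data-testid]` attribute selectors).

```typescript
type SelectorKind = "ROLE" | "TEST-ID" | "TEXT" | "CSS/XPATH";

// Classify one line of test code by the locator call it contains.
// Anything that is not recognised falls through to CSS/XPATH (an assumption).
function classifySelector(line: string): SelectorKind {
  if (/\bgetByRole\(/.test(line)) return "ROLE";
  if (/\bgetByTestId\(|\[data-testid[=\]]/.test(line)) return "TEST-ID";
  if (/\bgetByText\(|\bcy\.contains\(/.test(line)) return "TEXT";
  return "CSS/XPATH";
}

// Sample lines are made up for illustration.
const sampleLines: string[] = [
  'page.getByRole("button", { name: "Pay" })',
  "page.locator('[data-testid=\"cart\"]')",
  'page.getByText("Order confirmed")',
  'page.locator("div.checkout > button:nth-child(2)")',
];
const audit: string[] = sampleLines.map((l) => `${classifySelector(l)}: ${l}`);
```

A pass like this gives the prompt a pre-labelled list, so the model spends its effort on the replacement plan rather than on counting selectors.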

3. Auth fixture

Design an auth fixture for Playwright that: (1) signs in once per worker, (2) reuses the storage state across tests, (3) skips UI login for tests that don't exercise the login flow itself. Show the fixture code, the playwright.config.ts entry, and one example test consuming it.

4. Network stub strategy

For this test {testName}, decide for each network call: STUB (third-party / brittle / slow), REAL (the system-under-test's own API), or RECORD-REPLAY (rarely-changing reference data). Output a table: endpoint | strategy | reason. Don't stub our own backend except for explicit error-path tests.

Variables to swap: testName
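To show what the requested table might look like, here is a minimal sketch of the STUB / REAL / RECORD-REPLAY decision as code. The host `api.myapp.test`, the reference-data paths, and the helper names are all hypothetical; a real decision usually needs human judgement per endpoint.

```typescript
type Strategy = "STUB" | "REAL" | "RECORD-REPLAY";

interface Decision {
  endpoint: string;
  strategy: Strategy;
  reason: string;
}

// Hypothetical values: replace with your own API host and reference-data paths.
const OWN_HOST = "api.myapp.test";
const REFERENCE_PATHS: string[] = ["/v1/countries", "/v1/currencies"];

function decide(endpoint: string): Decision {
  // "https://host/a/b" splits into ["https:", "", "host", "a", "b"].
  const parts = endpoint.split("/");
  const host = parts[2] ?? "";
  const path = "/" + parts.slice(3).join("/");
  if (REFERENCE_PATHS.some((p) => path.startsWith(p))) {
    return { endpoint, strategy: "RECORD-REPLAY", reason: "rarely-changing reference data" };
  }
  if (host === OWN_HOST) {
    return { endpoint, strategy: "REAL", reason: "system-under-test's own API" };
  }
  return { endpoint, strategy: "STUB", reason: "third-party, brittle or slow" };
}

// Render the decisions in the endpoint | strategy | reason layout.
function toTable(rows: Decision[]): string {
  const header = ["endpoint | strategy | reason", "--- | --- | ---"];
  return [...header, ...rows.map((r) => `${r.endpoint} | ${r.strategy} | ${r.reason}`)].join("\n");
}

const stubTable: string = toTable([
  decide("https://api.myapp.test/v1/orders"),
  decide("https://api.stripe.com/v1/charges"),
  decide("https://api.myapp.test/v1/countries"),
]);
```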

5. Flake taxonomy + fix

Read these flaky test results from the last 7 days: {flakeLog}. Classify each flake as: TIMING (need `await expect(locator).toBeVisible()`, not an arbitrary wait), NETWORK (need stub or retry), STATE (state leaked from a previous test), ENVIRONMENT (CI vs local). For each, write a one-line fix recipe. Don't mask flakes with blanket test retries; fix the root cause.

Variables to swap: flakeLog
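The four-way taxonomy can be sketched as a decision order over a few signals. Everything here is an assumption made for illustration: the `FlakeRecord` shape, the order of the checks, the error substrings treated as network failures, and the choice to default to TIMING.

```typescript
type FlakeKind = "TIMING" | "NETWORK" | "STATE" | "ENVIRONMENT";

// Hypothetical record shape; real flake logs will need their own parsing.
interface FlakeRecord {
  test: string;
  error: string;
  failsLocally: boolean; // does the failure reproduce outside CI?
  failsInIsolation: boolean; // does it reproduce when the test runs alone?
}

// One-line fix recipes, paraphrasing the template's categories.
const FIX_RECIPE: Record<FlakeKind, string> = {
  TIMING: "replace the arbitrary wait with a toBeVisible() assertion",
  NETWORK: "stub or retry the failing call",
  STATE: "isolate test data so nothing leaks from the previous test",
  ENVIRONMENT: "align CI and local environments",
};

function classifyFlake(r: FlakeRecord): FlakeKind {
  if (!r.failsLocally) return "ENVIRONMENT"; // only ever red on CI
  if (!r.failsInIsolation) return "STATE"; // only red after other tests
  if (/net::ERR_|ECONNRESET|ETIMEDOUT/.test(r.error)) return "NETWORK";
  return "TIMING"; // assumed default when nothing else matches
}

// Sample record is made up for illustration.
const sampleFlake: FlakeRecord = {
  test: "checkout shows confirmation",
  error: "net::ERR_CONNECTION_RESET at https://api.stripe.com",
  failsLocally: true,
  failsInIsolation: true,
};
const sampleKind: FlakeKind = classifyFlake(sampleFlake);
const sampleRecipe: string = FIX_RECIPE[sampleKind];
```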

6. PR-scoped e2e coverage

For this diff {diff}, decide whether new e2e tests are needed. Criteria: changes touch a critical journey (template 1) AND change observable user behaviour. If yes, draft 1-3 e2e test outlines (not full code). If no, say "unit / component test is sufficient" and stop.

Variables to swap: diff

7. Mobile + responsive coverage

Add mobile coverage to this Playwright config. (1) Add 1 mobile project (Pixel 5) and 1 small-screen Chromium. (2) Pick 2 journeys from the existing suite to run on mobile (sign-up, checkout). (3) Use `test.use({ viewport })` for per-test overrides. Don't run the full suite on mobile.
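A config fragment along the following lines is roughly what this prompt should produce. It is a sketch under assumptions: the project names, the `@mobile` tag, and the small-screen viewport size are hypothetical, while `defineConfig`, `devices["Pixel 5"]`, and the per-project `grep` option come from `@playwright/test`.

```typescript
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    // Full suite on desktop Chromium.
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    // Mobile emulation; only tests whose title contains @mobile run here.
    {
      name: "mobile-pixel-5",
      use: { ...devices["Pixel 5"] },
      grep: /@mobile/,
    },
    // Small-screen Chromium; the viewport size is an assumed value.
    {
      name: "chromium-small",
      use: { ...devices["Desktop Chrome"], viewport: { width: 1024, height: 600 } },
      grep: /@mobile/,
    },
  ],
});
```

With this layout, adding `@mobile` to the titles of the sign-up and checkout tests is enough to opt them in, and the rest of the suite stays desktop-only.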

8. Hermetic test data

For this test {testName}, propose a data-setup strategy that doesn't depend on prod state: (1) create user via API not UI, (2) seed needed records via fixture, (3) clean up in `afterEach` even if the test fails. Show one example using API factories.

Variables to swap: testName

9. CI sharding plan

Our Playwright suite takes 25 minutes. Design a sharding plan to bring it under 7 minutes on 4 shards: (1) Group tests by file (default) or by tag, (2) Avoid auth-state contention across shards, (3) Don't shard the smoke subset. Output the workflow YAML diff.
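A quick arithmetic check shows how tight the 7-minute target is. The sketch below assumes tests split evenly across shards, which real suites rarely do, and uses an assumed 0.5-minute setup cost per shard; both helper names are hypothetical. The `--shard=i/n` form is the argument Playwright's CLI accepts.

```typescript
// Wall-clock minutes per shard, assuming a perfectly even split.
function estimateShardMinutes(totalMinutes: number, shards: number, overheadMinutes: number): number {
  return totalMinutes / shards + overheadMinutes;
}

// One CLI argument per shard, e.g. `npx playwright test --shard=1/4`.
function shardArgs(shards: number): string[] {
  const args: string[] = [];
  for (let i = 1; i <= shards; i++) {
    args.push(`--shard=${i}/${shards}`);
  }
  return args;
}

// 25 / 4 = 6.25 minutes of test time per shard, plus the assumed 0.5 of setup.
const perShardMinutes: number = estimateShardMinutes(25, 4, 0.5);
const underBudget: boolean = perShardMinutes < 7;
const fourShards: string[] = shardArgs(4);
```

The even split leaves only 0.75 minutes of headroom per shard for install and browser setup, so an unbalanced shard can easily miss the target.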

10. Visual regression scope

Pick the 3 screens worth visual-regression-snapshotting: (1) homepage hero, (2) the one screen where layout drift hurts conversions most, (3) any screen with a CSS variable change in the diff. Don't snapshot pages that change with real data (lists, dashboards). Output the test stubs.

11. Accessibility checks in e2e

Add `@axe-core/playwright` to 3 critical pages: home, sign-up, checkout. For each: assert no violations with impact `serious` or `critical`. Allow `moderate` for now with a ticket comment. Don't fail on `minor`; it is too noisy. Show the helper and the 3 tests.
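The severity rule in this prompt can be sketched as a pure triage function over an axe-style result. It assumes each violation carries an `id` and an `impact` of `minor`, `moderate`, `serious`, or `critical` (or `null`), as axe-core results do; the `triage` helper and the sample data are hypothetical, and the impacts in the sample are illustrative rather than the real ratings of those rules.

```typescript
type Impact = "minor" | "moderate" | "serious" | "critical";

// Minimal subset of an axe violation entry, enough for triage.
interface Violation {
  id: string;
  impact: Impact | null;
}

interface Triage {
  blocking: Violation[]; // fail the test
  ticketed: Violation[]; // allowed for now, needs a ticket
}

function triage(violations: Violation[]): Triage {
  return {
    blocking: violations.filter((v) => v.impact === "serious" || v.impact === "critical"),
    ticketed: violations.filter((v) => v.impact === "moderate"),
    // "minor" and null impacts are deliberately ignored as too noisy.
  };
}

// Sample data is made up for illustration.
const sampleTriage: Triage = triage([
  { id: "color-contrast", impact: "serious" },
  { id: "region", impact: "moderate" },
  { id: "image-alt", impact: "critical" },
  { id: "tabindex", impact: "minor" },
]);
```

In a test, the assertion would then be that `blocking` is empty, with the ids in `ticketed` logged so they can be tracked.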

12. Cypress → Playwright migration

I have {nTests} Cypress tests. Migrate them to Playwright in priority order: (1) journeys from template 1 first, (2) high-flake tests next (rewrite, don't port), (3) low-value tests last (consider deleting). Output a migration tracker with status per test.

Variables to swap: nTests

13. Test-plan markdown for stakeholders

Turn the full e2e plan into a 1-page markdown doc for non-engineers: (1) Which journeys are tested, (2) Roughly which devices / browsers, (3) Approximate run time, (4) What we deliberately DON'T test (and why). Plain English. No "Playwright" / "fixtures" jargon.

Common mistakes

  • Trying to e2e-test every page: this leads to a 2-hour suite that everyone skips.
  • Using CSS / XPath selectors: a markup refactor breaks the selector lookup, so the test fails before it reaches the assertion.
  • Sleeping with `await page.waitForTimeout(2000)` to “fix” flake: this masks the root cause.
  • Logging in through the UI in every test: slow and brittle.
  • Stubbing the system-under-test’s own API: tests pass when prod is broken.
  • Snapshotting full pages: every dynamic value causes false failures.
  • Sharing test users across tests: this causes order-dependent failures.

How to push results further

  • Cap e2e at 5-8 journeys. Move everything else to component / unit.
  • Use `getByRole` with an accessible name. Selectors stay valid through refactors and improve a11y at the same time.
  • Use `await expect(locator).toBeVisible({ timeout })` instead of `waitForTimeout`.
  • Sign in once per worker via API; reuse storage state.
  • Stub third-party APIs (Stripe / OAuth / analytics) and let your own APIs run for real.
  • Tag flaky tests with `@flaky`, run them in a separate workflow, and fix or delete them weekly.
  • Run smoke (3 tests) on every PR; full suite on main / nightly.

FAQ

  • How many e2e tests are too many? When the suite takes more than 10 minutes on a 4-core runner, or when devs start ignoring failures. Trim back to journeys.
  • Cypress or Playwright in 2026? Playwright, unless you’re heavily invested in Cypress. Multi-tab, mobile emulation, and parallelism are cleaner.
  • Should e2e block merge? The smoke subset, yes; the full suite, no, because it is too slow. Run the full suite on main and revert on red.
  • Can AI write the tests too? Yes for the skeleton, no for the data setup: AI invents seed data that doesn’t exist.
  • How do I deal with auth on third-party SSO? Use a programmatic token endpoint or a storage-state fixture. Never log in through Google’s real UI in CI.
  • Do I need visual regression? For 3 screens, yes. Beyond that, the false-failure cost exceeds the bug-catching value.
