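E2E Test Plan Prompts: 13 Playwright / Cypress Templates

Turn a brittle, screenshot-flooded e2e suite into a small, fast, deterministic plan. 13 prompt templates covering selectors, fixtures, login, flake, and PR-level coverage.

Most e2e suites don't die from bugs; they die from rot: brittle selectors, unreliable waits, login timeouts under CI. A good e2e plan prompt has to pick the right user journeys (not test every page), specify a selector strategy (role / test-id), and ban the patterns that cause flake (sleep, networkidle on dynamic pages).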

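Who this is for

Front-end leads choosing a Playwright / Cypress strategy, QA writing a test plan before implementation, and solo developers using Claude Code to take over a brittle suite.

When not to write prompts this way

Don't use e2e to test component internals; that is the territory of unit / component tests. Don't e2e flows that change every week; the cost far outweighs the benefit.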

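Prompt structure formula

Every e2e plan prompt should carry these six elements:

  • Role: who the AI plays (Release Captain / QA Lead / SRE / staff engineer).
  • Context: repo / framework / runtime / branch / diff / failure logs.
  • Goal: one concrete deliverable, such as a checklist, plan, test file, review notes, root cause, or ticket list.
  • Constraints: what the AI must not do (no auto-fixing, no silent rewrites, no guessing version numbers).
  • Output format: numbered list, markdown table, JSON schema, unified diff, or directly runnable code.
  • Examples / signals: 1-2 samples of "good output", or a description of what bad output looks like.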
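
The six elements can be assembled mechanically. A minimal sketch in TypeScript, assuming a hypothetical `PromptParts` shape and `buildPrompt` helper (neither name comes from any library); the section labels it emits are one possible layout, not a required one:

```typescript
// Hypothetical shape holding the six elements of an e2e plan prompt.
interface PromptParts {
  role: string;          // who the AI plays
  context: string;       // repo / framework / diff / failure logs
  goal: string;          // the single concrete deliverable
  constraints: string[]; // what the AI must not do
  outputFormat: string;  // e.g. "markdown table"
  examples: string[];    // 1-2 good (or bad) output samples; may be empty
}

// Joins the six elements into one prompt, one section per element.
function buildPrompt(p: PromptParts): string {
  const sections: string[] = [
    "You are a " + p.role + ".",
    "Context: " + p.context,
    "Goal: " + p.goal,
    "Constraints:\n" + p.constraints.map((c) => "- " + c).join("\n"),
    "Output format: " + p.outputFormat,
  ];
  // The examples section is skipped when no samples are supplied.
  if (p.examples.length > 0) {
    sections.push("Examples:\n" + p.examples.map((e) => "- " + e).join("\n"));
  }
  return sections.join("\n\n");
}

const prompt = buildPrompt({
  role: "QA lead",
  context: "Playwright suite, failing on CI only",
  goal: "a root-cause note for each flaky test",
  constraints: ["do not auto-fix", "do not guess version numbers"],
  outputFormat: "numbered list",
  examples: [],
});
console.log(prompt);
```

Keeping the elements as separate fields makes a missing one obvious at review time, which is harder to spot in a free-form paragraph.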

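Where these prompts fit

  • Picking the 5-8 user journeys worth e2e-testing
  • Settling selector / fixture / login conventions before writing tests
  • Stabilizing a suite without rewriting it
  • E2E coverage scoped to a single PR (without running the full suite)
  • Cypress → Playwright migration (or the reverse)

13 copy-ready prompt templates

1. Journey selection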

You are a QA lead. Given this app description: {appDescription}, list the 5-8 user journeys worth e2e-testing. For each: (1) one-line user goal, (2) the failure mode that would lose us revenue / users, (3) entry and exit URLs, (4) data setup needed. Stop at 8. Anything else belongs in unit / component tests.

Replaceable variables: appDescription (a one-line description of the app)
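
The `{variable}` placeholders in these templates can be substituted with a few lines of code. A minimal sketch, assuming a hypothetical `fillTemplate` helper; it only matches braces with no inner spaces, so code-like fragments such as `{ name }` or `{ viewport }` in the templates are left alone:

```typescript
// Replaces {name} placeholders; throws if a variable has no value,
// so a half-filled prompt never reaches the model.
function fillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error("Missing template variable: " + name);
    }
    return vars[name];
  });
}

const filled = fillTemplate(
  "You are a QA lead. Given this app description: {appDescription}, list the 5-8 user journeys worth e2e-testing.",
  { appDescription: "a B2B invoicing app with Stripe checkout" }
);
console.log(filled);
```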

2. Selector strategy

Audit these existing e2e tests for selector strategy. For each test, mark: ROLE (good — `getByRole("button", { name })`), TEST-ID (acceptable — `[data-testid]`), TEXT (acceptable for unique copy), or CSS / XPATH (bad). Replace CSS / XPATH selectors with role / text. Output a diff plan.
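
To sanity-check the audit the model returns, the same categories can be approximated with a rough classifier. This is a heuristic sketch, not a parser: `classifySelector` is a hypothetical helper, the regular expressions are assumptions, and treating `data-cy` / `data-test` attributes as TEST-ID is a choice made here rather than something the template states.

```typescript
type SelectorKind = "ROLE" | "TEST-ID" | "TEXT" | "XPATH" | "CSS";

// Classifies one selector expression copied from a test file.
// Checks run from most specific to least; anything unmatched is CSS.
function classifySelector(expr: string): SelectorKind {
  if (/getByRole\(|findByRole\(/.test(expr)) return "ROLE";
  // XPath: an explicit xpath= prefix, or a string argument starting with //
  if (/xpath=|\(\s*["'`]\/\//.test(expr)) return "XPATH";
  if (/getByTestId\(|data-testid|data-test|data-cy/.test(expr)) return "TEST-ID";
  if (/getByText\(|cy\.contains\(/.test(expr)) return "TEXT";
  return "CSS";
}

const samples = [
  'page.getByRole("button", { name: "Save" })',
  "page.locator('[data-testid=\"save\"]')",
  'page.getByText("Welcome back")',
  'page.locator("//div[@id=\'row\']/button")',
  'page.locator(".btn.btn-primary > span")',
];
for (const s of samples) {
  console.log(classifySelector(s) + "  " + s);
}
```

Anything reported as CSS or XPATH is a candidate for the diff plan; the other three categories can stay.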

3. Login fixture

Design an auth fixture for Playwright that: (1) signs in once per worker, (2) reuses the storage state across tests, (3) skips UI login for tests that don't exercise the login flow itself. Show the fixture code, the playwright.config.ts entry, and one example test consuming it.

4. Network stub strategy

For this test {testName}, decide for each network call: STUB (third-party / brittle / slow), REAL (the system-under-test's own API), or RECORD-REPLAY (rarely-changing reference data). Output a table: endpoint | strategy | reason. Don't stub our own backend except for explicit error-path tests.

Replaceable variables: testName
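
The decision rule in template 4 is simple enough to encode as a baseline to compare the model's table against. A sketch under stated assumptions: `decide` and `toRow` are hypothetical helpers, the hosts and URL prefixes are made-up sample data, and real projects will have exceptions (such as the explicit error-path tests the template mentions) that this rule does not cover.

```typescript
type NetStrategy = "REAL" | "STUB" | "RECORD-REPLAY";

interface Decision {
  endpoint: string;
  strategy: NetStrategy;
  reason: string;
}

// referencePrefixes: URL prefixes known to serve rarely-changing data.
// ownHosts: hostnames that belong to the system under test.
function decide(endpoint: string, ownHosts: string[], referencePrefixes: string[]): Decision {
  for (const prefix of referencePrefixes) {
    if (endpoint.indexOf(prefix) === 0) {
      return { endpoint, strategy: "RECORD-REPLAY", reason: "rarely-changing reference data" };
    }
  }
  const host = new URL(endpoint).hostname;
  if (ownHosts.indexOf(host) !== -1) {
    return { endpoint, strategy: "REAL", reason: "system under test's own API" };
  }
  // Everything that is neither ours nor reference data is third-party.
  return { endpoint, strategy: "STUB", reason: "third-party call" };
}

// Formats one row of the "endpoint | strategy | reason" table.
function toRow(d: Decision): string {
  return d.endpoint + " | " + d.strategy + " | " + d.reason;
}

const ownHosts = ["api.shop.example"];
const referencePrefixes = ["https://api.shop.example/v1/countries"];
const rows = [
  "https://api.shop.example/v1/orders",
  "https://api.shop.example/v1/countries?lang=en",
  "https://api.stripe.com/v1/charges",
].map((url) => toRow(decide(url, ownHosts, referencePrefixes)));
console.log(rows.join("\n"));
```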

5. Flake classification + fixes

Read these flaky test results from the last 7 days: {flakeLog}. Classify each flake as: TIMING (needs a web-first assertion such as `expect(locator).toBeVisible()`, not an arbitrary wait), NETWORK (needs a stub or retry), STATE (test leak from a previous test), ENVIRONMENT (CI vs local). For each, write a one-line fix recipe. Don't patch with retries; patch the root cause.

Replaceable variables: flakeLog (flake log from the last 7 days)

6. PR-level e2e coverage

For this diff {diff}, decide whether new e2e tests are needed. Criteria: changes touch a critical journey (template 1) AND change observable user behaviour. If yes, draft 1-3 e2e test outlines (not full code). If no, say "unit / component test is sufficient" and stop.

Replaceable variables: diff

7. Mobile / responsive

Add mobile coverage to this Playwright config. (1) Add 1 mobile project (Pixel 5) and 1 small-screen Chromium. (2) Pick 2 journeys from the existing suite to run on mobile (sign-up, checkout). (3) Use `test.use({ viewport })` for per-test overrides. Don't run the full suite on mobile.
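
As a reference point for reviewing what template 7 returns, a config along these lines would satisfy its first two requirements. This is a sketch, not the only valid answer: the `@mobile` tag, the project names, and the 375×667 viewport are assumptions chosen for illustration; `devices`, `grep`, and `viewport` are standard Playwright Test options.

```typescript
// playwright.config.ts (fragment)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Full suite runs on desktop only.
    { name: 'desktop-chromium', use: { ...devices['Desktop Chrome'] } },

    // Mobile project: only tests whose title contains @mobile.
    {
      name: 'mobile-pixel5',
      use: { ...devices['Pixel 5'] },
      grep: /@mobile/,
    },

    // Small-screen Chromium: desktop browser, phone-sized viewport.
    {
      name: 'small-chromium',
      use: { ...devices['Desktop Chrome'], viewport: { width: 375, height: 667 } },
      grep: /@mobile/,
    },
  ],
});
```

With this layout, the two chosen journeys opt in by carrying `@mobile` in their test titles, and the rest of the suite never runs on the mobile projects.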

8. Isolated test data

For this test {testName}, propose a data-setup strategy that doesn't depend on prod state: (1) create user via API not UI, (2) seed needed records via fixture, (3) clean up in `afterEach` even if the test fails. Show one example using API factories.

Replaceable variables: testName

9. CI sharding plan

Our Playwright suite takes 25 minutes. Design a sharding plan to bring it under 7 minutes on 4 shards: (1) Group tests by file (default) or by tag, (2) Avoid auth-state contention across shards, (3) Don't shard the smoke subset. Output the workflow YAML diff.
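
Before asking for the YAML, it helps to check that the target is reachable at all: 25 minutes over 4 shards is 6.25 minutes each if the split is perfectly even, so there is little slack under 7. The sketch below estimates the split with a greedy longest-first assignment. `planShards` is a hypothetical helper and the per-file durations are invented sample data; Playwright's own `--shard=x/y` flag does the real splitting, so treat this purely as a feasibility estimate.

```typescript
interface TestFile {
  file: string;
  minutes: number;
}

// Greedy balance: take files longest-first and always add the next
// one to the shard with the smallest running total.
function planShards(files: TestFile[], shardCount: number): TestFile[][] {
  const shards: TestFile[][] = [];
  const totals: number[] = [];
  for (let i = 0; i < shardCount; i++) {
    shards.push([]);
    totals.push(0);
  }
  const sorted = files.slice().sort((a, b) => b.minutes - a.minutes);
  for (const f of sorted) {
    let lightest = 0;
    for (let i = 1; i < shardCount; i++) {
      if (totals[i] < totals[lightest]) lightest = i;
    }
    shards[lightest].push(f);
    totals[lightest] += f.minutes;
  }
  return shards;
}

function shardMinutes(shard: TestFile[]): number {
  return shard.reduce((sum, f) => sum + f.minutes, 0);
}

// Invented durations that add up to the 25-minute suite.
const durations = [4, 3.5, 3.5, 3, 3, 2.5, 2.5, 1.5, 1, 0.5];
const files: TestFile[] = durations.map((m, i) => ({ file: "spec-" + (i + 1) + ".ts", minutes: m }));
const plan = planShards(files, 4);
const perShard = plan.map(shardMinutes);
console.log(perShard);
```

The estimate ignores per-shard overhead such as dependency install and browser download, which comes straight out of the remaining margin.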

10. Visual regression scope

Pick the 3 screens worth visual-regression-snapshotting: (1) homepage hero, (2) the one screen where layout drift hurts conversions most, (3) any screen with a CSS variable change in the diff. Don't snapshot pages that change with real data (lists, dashboards). Output the test stubs.

11. Accessibility checks in e2e

Add `@axe-core/playwright` to 3 critical pages: home, sign-up, checkout. For each: assert no violations with impact `serious` or `critical`. Allow `moderate` for now with a ticket comment. Don't fail on `minor`; it is too noisy. Show the helper and the 3 tests.
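
The severity policy in this template reduces to a filter over the `violations` array that axe returns, where each entry carries an `impact` of `minor`, `moderate`, `serious`, or `critical`. A minimal sketch of that filter, with a hypothetical `partitionViolations` helper and hand-written sample data; in a real test the array would come from `new AxeBuilder({ page }).analyze()`.

```typescript
type Impact = "minor" | "moderate" | "serious" | "critical";

// Only the two fields this policy needs from an axe violation entry.
interface Violation {
  id: string;
  impact: Impact | null;
}

// blocking: fails the test. ticketed: allowed for now, needs a ticket.
// minor (and entries with no impact) are dropped as too noisy.
function partitionViolations(violations: Violation[]) {
  const blocking = violations.filter((v) => v.impact === "serious" || v.impact === "critical");
  const ticketed = violations.filter((v) => v.impact === "moderate");
  return { blocking, ticketed };
}

// Sample data for illustration only.
const sample: Violation[] = [
  { id: "image-alt", impact: "critical" },
  { id: "color-contrast", impact: "serious" },
  { id: "region", impact: "moderate" },
  { id: "example-minor-rule", impact: "minor" },
];
const result = partitionViolations(sample);
console.log(result.blocking.map((v) => v.id), result.ticketed.map((v) => v.id));
```

In the test itself, the assertion would then be that `blocking` is empty, while `ticketed` is logged rather than asserted.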

12. Cypress → Playwright migration

I have {nTests} Cypress tests. Migrate them to Playwright in priority order: (1) journeys from template 1 first, (2) high-flake tests next (rewrite, don't port), (3) low-value tests last (consider deleting). Output a migration tracker with status per test.

Replaceable variables: nTests (number of tests)

13. A test plan for non-engineers

Turn the full e2e plan into a 1-page markdown doc for non-engineers: (1) Which journeys are tested, (2) Roughly which devices / browsers, (3) Approximate run time, (4) What we deliberately DON'T test (and why). Plain English. No "Playwright" / "fixtures" jargon.

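Common pitfalls

  • Trying to e2e every page: the suite takes 2 hours and nobody wants to run it.
  • Using CSS / XPath selectors: they break on the first refactor.
  • Fixing flake with waitForTimeout(2000): it only covers up the root cause.
  • Logging in through the UI in every test: slow and brittle.
  • Stubbing your own backend API: production goes down and the tests stay green.
  • Full-page snapshots: dynamic values cause false failures every day.
  • Sharing one user across tests: creates execution-order dependencies.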
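
Several of these pitfalls leave a textual fingerprint, so a crude scan can flag them before review. The sketch below is a heuristic, not a linter: `findFlakePatterns` and the rule list are hypothetical, and the regular expressions will miss aliased calls and can misfire on unusual strings.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
}

// Patterns for three of the pitfalls above. No /g flag, so .test() is stateless.
const BANNED: { rule: string; pattern: RegExp }[] = [
  { rule: "fixed sleep", pattern: /waitForTimeout\(|cy\.wait\(\s*\d/ },
  { rule: "networkidle wait", pattern: /networkidle/ },
  { rule: "XPath selector", pattern: /xpath=|["'`]\/\// },
];

function findFlakePatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    BANNED.forEach((b) => {
      if (b.pattern.test(text)) findings.push({ line: i + 1, rule: b.rule });
    });
  });
  return findings;
}

const testSource = [
  'await page.goto("/checkout");',
  "await page.waitForTimeout(2000);",
  'await page.waitForLoadState("networkidle");',
  'await page.getByRole("button", { name: "Pay" }).click();',
].join("\n");
const findings = findFlakePatterns(testSource);
console.log(findings);
```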

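Optimization tips

  • Limit e2e to 5-8 journeys; everything else goes to unit / component tests.
  • getByRole + accessible name: survives refactors and improves a11y as a side effect.
  • Use expect(locator).toBeVisible({ timeout }) instead of waitForTimeout.
  • Log in once per worker via the API and reuse the storage state.
  • Stub third-party APIs (Stripe / OAuth / analytics); let your own API through.
  • Tag tests @flaky, run them in a separate pipeline, and fix or delete them every week.
  • Run smoke (3 tests) on every PR; run the full suite only on main / nightly.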

FAQ

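  • How many e2e tests is too many?: When the suite runs longer than 10 minutes on a 4-core runner, or developers start ignoring failures, you have too many. Go back to the journeys.
  • Cypress or Playwright in 2026?: With no legacy baggage, go straight to Playwright: multi-tab support, mobile emulation, and parallelism are cleaner.
  • Should e2e gate merges?: The smoke subset should; the full suite should not, because it is too slow. Run the full suite on the main branch and roll back immediately when it goes red.
  • Can AI write the tests?: The skeleton, yes; the data setup, no. The AI will invent seed data that doesn't exist.
  • What about third-party SSO login?: Use a token endpoint or a storage-state fixture; never drive Google's real UI in CI.
  • Is visual regression mandatory?: 3 key screens are worth it; beyond that the cost outweighs the benefit.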

Tags: #Prompt #Programming #Testing #E2E #Playwright