Negative Review Analysis Prompts: Root-Cause Clustering Templates

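Prompts for analyzing negative reviews: cluster 1-2 star reviews by root cause, separate symptoms from the real problems, and find the 3 fixes most likely to lift your rating.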

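A 1-2 star review is almost never about what it says on the surface. A user writes "crashes on iPhone 12", and the root cause turns out to be a regression in the login flow. The 15 prompts below cluster reviews by root cause (not by topic), link each cluster to a product module, distinguish one-off bug spikes from chronic patterns, and produce fix priorities aligned to your sprints. They are meant for mobile app teams managing rating velocity.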

Who this is for

Mobile app PMs, support leads at app studios, growth teams that track rating velocity, and founders who need to recover from a bad release.

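When not to use these prompts

Do not use them on a new app with fewer than 50 reviews; read each review yourself instead. Do not use them on one-off troll reviews either: those are a moderation problem, not an analysis problem.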

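Prompt structure formula

A negative review analysis prompt should always carry these six elements:

  • Role: who the AI should act as (senior PM / solo founder / product designer / indie developer / head of growth).
  • Context: stage (idea / MVP / growth / scale), team size, traffic or ARR, platform (web / iOS / Android), audience, constraints.
  • Goal: one concrete deliverable, such as a PRD section, a set of user stories, an experiment design, or a launch announcement.
  • Constraints: timeline (this sprint / this quarter), scope to cut, and things that must not change (existing flows, billing, compliance).
  • Output format: a table, a checklist, ticket-ready JSON, or labeled paragraphs that paste directly into Linear / Notion / Jira.
  • Examples / signals: 1-2 references or competitors you admire, plus 1 counterexample you want to avoid.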

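Where these prompts fit

  • Investigating a rating drop after a release
  • Quarterly rating velocity reviews
  • Pre-launch risk assessment
  • Deriving the roadmap from real user pain points
  • Prioritizing critical bug burn-down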

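15 copy-ready prompt templates

1. Root-cause clustering (not topic clustering)

The core template. It forces grouping by cause and effect rather than by surface wording.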

You are a product analyst. Below are {N} 1-2 star reviews of {app}. Cluster by ROOT CAUSE, not by topic. Same root cause may manifest as different complaints; same complaint may have different root causes. For each cluster: count, hypothesized root cause, 3 representative verbatim, suggested verification (logs, code area, recent release).

Reviews: {paste}

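Replaceable variables: N, reviews, app

Optimization tip: if the clusters come back looking like topic clusters, append: "Each cluster name must be a hypothesis stated as a full clause with a verb ('login flow regressed after auth refactor'), not a noun phrase ('login issues')."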
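A minimal Python sketch of filling template 1 before sending it to a model. The function name `build_root_cause_prompt` and the constant `MIN_REVIEWS` are illustrative assumptions, not part of any library; the 50-review floor is the threshold this article recommends.

```python
# Below this sample size the article recommends reading reviews by hand.
MIN_REVIEWS = 50

# Template 1, with the same {N}, {app} and {paste} placeholders.
ROOT_CAUSE_TEMPLATE = (
    "You are a product analyst. Below are {N} 1-2 star reviews of {app}. "
    "Cluster by ROOT CAUSE, not by topic. Same root cause may manifest as "
    "different complaints; same complaint may have different root causes. "
    "For each cluster: count, hypothesized root cause, 3 representative "
    "verbatim, suggested verification (logs, code area, recent release).\n\n"
    "Reviews: {paste}"
)


def build_root_cause_prompt(app: str, reviews: list[str]) -> str:
    """Fill the template; refuse samples too small to cluster meaningfully."""
    if len(reviews) < MIN_REVIEWS:
        raise ValueError(
            f"only {len(reviews)} reviews; read them one by one instead"
        )
    # Number the reviews so the model can cite them unambiguously.
    numbered = "\n".join(
        f"{i}. {text.strip()}" for i, text in enumerate(reviews, start=1)
    )
    return ROOT_CAUSE_TEMPLATE.format(N=len(reviews), app=app, paste=numbered)
```

Numbering the pasted reviews is a design choice of this sketch: it makes it easier to check that the "representative verbatim" quotes in the answer really exist in the input.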

2. Release impact correlation

Below are 1-2 star reviews for the last 90 days, with timestamps. Map them to our recent releases ({list with dates}). For each release: review count spike, dominant complaint, hypothesized regression. Identify any release that triggered a sustained spike.

Reviews: {paste}
Releases: {paste}
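Before pasting, it can help to pre-compute how many reviews fall after each release, so the model's claimed spikes can be checked against raw counts. The sketch below is one way to do that, under the assumption that a review belongs to the latest release on or before its date; users on older versions will be misattributed. `attribute_to_releases` is a hypothetical helper name.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date


def attribute_to_releases(review_dates, releases):
    """Count reviews per release.

    releases: list of (version, release_date) tuples.
    Each review is attributed to the latest release on or before its date.
    """
    ordered = sorted(releases, key=lambda r: r[1])
    release_dates = [d for _, d in ordered]
    counts = Counter()
    for d in review_dates:
        # Index of the last release whose date is <= the review date.
        idx = bisect_right(release_dates, d) - 1
        version = ordered[idx][0] if idx >= 0 else "before first release"
        counts[version] += 1
    return counts
```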

3. Crash / missing feature / UX friction bucketing

Classify each of these 1-2 star reviews into: crash / data-loss, missing feature, UX friction, pricing complaint, support complaint, abuse / spam. For each bucket, count and % of total. Output a 6-row table with examples per bucket.

Reviews: {paste}
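Counts and percentages in a model's answer are worth recomputing from the per-review labels. A small sketch, assuming you have asked the model to return one bucket label per review; `bucket_table` is an illustrative name and the bucket strings simply mirror the prompt above.

```python
from collections import Counter

# The six buckets named in the prompt above.
BUCKETS = [
    "crash / data-loss",
    "missing feature",
    "UX friction",
    "pricing complaint",
    "support complaint",
    "abuse / spam",
]


def bucket_table(labels):
    """Return one (bucket, count, percent) row per bucket, in fixed order."""
    counts = Counter(labels)
    unknown = set(counts) - set(BUCKETS)
    if unknown:
        # The model invented a bucket; surface it instead of hiding it.
        raise ValueError(f"unexpected buckets: {sorted(unknown)}")
    total = len(labels)
    return [
        (b, counts[b], round(100 * counts[b] / total, 1) if total else 0.0)
        for b in BUCKETS
    ]
```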

4. Persona × root-cause matrix

Below are reviews tagged with inferred persona (free / paid / new / power user). Cluster by root cause, then show distribution across personas. Highlight any root cause that disproportionately affects paid users — those move revenue.

Reviews: {paste}

5. Reconstructing "the story behind the rating"

For each of these 5 representative reviews, reconstruct the likely user story: what they were trying to do, where it broke, what they tried next, what made them rate 1 star. Mark each step with confidence level. This becomes empathy fuel for the team.

Reviews: {paste}

6. Severity scoring

For each root-cause cluster, score severity on 4 axes: (1) frequency of occurrence, (2) impact when it occurs (annoyance / blocker / data loss), (3) user segment affected, (4) recoverability. Output a 4-column severity table.

Clusters: {paste}
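If you want a single number for ranking clusters, the four axes can be combined. The sketch below is only one possible weighting: the impact scale, the use of paid-user share as the "segment" axis, and the doubling for unrecoverable problems are all assumptions made for illustration, not values from this article.

```python
from dataclasses import dataclass

# Assumed ordinal scale for the impact axis.
IMPACT = {"annoyance": 1, "blocker": 2, "data loss": 3}


@dataclass
class ClusterSeverity:
    name: str
    frequency: int      # number of reviews in the cluster
    impact: str         # "annoyance", "blocker" or "data loss"
    paid_share: float   # fraction of affected users who pay (0.0 to 1.0)
    recoverable: bool   # can the user get back to a working state?

    def score(self) -> float:
        """Frequency times an assumed weight built from the other three axes."""
        weight = IMPACT[self.impact] * (1 + self.paid_share)
        weight *= 1 if self.recoverable else 2
        return self.frequency * weight


def rank(clusters):
    """Highest score first."""
    return sorted(clusters, key=lambda c: c.score(), reverse=True)
```

A rare data-loss cluster can outrank a frequent annoyance under this weighting, which is the point of scoring more than frequency alone.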

7. Fix prioritization (sprint-ready)

From this analysis of 1-2 star reviews, produce the 5 fixes most likely to lift the rating in 8 weeks. For each: estimated effort, expected rating impact, dependencies, success metric. Mark any "fix" that is actually a comms issue (not a real bug).

Analysis: {paste}

8. False-positive filtering

Some of these reviews report bugs that are not real bugs (user error, feature exists). For each review: classify as real bug / user error / feature exists / unclear. For "user error" and "feature exists", suggest a help-center or in-product fix.

Reviews: {paste}

9. Rating velocity dashboard

Design a 6-metric dashboard for rating velocity: avg rating last 7/30/90 days, % of reviews 1-2 star, time-to-respond, %-of-1-2-star with developer reply, % of repeat-complaint themes, post-release rating delta. Define each metric and its alarm threshold.
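Three of these metrics can be computed directly from an export of reviews. A minimal sketch, assuming each review carries a date, a star rating, and a flag for whether the developer replied; the `Review` class and function names are hypothetical, and alarm thresholds are left out because they depend on your app.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Review:
    day: date
    stars: int
    has_reply: bool = False


def window(reviews, today, days):
    """Reviews from the last `days` days, up to and including `today`."""
    start = today - timedelta(days=days)
    return [r for r in reviews if start < r.day <= today]


def avg_rating(reviews):
    """Mean star rating, or None when there are no reviews."""
    return sum(r.stars for r in reviews) / len(reviews) if reviews else None


def low_star_share(reviews):
    """Percent of reviews that are 1-2 star."""
    if not reviews:
        return None
    return 100 * sum(r.stars <= 2 for r in reviews) / len(reviews)


def low_star_reply_rate(reviews):
    """Percent of 1-2 star reviews that have a developer reply."""
    low = [r for r in reviews if r.stars <= 2]
    if not low:
        return None
    return 100 * sum(r.has_reply for r in low) / len(low)
```

Calling `avg_rating(window(reviews, today, n))` for n in 7, 30 and 90 gives the three rolling averages named in the prompt.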

10. Chronic vs. spike patterns

Below are 1-2 star reviews for the last 12 months. For each root cause cluster, classify as: chronic (consistent monthly), spike (concentrated weeks), seasonal (returns periodically). Recommend different response strategies for each pattern.

Reviews: {paste}
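A rough heuristic for sanity-checking the model's chronic / spike labels against monthly counts per cluster. The thresholds (60% of volume in two adjacent months for a spike, complaints present in 75% of months for chronic) are assumptions chosen for illustration, and the sketch works at month granularity even though real spikes may be concentrated in weeks. Anything that fits neither rule is returned as "seasonal or unclear" for a human to inspect.

```python
def classify_pattern(monthly_counts, spike_share=0.6, chronic_coverage=0.75):
    """Label a cluster from its per-month review counts (at least 2 months)."""
    total = sum(monthly_counts)
    if total == 0:
        return "no data"
    # Largest share of volume held by any two adjacent months.
    best_pair = max(
        monthly_counts[i] + monthly_counts[i + 1]
        for i in range(len(monthly_counts) - 1)
    )
    if best_pair / total >= spike_share:
        return "spike"
    # Fraction of months in which the complaint appears at all.
    active = sum(1 for c in monthly_counts if c > 0)
    if active / len(monthly_counts) >= chronic_coverage:
        return "chronic"
    return "seasonal or unclear"
```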

11. Localization skew detection

Cluster these 1-2 star reviews by language / locale. For each locale: top 3 complaints. Highlight any locale where the dominant complaint is different from the global pattern — likely a localization or regional issue.

Reviews: {paste}
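Once each review has a locale and a complaint label, the comparison against the global pattern is a few lines of counting. A sketch assuming `(locale, complaint)` pairs as input; `locale_skew` is an illustrative name, and ties for the top complaint are broken by first appearance.

```python
from collections import Counter, defaultdict


def locale_skew(tagged):
    """Return {locale: top complaint} for locales whose top complaint
    differs from the global top complaint.

    tagged: iterable of (locale, complaint) pairs.
    """
    tagged = list(tagged)
    if not tagged:
        return {}
    global_top = Counter(c for _, c in tagged).most_common(1)[0][0]
    per_locale = defaultdict(Counter)
    for locale, complaint in tagged:
        per_locale[locale][complaint] += 1
    skewed = {}
    for locale, counts in per_locale.items():
        top = counts.most_common(1)[0][0]
        if top != global_top:
            skewed[locale] = top
    return skewed
```

Small locales will flip on a handful of reviews, so it is worth looking at the per-locale counts before treating a difference as a regional issue.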

12. Competitor trigger detection

Scan these 1-2 star reviews for mentions of competitor apps or "{competitor} is better at X". List each mention with context. Output: which competitors users compare us to, on what dimensions, with what frequency. This becomes positioning input.

Reviews: {paste}

13. "The update broke it" pattern

Identify reviews complaining that an update made things worse. For each: which feature/flow they say regressed, when they noticed, whether they say they would downgrade if they could. Group by version. Recommend whether to roll back or roll forward with a fix.

Reviews: {paste}

14. Per-cluster recovery checklist

For each root cause cluster from this analysis, produce a recovery checklist: (1) immediate fix, (2) prevention work, (3) user comms (review reply template, in-app message, email), (4) PR risk level, (5) owner. Output as a per-cluster card.

Clusters: {paste}

15. Quarterly rating retrospective

Write a quarterly retrospective: starting and ending rating, dominant 1-2 star themes per month, what we fixed, what we missed, what changed in rating velocity. End with 3 thematic bets for next quarter and 1 metric to declare them successful.

Quarter data: {paste}

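Common pitfalls

  • Clustering by topic ("login issues") instead of root cause ("OAuth refresh fails on iOS 17 after the auth refactor").
  • Treating a spike caused by a single release as a chronic problem.
  • Handling user error as a bug without verifying it first.
  • Overlooking localization skew hidden inside global counts.
  • Acting on one heated 1-star review while ignoring the cluster as a whole.
  • Fixing the complaint of the loudest minority without checking whether it represents the majority.
  • Fixing without communicating: the fix matters, and so does the public reply.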

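Optimization tips

  • Always pair review analysis with a release-date mapping: most rating drops can be traced to a specific release.
  • Cluster by root cause, not by topic. This is the biggest lever.
  • Cross-check against support tickets; agreement between the two sources raises confidence.
  • Tag every cluster with both severity and frequency; the two together determine priority.
  • Compare global against per-locale results; regional problems hide behind global averages.
  • Right after a fix ships, pick one representative review per cluster and reply publicly. The approach for replying to App Store reviews with AI, at about 2 minutes per reply, is one way to get public communication running.
  • Track rating velocity weekly during recovery, then switch to monthly once it stabilizes.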

FAQ

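  • How many reviews do I need before clustering?: Clustering only becomes meaningful from about 50 reviews. Below 50, read each one by hand.
  • How do I tell a bug spike from a UX problem?: Bug spikes correlate with release dates; UX problems persist across versions. Use template 2 for the mapping.
  • Is a single heated review worth acting on?: Only when it describes a clear bug that other users are probably hitting silently. Otherwise wait for a cluster to form.
  • Can AI predict which fix will lift the rating most?: It can estimate, but the real measure is rating velocity in the 4 weeks after the fix. Verify, don't assume.
  • What if reviews contradict each other?: Contradiction usually signals a polarizing feature or a problem specific to one segment. Use template 4 to split them apart.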

Tags: #Prompt #Product Startups #App Store #App Review