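1-2 star reviews are almost never about what they appear to be about. A user writes "crashes on iPhone 12", and the root cause turns out to be a regression in the login flow. The 15 prompts below cluster reviews by root cause (not by topic), tie each cluster to a product module, separate one-off bug spikes from chronic patterns, and produce fix priorities that line up with sprints. They are meant for mobile app teams that manage rating velocity.
Who this is for
Mobile app PMs, support leads at app studios, growth teams that watch rating velocity, and founders who need to recover from a bad release.
When not to use these prompts
Don't use them on a new app with fewer than 50 reviews; read each review yourself instead. Don't use them on one-off troll reviews either: those are a moderation problem, not an analysis problem.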
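Prompt structure formula
A negative-review analysis prompt always carries these six elements:
- Role: who the AI plays (senior PM / indie founder / product designer / indie developer / head of growth).
- Context: stage (idea / MVP / growth / scale), team size, traffic or ARR, platform (web / iOS / Android), audience, known limits.
- Goal: one concrete deliverable, such as a PRD section, a set of user stories, an experiment design, or a launch announcement.
- Constraints: timeline (this sprint / this quarter), scope to cut, things that must not change (existing flows, billing, compliance).
- Output format: a table, a checklist, ticket-ready JSON, or labeled paragraphs that paste straight into Linear / Notion / Jira.
- Examples / signals: 1-2 references or competitors you admire, plus 1 counter-example to avoid.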
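The six elements above can be assembled mechanically, which keeps every prompt in a team consistent. The following is a minimal sketch, not a prescribed implementation: `PromptSpec`, `build_prompt`, and the field names are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """The six elements of a review-analysis prompt (field names are illustrative)."""
    role: str
    context: str
    goal: str
    constraints: str
    output_format: str
    examples: str


def build_prompt(spec: PromptSpec, reviews: list[str]) -> str:
    """Join the six elements and the pasted reviews into one prompt string."""
    sections = [
        f"You are {spec.role}.",
        f"Context: {spec.context}",
        f"Goal: {spec.goal}",
        f"Constraints: {spec.constraints}",
        f"Output format: {spec.output_format}",
        f"Examples / signals: {spec.examples}",
        "Reviews:",
        "\n".join(f"- {r}" for r in reviews),
    ]
    return "\n".join(sections)
```

Keeping the elements as separate fields makes it easy to notice when one of them (most often the constraints or the output format) has been left empty.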
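Where these prompts fit
- Investigating a rating drop after a release
- Quarterly rating-velocity review
- Pre-launch risk assessment
- Deriving the roadmap from real user pain points
- Prioritizing the burn-down of critical bugs
15 copy-ready prompt templates
1. Root-cause clustering (not topic clustering)
The core template. It forces grouping by cause rather than by surface wording.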
You are a product analyst. Below are {N} 1-2 star reviews of {app}. Cluster by ROOT CAUSE, not by topic. Same root cause may manifest as different complaints; same complaint may have different root causes. For each cluster: count, hypothesized root cause, 3 representative verbatim quotes, suggested verification (logs, code area, recent release).
Reviews: {paste}
Variables to replace: N, the reviews, the app
Tuning tip: if the clusters still read like topic clusters, append: "Each cluster name must be a hypothesis containing a verb ('login flow regressed after auth refactor'), not a noun phrase ('login issues')."
2. Release impact mapping
Below are 1-2 star reviews for the last 90 days, with timestamps. Map them to our recent releases ({list with dates}). For each release: review count spike, dominant complaint, hypothesized regression. Identify any release that triggered a sustained spike.
Reviews: {paste}
Releases: {paste}
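Before pasting, it can help to pre-compute how many reviews fall under each release, so the model's spike claims can be checked against real counts. This is a sketch under one stated assumption: a review is attributed to the latest release shipped on or before the review's date. The function name and the version strings are illustrative.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date


def assign_to_releases(
    review_dates: list[date], releases: list[tuple[str, date]]
) -> Counter:
    """Count reviews per release.

    Assumption: each review belongs to the latest release shipped
    on or before the review's date.
    """
    ordered = sorted(releases, key=lambda r: r[1])
    ship_dates = [d for _, d in ordered]
    counts: Counter = Counter()
    for day in review_dates:
        idx = bisect_right(ship_dates, day) - 1
        # Reviews older than the first listed release get their own bucket.
        version = ordered[idx][0] if idx >= 0 else "before first release"
        counts[version] += 1
    return counts
```

Attribution by date is only a proxy, since users may still be running an older version when they write the review. If the store export includes the app version, that field is the better key.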
3. Crash / missing feature / UX friction bucketing
Classify each of these 1-2 star reviews into: crash / data-loss, missing feature, UX friction, pricing complaint, support complaint, abuse / spam. For each bucket, count and % of total. Output a 6-row table with examples per bucket.
Reviews: {paste}
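Counts and percentages reported by a model are worth recomputing. The following is a minimal sketch, assuming each review has already been given exactly one of the six bucket labels from the prompt; `bucket_table` and the exact label strings are illustrative choices.

```python
from collections import Counter

BUCKETS = [
    "crash / data-loss",
    "missing feature",
    "UX friction",
    "pricing complaint",
    "support complaint",
    "abuse / spam",
]


def bucket_table(labels: list[str]) -> list[tuple[str, int, float]]:
    """Return one (bucket, count, percent) row per bucket, in fixed order."""
    counts = Counter(labels)
    unknown = set(counts) - set(BUCKETS)
    if unknown:
        # A label outside the six buckets usually means the model drifted.
        raise ValueError(f"unexpected bucket labels: {sorted(unknown)}")
    total = len(labels)
    return [
        (b, counts[b], round(100 * counts[b] / total, 1) if total else 0.0)
        for b in BUCKETS
    ]
```

Rejecting unknown labels instead of silently adding a seventh row keeps the table comparable from one month to the next.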
4. Persona × root-cause matrix
Below are reviews tagged with inferred persona (free / paid / new / power user). Cluster by root cause, then show distribution across personas. Highlight any root cause that disproportionately affects paid users — those move revenue.
Reviews: {paste}
5. Reconstructing the story behind the rating
For each of these 5 representative reviews, reconstruct the likely user story: what they were trying to do, where it broke, what they tried next, what made them rate 1 star. Mark each step with confidence level. This becomes empathy fuel for the team.
Reviews: {paste}
6. Severity scoring
For each root-cause cluster, score severity on 4 axes: (1) frequency of occurrence, (2) impact when it occurs (annoyance / blocker / data loss), (3) user segment affected, (4) recoverability. Output a 4-column severity table.
Clusters: {paste}
7. Fix priority (sprint-ready)
From this analysis of 1-2 star reviews, produce the 5 fixes most likely to lift the rating in 8 weeks. For each: estimated effort, expected rating impact, dependencies, success metric. Mark any "fix" that is actually a comms issue (not a real bug).
Analysis: {paste}
8. False-positive filtering
Some of these reviews report bugs that are not real bugs (user error, feature exists). For each review: classify as real bug / user error / feature exists / unclear. For "user error" and "feature exists", suggest a help-center or in-product fix.
Reviews: {paste}
9. Rating-velocity dashboard
Design a 6-metric dashboard for rating velocity: avg rating last 7/30/90 days, % of reviews 1-2 star, time-to-respond, %-of-1-2-star with developer reply, % of repeat-complaint themes, post-release rating delta. Define each metric and its alarm threshold.
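Three of the six metrics are simple enough to compute directly from a review export. The sketch below assumes a hypothetical `Review` record with a date, a star count, and a flag for whether the developer replied. The window convention (the last `days` days, counting today) is also an assumption, and alarm thresholds are left to the team.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Review:
    day: date
    stars: int
    replied: bool = False


def trailing(reviews, today, days):
    """Reviews from the last `days` days, counting `today` itself."""
    cutoff = today - timedelta(days=days)
    return [r for r in reviews if cutoff < r.day <= today]


def avg_rating(reviews):
    """Mean star rating, or None when there are no reviews."""
    if not reviews:
        return None
    return sum(r.stars for r in reviews) / len(reviews)


def low_star_share(reviews):
    """Percent of reviews that are 1-2 star, or None when empty."""
    if not reviews:
        return None
    return 100 * sum(r.stars <= 2 for r in reviews) / len(reviews)


def low_star_reply_rate(reviews):
    """Percent of 1-2 star reviews that received a developer reply."""
    low = [r for r in reviews if r.stars <= 2]
    if not low:
        return None
    return 100 * sum(r.replied for r in low) / len(low)
```

Calling `avg_rating(trailing(reviews, today, 7))` and repeating it with 30 and 90 gives the three trailing averages the prompt asks for.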
10. Chronic vs. spike patterns
Below are 1-2 star reviews for the last 12 months. For each root cause cluster, classify as: chronic (consistent monthly), spike (concentrated weeks), seasonal (returns periodically). Recommend different response strategies for each pattern.
Reviews: {paste}
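The model's chronic / spike / seasonal labels can be sanity-checked against monthly counts per cluster. The following is a rough heuristic sketch, not an established method: the two-month window, the 60% and 75% thresholds, and the function name are all assumptions chosen for illustration.

```python
def classify_pattern(monthly_counts, spike_share=0.6, chronic_share=0.75):
    """Label one cluster's monthly complaint counts.

    Heuristic (assumed thresholds):
    - spike: one window of two consecutive months holds >= spike_share
      of all complaints
    - chronic: complaints appear in >= chronic_share of the months
    - seasonal: at least two separate bursts of activity
    """
    total = sum(monthly_counts)
    if total == 0:
        return "no data"
    if len(monthly_counts) > 1:
        best_window = max(
            monthly_counts[i] + monthly_counts[i + 1]
            for i in range(len(monthly_counts) - 1)
        )
    else:
        best_window = total
    if best_window / total >= spike_share:
        return "spike"
    active = sum(1 for c in monthly_counts if c > 0)
    if active / len(monthly_counts) >= chronic_share:
        return "chronic"
    # Count runs of consecutive non-zero months.
    bursts = sum(
        1
        for i, c in enumerate(monthly_counts)
        if c > 0 and (i == 0 or monthly_counts[i - 1] == 0)
    )
    return "seasonal" if bursts >= 2 else "unclear"
```

A disagreement between this heuristic and the model's label does not mean either is right; it marks a cluster that deserves a manual look.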
11. Localization skew detection
Cluster these 1-2 star reviews by language / locale. For each locale: top 3 complaints. Highlight any locale where the dominant complaint is different from the global pattern — likely a localization or regional issue.
Reviews: {paste}
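Once reviews carry a locale and a complaint label, the "differs from the global pattern" check is a small counting exercise. This is a minimal sketch, assuming the input is a list of `(locale, complaint)` pairs. The `min_reviews` cutoff is an assumed guard against tiny locales, and the names are illustrative.

```python
from collections import Counter, defaultdict


def locale_skew(tagged, min_reviews=5):
    """Return (global top complaint, {locale: its differing top complaint}).

    Locales with fewer than `min_reviews` reviews are skipped, since their
    top complaint is too noisy to compare.
    """
    if not tagged:
        return None, {}
    global_top = Counter(c for _, c in tagged).most_common(1)[0][0]
    by_locale = defaultdict(Counter)
    for locale, complaint in tagged:
        by_locale[locale][complaint] += 1
    skewed = {}
    for locale, counts in by_locale.items():
        if sum(counts.values()) < min_reviews:
            continue
        top = counts.most_common(1)[0][0]
        if top != global_top:
            skewed[locale] = top
    return global_top, skewed
```

Any locale that shows up in the result is a candidate for the localization or regional issue the prompt asks the model to highlight.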
12. Competitor trigger detection
Scan these 1-2 star reviews for mentions of competitor apps or "{competitor} is better at X". List each mention with context. Output: which competitors users compare us to, on what dimensions, with what frequency. This becomes positioning input.
Reviews: {paste}
13. "Update broke a feature" pattern
Identify reviews complaining that an update made things worse. For each: which feature/flow they say regressed, when they noticed, whether they will downgrade if possible. Group by version. Recommend whether to roll back or fix forward.
Reviews: {paste}
14. Per-cluster recovery checklist
For each root cause cluster from this analysis, produce a recovery checklist: (1) immediate fix, (2) prevention work, (3) user comms (review reply template, in-app message, email), (4) PR risk level, (5) owner. Output as a per-cluster card.
Clusters: {paste}
15. Quarterly rating retrospective
Write a quarterly retrospective: starting and ending rating, dominant 1-2 star themes per month, what we fixed, what we missed, what changed in rating velocity. End with 3 thematic bets for next quarter and 1 metric to declare them successful.
Quarter data: {paste}
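Common pitfalls
- Clustering by topic ("login issues") instead of root cause ("OAuth refresh fails on iOS 17 after the auth refactor").
- Treating a spike caused by a single release as a chronic problem.
- Treating user error as a bug without verifying it.
- Overlooking localization skew hidden inside global counts.
- Acting on one heated 1-star review while ignoring the whole cluster.
- Fixing the loudest minority complaint without checking whether it represents the majority.
- Fixing without communicating. The fix matters, and so does the public reply.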
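Optimization tips
- Always pair review analysis with a release-date mapping; most rating drops can be traced to a specific release.
- Cluster by root cause, not by topic. This is the biggest lever.
- Cross-check against support tickets; agreement between the two sources raises confidence.
- Tag every cluster with both severity and frequency; together they set priority.
- Compare global results against per-locale results; regional problems hide behind global averages.
- Right after a fix ships, pick one representative review per cluster and reply to it publicly. See the method for replying to App Store reviews with AI; at about 2 minutes per reply, it gets public communication moving.
- Track rating velocity weekly during recovery, then switch to monthly once it stabilizes.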
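FAQ
- What is the minimum number of reviews for clustering?: Clustering starts to be meaningful at 50 reviews. Below 50, read each one by hand.
- How do I tell a bug spike from a UX problem?: A bug spike lines up with a release date; a UX problem persists across versions. Use template 2 for the mapping.
- Is a single heated review worth acting on?: Only when it describes a clear bug that other users are probably hitting silently. Otherwise wait for a cluster to form.
- Can AI predict which fix will lift the rating most?: It can estimate, but the real measure is rating velocity in the 4 weeks after the fix. Verify, don't assume.
- What if reviews contradict each other?: Contradictions usually point to a polarizing feature or a problem specific to one segment. Use template 4 to split them apart.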