A 1-2 star review is rarely about what it says on the surface. A user writes “crashes on iPhone 12” when the root cause is a login-flow regression. These 15 prompts cluster negative reviews by root cause (not by topic), tie them to product areas, separate one-time bug bursts from chronic patterns, and produce a fix-priority list that maps to your sprint. Designed for mobile app teams managing rating velocity.
Who this is for
Mobile app PMs, support leads at app studios, growth teams tracking rating velocity, and founders trying to recover from a bad release.
When not to use these prompts
Skip for new apps under 50 reviews — read each one. Skip too for one-off troll reviews; those need moderation, not analysis.
Prompt anatomy / structure formula
A negative-review analysis prompt should always carry six elements:
- Role: who the AI plays (senior PM / solo founder / product designer / indie dev / growth lead).
- Context: stage (idea / MVP / growth / scale), team size, traffic or ARR, platform (web / iOS / Android), audience, constraints.
- Goal: one concrete deliverable — one PRD section, one user-story set, one experiment design, one launch post.
- Constraints: timeline (this sprint / this quarter), scope cuts, must-not-break (existing flows, billing, compliance).
- Output format: table, checklist, ticket-ready JSON, or labeled blocks you can paste straight into Linear / Notion / Jira.
- Examples / signal: 1-2 reference docs or competitors you like, plus 1 anti-example you want to avoid.
Best for
- Post-release rating-drop investigation
- Quarterly rating-velocity review
- Pre-launch risk assessment
- Roadmap input from real-user pain
- Critical-bug burn-down prioritization
15 copy-ready prompt templates
1. Root-cause cluster (not topic cluster)
The core template. Forces causal grouping, not surface-word grouping.
You are a product analyst. Below are {N} 1-2 star reviews of {app}. Cluster by ROOT CAUSE, not by topic. The same root cause may manifest as different complaints; the same complaint may have different root causes. For each cluster: count, hypothesized root cause, 3 representative verbatim quotes, suggested verification (logs, code area, recent release).
Reviews: {paste}
Variables to swap: N, app, reviews (pasted in place of {paste})
Optimization: If clusters look like topic clusters, add: “Each cluster name must be a hypothesis built around a verb (‘login flow regressed after auth refactor’), not a noun phrase (‘login issues’).”
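If you fill this template from a review export rather than by hand, a minimal Python sketch could look like the following. The function name `build_root_cause_prompt` and the choice to number the reviews are assumptions for illustration; the template text follows the one above.

```python
# Template 1 from this article, with its {N}, {app} and {paste} placeholders.
ROOT_CAUSE_TEMPLATE = (
    "You are a product analyst. Below are {N} 1-2 star reviews of {app}. "
    "Cluster by ROOT CAUSE, not by topic. Same root cause may manifest as "
    "different complaints; same complaint may have different root causes. "
    "For each cluster: count, hypothesized root cause, 3 representative "
    "verbatim quotes, suggested verification (logs, code area, recent release).\n"
    "Reviews: {paste}"
)


def build_root_cause_prompt(app: str, reviews: list[str]) -> str:
    """Fill the template; N is derived from the list so it cannot drift."""
    # Numbering each review is an assumption: it makes it easier to ask the
    # model which reviews it placed in which cluster.
    numbered = "\n".join(
        f"{i}. {text.strip()}" for i, text in enumerate(reviews, start=1)
    )
    return ROOT_CAUSE_TEMPLATE.format(N=len(reviews), app=app, paste=numbered)


if __name__ == "__main__":
    print(build_root_cause_prompt("DemoApp", ["crashes on iPhone 12", "cannot log in"]))
```

Deriving N from the list avoids the common mismatch where the prompt claims 200 reviews but only 120 fit in the paste.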
2. Release-impact correlation
Below are 1-2 star reviews for the last 90 days, with timestamps. Map them to our recent releases ({list with dates}). For each release: review count spike, dominant complaint, hypothesized regression. Identify any release that triggered a sustained spike.
Reviews: {paste}
Releases: {paste}
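Before asking the model to correlate reviews with releases, it can help to pre-count reviews per release yourself so you can sanity-check any spike it reports. The sketch below assigns each review to the most recent release on or before its date. The function name, the "before first release" label, and the sample versions and dates are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date


def count_reviews_per_release(review_dates, releases):
    """Count reviews under the latest release shipped on or before each review.

    releases: list of (version, release_date) tuples, in any order.
    """
    ordered = sorted(releases, key=lambda r: r[1])
    release_dates = [d for _, d in ordered]
    counts = Counter()
    for review_date in review_dates:
        # Index of the last release whose date is <= review_date, or -1.
        idx = bisect_right(release_dates, review_date) - 1
        version = ordered[idx][0] if idx >= 0 else "before first release"
        counts[version] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    releases = [("4.2.0", date(2024, 4, 15)), ("4.1.0", date(2024, 3, 1))]
    reviews = [date(2024, 2, 20), date(2024, 3, 1), date(2024, 4, 16), date(2024, 4, 20)]
    print(count_reviews_per_release(reviews, releases))
```

This attributes a review to whatever release was current on the review date, which is only a proxy: users on old versions still leave reviews, so treat the counts as a hint rather than proof of a regression.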
3. Crash vs feature-miss vs UX-friction split
Classify each of these 1-2 star reviews into: crash / data-loss, missing feature, UX friction, pricing complaint, support complaint, abuse / spam. For each bucket, count and % of total. Output a 6-row table with examples per bucket.
Reviews: {paste}
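Models sometimes miscount when they build the 6-row table, so it is worth recomputing counts and percentages from the per-review labels. A small sketch, assuming you have asked the model to return one bucket label per review; the function name `bucket_shares` is hypothetical.

```python
from collections import Counter

# The six buckets named in the prompt above.
BUCKETS = [
    "crash / data-loss",
    "missing feature",
    "UX friction",
    "pricing complaint",
    "support complaint",
    "abuse / spam",
]


def bucket_shares(labels):
    """Return (bucket, count, percent) rows in the fixed bucket order."""
    counts = Counter(labels)
    unknown = set(counts) - set(BUCKETS)
    if unknown:
        # A label outside the six buckets usually means the model drifted.
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    return [
        (bucket, counts[bucket], round(100 * counts[bucket] / total, 1) if total else 0.0)
        for bucket in BUCKETS
    ]


if __name__ == "__main__":
    sample = ["UX friction"] * 3 + ["crash / data-loss"]
    for row in bucket_shares(sample):
        print(row)
```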
4. Persona × root-cause matrix
Below are reviews tagged with inferred persona (free / paid / new / power user). Cluster by root cause, then show distribution across personas. Highlight any root cause that disproportionately affects paid users — those move revenue.
Reviews: {paste}
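One way to make "disproportionately affects paid users" concrete is to compare the paid share inside each cluster with the paid share across all reviews. The sketch below computes that ratio; the name `persona_lift`, the ratio itself, and the sample rows are assumptions, not part of the prompt.

```python
from collections import Counter


def persona_lift(rows, persona="paid"):
    """rows: list of (root_cause, persona) pairs.

    Returns {root_cause: lift}, where lift is the persona's share inside the
    cluster divided by its share across all rows. A lift above 1 means the
    cluster skews toward that persona.
    """
    total = len(rows)
    persona_total = sum(1 for _, p in rows if p == persona)
    if total == 0 or persona_total == 0:
        return {}
    overall_share = persona_total / total
    cluster_sizes = Counter(cause for cause, _ in rows)
    cluster_hits = Counter(cause for cause, p in rows if p == persona)
    return {
        cause: (cluster_hits[cause] / size) / overall_share
        for cause, size in cluster_sizes.items()
    }


if __name__ == "__main__":
    # Made-up sample: sync failures skew paid, onboarding confusion skews free.
    rows = (
        [("sync fails", "paid")] * 3
        + [("sync fails", "free")]
        + [("onboarding confusing", "free")] * 5
        + [("onboarding confusing", "paid")]
    )
    for cause, lift in persona_lift(rows).items():
        print(f"{cause}: lift {lift:.2f}")
```

With small clusters the ratio is noisy, so read it alongside the raw counts.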
5. “Story behind the rating” reconstruction
For each of these 5 representative reviews, reconstruct the likely user story: what they were trying to do, where it broke, what they tried next, what made them rate 1 star. Mark each step with confidence level. This becomes empathy fuel for the team.
Reviews: {paste}
6. Severity scoring
For each root-cause cluster, score severity on 4 axes: (1) frequency of occurrence, (2) impact when it occurs (annoyance / blocker / data loss), (3) user segment affected, (4) recoverability. Output a 4-column severity table.
Clusters: {paste}
7. Fix-priority list (sprint-ready)
From this analysis of 1-2 star reviews, produce the 5 fixes most likely to lift the rating in 8 weeks. For each: estimated effort, expected rating impact, dependencies, success metric. Mark any "fix" that is actually a comms issue (not a real bug).
Analysis: {paste}
8. False-claim filter
Some of these reviews report bugs that are not real bugs (user error, feature exists). For each review: classify as real bug / user error / feature exists / unclear. For "user error" and "feature exists", suggest a help-center or in-product fix.
Reviews: {paste}
9. Rating-velocity dashboard
Design a 6-metric dashboard for rating velocity: avg rating last 7/30/90 days, % of reviews 1-2 star, time-to-respond, %-of-1-2-star with developer reply, % of repeat-complaint themes, post-release rating delta. Define each metric and its alarm threshold.
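Two of these metrics, average rating over the last 7/30/90 days and the share of 1-2 star reviews, can be computed directly from a review export. A minimal sketch, assuming each review is a (date, stars) pair; the function name and sample data are hypothetical.

```python
from datetime import date, timedelta


def window_metrics(reviews, today, days):
    """Average rating and % of 1-2 star reviews over the last `days` days.

    reviews: list of (review_date, stars). Returns None for an empty window.
    """
    cutoff = today - timedelta(days=days)
    stars = [s for d, s in reviews if cutoff < d <= today]
    if not stars:
        return None
    low = sum(1 for s in stars if s <= 2)
    return {
        "n": len(stars),
        "avg_rating": round(sum(stars) / len(stars), 2),
        "pct_low": round(100 * low / len(stars), 1),
    }


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    today = date(2024, 6, 30)
    reviews = [
        (date(2024, 6, 29), 1),
        (date(2024, 6, 25), 5),
        (date(2024, 6, 10), 2),
        (date(2024, 4, 15), 4),
    ]
    for days in (7, 30, 90):
        print(days, window_metrics(reviews, today, days))
```

Reporting `n` next to each average matters: a 7-day average built on two reviews should not trigger an alarm by itself.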
10. Chronic vs spike pattern detector
Below are 1-2 star reviews for the last 12 months. For each root cause cluster, classify as: chronic (consistent monthly), spike (concentrated weeks), seasonal (returns periodically). Recommend different response strategies for each pattern.
Reviews: {paste}
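To cross-check the model's chronic/spike labels, a rough heuristic over monthly counts per cluster may be enough. The thresholds below (half of all reviews in one month means spike, nine or more active months means chronic) are assumptions chosen for illustration, and seasonal patterns are deliberately left for manual review.

```python
def classify_pattern(monthly_counts, spike_share=0.5, chronic_months=9):
    """Label a cluster from its 12 monthly review counts.

    spike_share and chronic_months are illustrative thresholds, not
    established cutoffs; tune them against your own history.
    """
    total = sum(monthly_counts)
    if total == 0:
        return "no data"
    # One month holding most of the reviews points at a single event.
    if max(monthly_counts) / total >= spike_share:
        return "spike"
    # Complaints present in most months point at a standing problem.
    active_months = sum(1 for c in monthly_counts if c > 0)
    if active_months >= chronic_months:
        return "chronic"
    # Includes possible seasonal patterns, which need a human look.
    return "review manually"


if __name__ == "__main__":
    print(classify_pattern([4, 5, 3, 4, 6, 5, 4, 3, 5, 4, 5, 4]))
    print(classify_pattern([0, 0, 1, 30, 5, 0, 0, 0, 0, 1, 0, 0]))
    print(classify_pattern([0, 0, 6, 0, 0, 5, 0, 0, 6, 0, 0, 5]))
```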
11. Localization-skewed pain detection
Cluster these 1-2 star reviews by language / locale. For each locale: top 3 complaints. Highlight any locale where the dominant complaint is different from the global pattern — likely a localization or regional issue.
Reviews: {paste}
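Once each review carries a locale and a complaint label, flagging locales that diverge from the global pattern is a short computation. A sketch, assuming (locale, complaint) pairs as input; the function name and the sample labels are hypothetical.

```python
from collections import Counter, defaultdict


def locales_off_pattern(tagged):
    """tagged: list of (locale, complaint) pairs.

    Returns {locale: top_complaint} for every locale whose most frequent
    complaint differs from the most frequent complaint overall.
    """
    if not tagged:
        return {}
    global_top = Counter(c for _, c in tagged).most_common(1)[0][0]
    by_locale = defaultdict(Counter)
    for locale, complaint in tagged:
        by_locale[locale][complaint] += 1
    flagged = {}
    for locale, counts in by_locale.items():
        local_top = counts.most_common(1)[0][0]
        if local_top != global_top:
            flagged[locale] = local_top
    return flagged


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    tagged = [
        ("en-US", "login"), ("en-US", "login"), ("en-US", "crash"),
        ("de-DE", "login"), ("de-DE", "login"),
        ("ja-JP", "untranslated"), ("ja-JP", "untranslated"), ("ja-JP", "login"),
    ]
    print(locales_off_pattern(tagged))
```

Ties are broken by first occurrence here, so a locale with only a handful of reviews can flip on a single review; check the counts before acting.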
12. Competitor-trigger detection
Scan these 1-2 star reviews for mentions of competitor apps or "{competitor} is better at X". List each mention with context. Output: which competitors users compare us to, on what dimensions, with what frequency. This becomes positioning input.
Reviews: {paste}
13. Update-broke-things review pattern
Identify reviews complaining that an update made things worse. For each: which feature/flow they say regressed, when they noticed, whether they say they would downgrade if they could. Group by version. Recommend whether to roll back or fix forward.
Reviews: {paste}
14. Recovery-action checklist per cluster
For each root cause cluster from this analysis, produce a recovery checklist: (1) immediate fix, (2) prevention work, (3) user comms (review reply template, in-app message, email), (4) PR risk level, (5) owner. Output as a per-cluster card.
Clusters: {paste}
15. Quarterly rating retrospective
Write a quarterly retrospective: starting and ending rating, dominant 1-2 star themes per month, what we fixed, what we missed, what changed in rating velocity. End with 3 thematic bets for next quarter and 1 metric that will show whether they succeeded.
Quarter data: {paste}
Common mistakes
- Clustering by topic (“login problems”) instead of root cause (“auth refactor broke OAuth refresh on iOS 17”).
- Mistaking a review spike caused by one release for a chronic problem.
- Treating user-error reports as bugs without verification.
- Ignoring localization-skewed patterns hidden in global counts.
- Acting on a single passionate 1-star review instead of the cluster.
- Fixing the dominant complaint without checking whether it comes from a vocal minority rather than most affected users.
- Skipping comms recovery — the fix matters but the public reply matters too.
How to push results further
- Always pair review analysis with release-date mapping — most rating drops trace to a specific release.
- Cluster by root cause, not topic; this is the single biggest lever.
- Cross-reference review themes with support tickets — convergence increases confidence.
- Tag every cluster with severity AND frequency; both matter for prioritization.
- Compare global vs locale-specific patterns; regional issues hide in global averages.
- Reply publicly to a representative review per cluster as soon as a fix ships — see how to reply to App Store reviews with AI for a 2-minute-per-reply workflow that scales the public-comms part.
- Track rating velocity weekly during recovery, monthly when stable.
FAQ
- How many reviews do I need to cluster? A minimum of 50 for meaningful root-cause clusters. With under 50, read each one individually.
- How do I tell a bug spike from a UX problem? Bug spikes correlate with release dates; UX problems persist across releases. Use template 2 to map reviews to releases.
- Should I act on a single passionate review? Only if it describes a clear bug others might hit silently. Otherwise wait for the cluster.
- Can AI predict which fix will lift the rating most? It can estimate, but the right metric is post-fix rating velocity in the 4 weeks after release. Verify, do not assume.
- What if reviews contradict each other? Contradiction usually means a polarizing feature or segment-specific issue. Use template 4 (persona matrix) to disentangle.