The task
Mixpanel (or Amplitude, or your warehouse) is showing the retention cohort table: 12 weekly cohorts down, 12 weeks across, color-coded from red to dark green. Your CEO walked past your screen and said “so, is retention getting better?” You have 90 seconds before they ask again, the chart genuinely looks “kind of better in some columns and weirdly bad in row 4,” and you do not want to give the wrong directional answer to a question that defines next quarter’s roadmap. You want a 3-sentence readout that says what is actually happening, plus one outlier worth investigating before someone else builds a strategy on it.
Where AI helps — and where it does not
AI is good at pattern recognition across the rows and columns you describe: direction of week-1 retention across cohorts, whether the curve flattens at the same week or shifts, which cohort is genuinely an outlier vs. just statistical noise. It is also good at producing the disciplined 3-sentence format that survives an exec’s glance, where most analysts default to a 10-bullet wall.
What AI cannot do: see your raw data, so you must transcribe the table accurately, ideally with weeks in the column header. It also cannot diagnose root causes; correlation with a feature ship date or pricing change is a hypothesis, not a finding. And it cannot tell you whether a 3-point W1 lift is statistically significant. Small cohorts are noisy, and the model will confidently call a 24% → 27% swing “improvement” when it might be coin flips.
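To check whether a swing like 24% → 27% is distinguishable from noise before you ask for a readout, a two-proportion z-test is one common option. The sketch below is a minimal version under stated assumptions: the function name and the retained-user counts are hypothetical, and it uses the normal approximation, which is rough for very small cohorts.

```python
import math

def two_proportion_z(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for the difference between two retention rates."""
    p_a, p_b = retained_a / n_a, retained_b / n_b
    # Pooled rate under the null hypothesis that both cohorts retain equally
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal tail
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 24% vs. 27% W1 on 200-user cohorts
z_small, p_small = two_proportion_z(48, 200, 54, 200)
# The same rates on 3000-user cohorts
z_large, p_large = two_proportion_z(720, 3000, 810, 3000)

print(f"200 users each:  p = {p_small:.2f}")
print(f"3000 users each: p = {p_large:.3f}")
```

With these made-up counts, the same 3-point lift comes out at a p-value of about 0.49 on 200-user cohorts and below 0.01 on 3000-user cohorts, which is why cohort sizes belong in the prompt.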
A specific failure mode: AI defaults to reporting “improvement” whenever the most recent cohort beats the oldest cohort, ignoring the noisy middle. Tell it explicitly: “Report direction across the trailing 6 cohorts, not first vs. last, and call out cohort-to-cohort noise where it exists.”
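The gap between a first-vs-last comparison and a trailing-6 read can be sketched in a few lines. This is an illustration, not a standard method: the function name, the least-squares trend, the residual-based noise estimate, and the W1 values are all assumptions made for the example.

```python
from statistics import mean

def trailing_trend(w1_by_cohort, window=6):
    """Compare first-vs-last change with the fitted change over the trailing window."""
    recent = w1_by_cohort[-window:]
    xs = list(range(len(recent)))
    x_bar, y_bar = mean(xs), mean(recent)
    # Ordinary least-squares slope across the trailing cohorts
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, recent))
             / sum((x - x_bar) ** 2 for x in xs))
    # Noise: mean absolute distance of each cohort from the fitted line
    residuals = [y - (y_bar + slope * (x - x_bar)) for x, y in zip(xs, recent)]
    return {
        "first_vs_last": w1_by_cohort[-1] - w1_by_cohort[0],
        "trailing_change": slope * (len(recent) - 1),
        "noise": mean(abs(r) for r in residuals),
    }

# Hypothetical W1 percentages for 12 weekly cohorts, oldest first
w1 = [22, 31, 29, 33, 30, 32, 31, 30, 32, 29, 31, 30]
result = trailing_trend(w1)
print(result)
```

In this made-up series the newest cohort beats the oldest by 8 points, yet the fitted change across the trailing 6 cohorts is under 1 point and smaller than the noise. That is the pattern the explicit instruction is meant to catch.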
What to feed the AI
- The cohort table values (paste as a markdown grid, or describe row by row; the model handles either)
- The cohort axis: weekly, monthly, or by signup source (this changes what counts as a real cohort)
- Which metric the cells represent (D1, W1, M1, or “any-action retention”; the model handles each but needs to know which)
- The cohort sizes: a 47% retention number on a 30-user cohort is not the same signal as on a 3000-user cohort
- Dated events that might explain shifts: feature ships, pricing changes, ad campaigns, launches, infrastructure changes
- The question leadership actually asked, in their words (“is retention getting better” vs. “why is the AppSumo cohort still alive”; these call for completely different readouts)
- Any seasonality you already know about (holiday, end-of-quarter, school year)
- The acceptable confidence level: “directional is fine” vs. “I need to put this in the board deck”
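If the numbers live in a script or notebook rather than a screenshot, the markdown grid can be generated instead of typed by hand. The helper below is a minimal sketch: the function name and the input shape (label, cohort size, list of weekly percentages) are assumptions, and every value except the Aug 15 cohort's 47% W1 on 412 users is invented for illustration.

```python
def cohort_grid(cohorts, n_weeks):
    """Render (label, size, weekly_pcts) tuples as a markdown table with a Users column."""
    header = "| Cohort | Users | " + " | ".join(f"W{w}" for w in range(1, n_weeks + 1)) + " |"
    divider = "|" + " --- |" * (n_weeks + 2)
    rows = []
    for label, size, values in cohorts:
        # Younger cohorts have fewer observed weeks; pad the rest with blank cells
        cells = [f"{v}%" for v in values] + [""] * (n_weeks - len(values))
        rows.append(f"| {label} | {size} | " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])

# Hypothetical data apart from the Aug 15 W1 figure and cohort size
grid = cohort_grid(
    [("Aug 15", 412, [47, 40, 36]), ("Aug 22", 380, [31, 25])],
    n_weeks=3,
)
print(grid)
```

Keeping cohort size in its own column means the model sees the sample size next to every rate, rather than in a separate note it may ignore.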
Copy-ready prompt
Read this retention cohort table and write a 3-sentence readout for {leadership / PM team / board}.
Cohort axis: {weekly / monthly / by signup source}
Metric in cells: {D1 / W1 / M1 / any-action retention}
Cohort sizes: {paste or describe}
Table values (cohorts as rows, weeks as columns):
{paste markdown grid}
Dated events (with dates): {paste}
Known seasonality: {paste or "none"}
The exact question leadership asked: "{quote}"
Return exactly 3 sentences:
1) Direction of early-window retention (W1 or M1) across the trailing 6 cohorts — name the specific number range, not "improving." Flag if cohort-to-cohort noise is high relative to the trend.
2) Long-tail shape — does the curve flatten, and at which week? Compare the W1-to-W8 (or M1-to-M6) gap across cohorts: widening or narrowing?
3) The single most interesting outlier cohort — name the cohort, the number, and the most likely event explanation. Be explicit that this is hypothesis, not finding.
End with one line: "Next chart I would pull: {specific chart}" — the chart that would test the hypothesis from sentence 3.
Do not call a < 3-point swing on a < 200-user cohort "improvement"; flag it as within noise.
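If you run this readout regularly, the fill-in part of the prompt can be assembled in code so that a missing input fails loudly instead of silently producing a weaker readout. The sketch below is one possible shape: the function name, field names, and example values are hypothetical, and it builds only the header lines, so the "Return exactly 3 sentences" instructions above still need to be appended verbatim.

```python
HEADER = (
    "Read this retention cohort table and write a 3-sentence readout for {audience}.\n"
    "Cohort axis: {axis}\n"
    "Metric in cells: {metric}\n"
    "Cohort sizes: {sizes}\n"
    "Table values (cohorts as rows, weeks as columns):\n"
    "{table}\n"
    "Dated events (with dates): {events}\n"
    "Known seasonality: {seasonality}\n"
    'The exact question leadership asked: "{question}"'
)

REQUIRED = ("audience", "axis", "metric", "sizes", "table",
            "events", "seasonality", "question")

def build_prompt(**fields):
    """Fill the prompt header, refusing to proceed if any input is blank or absent."""
    missing = [name for name in REQUIRED if not str(fields.get(name, "")).strip()]
    if missing:
        raise ValueError(f"missing prompt inputs: {', '.join(missing)}")
    return HEADER.format(**{name: fields[name] for name in REQUIRED})

# Hypothetical inputs for illustration
prompt = build_prompt(
    audience="leadership",
    axis="weekly",
    metric="W1",
    sizes="see Users column",
    table="| Cohort | Users | W1 |\n| --- | --- | --- |\n| Aug 15 | 412 | 47% |",
    events="Aug 15: AppSumo deal",
    seasonality="none",
    question="so, is retention getting better?",
)
print(prompt)
```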
Shorter variant: single-line Slack answer
Cohort table: {paste}. Leadership asked: "{quote}". Write a one-line answer with the specific number and direction, no caveats. Then a second line with the one cohort I should actually look at this week.
Sample output
A useful 3-sentence readout: “Week-1 retention has lifted from 24-27% to 30-33% across the last 6 cohorts: directional improvement, with cohort-to-cohort noise of about 3 points. All cohorts flatten by week 4 and the W1-to-W8 gap has not narrowed; what we’ve fixed is the cold start, not the long-term hold. The cohort from Aug 15 shows 47% W1 on 412 users, which lines up with the AppSumo deal; these users are still over-represented in week-8 active users, so the working hypothesis (not a finding) is that deal-driven users are surviving longer than expected, not that overall retention shifted.”
A useful “next chart” line: “Next chart I would pull: per-cohort week-4 to week-8 retention only, to confirm the long-tail is genuinely flat across the recent improvement and not lagging.”
A useful Slack-version one-liner: “W1 retention is up ~6 points across the last 6 cohorts (24-27 → 30-33), but the long-tail looks the same — we fixed onboarding, not stickiness. Aug 15 cohort is the one to look at this week.”
How to refine
- Force specific numbers: “Replace any phrase like ‘recent cohorts have improved’ or ‘long-tail looks stable’ with the actual number range — e.g., ‘W1 moved from 24-27% to 30-33% across the last 6 cohorts.’ If the model can’t be specific, the data doesn’t support the claim.”
- Make the long-tail check explicit: “Compute the W1-to-W8 gap (or M1-to-M6) per cohort. Tell me whether the gap widens or narrows over the trailing 6 cohorts; that is the long-tail finding.”
- Flag noise honestly: “If the trend swing is smaller than typical cohort-to-cohort noise, say so. ‘Directional improvement within noise’ is a valid finding; ‘improvement’ alone overclaims.”
- Tie outlier to event, not feature credit: “Name the outlier cohort with both its number and its likely event correlation. State explicitly that this is hypothesis. Do not claim feature X caused the lift unless we shipped only one thing in that window.”
- Match the readout to the audience: “For board, 3 sentences and one chart. For PMs, add the per-cohort table and the next-chart line. For the data team, return the markdown table with my interpretation underneath, not in place of, the data.”
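The long-tail check is easy to compute yourself and compare against what the model reports. The sketch below rests on several assumptions: the function names, the 1-point tolerance for calling the gap "flat", the choice to compare the older half of the window with the newer half, and all of the cohort values are hypothetical.

```python
from statistics import mean

def w1_w8_gaps(cohorts):
    """Return [(label, W1 minus W8)] for cohorts old enough to have a week-8 value."""
    return [(label, weeks[1] - weeks[8])
            for label, weeks in cohorts
            if 1 in weeks and 8 in weeks]

def gap_direction(gaps, window=6, tolerance=1.0):
    """Compare the mean gap of the newer half of the window against the older half."""
    recent = [gap for _, gap in gaps[-window:]]
    if len(recent) < 2:
        raise ValueError("need at least two cohorts with W1 and W8 values")
    half = len(recent) // 2
    change = mean(recent[half:]) - mean(recent[:half])
    if abs(change) <= tolerance:
        return "flat", change
    return ("widening" if change > 0 else "narrowing"), change

# Hypothetical cohorts, oldest first: W1 rises while W8 stays put
cohorts = [
    ("C1", {1: 25, 8: 12}), ("C2", {1: 26, 8: 12}), ("C3", {1: 25, 8: 12}),
    ("C4", {1: 30, 8: 12}), ("C5", {1: 31, 8: 12}), ("C6", {1: 31, 8: 12}),
    ("C7", {1: 32}),  # too young to have a week-8 value, so it is skipped
]
gaps = w1_w8_gaps(cohorts)
direction, change = gap_direction(gaps)
print(direction, round(change, 1))
```

Note that the trailing 6 cohorts for this check are the last 6 that have reached week 8, which are older than the trailing 6 used for W1. In the invented data the gap widens because W1 improved and W8 did not, matching the "fixed onboarding, not stickiness" reading in the sample output.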
Common mistakes
- Reading week-1 movement and stopping: long-tail tells a different story and is what matters for LTV
- Comparing the oldest cohort to the newest: the noise in the middle is often the actual signal; report direction across the trailing 6 cohorts instead
- Ignoring outlier cohorts: they usually have the most signal about what is actually driving acquisition mix
- Ignoring cohort size when comparing rates: a 47% W1 on 30 users is not evidence of better retention than a 30% W1 on 3000 users; small-n noise dominates
- Cherry-picking the prettiest cohort for the deck: if leadership notices the surrounding cohorts later, you lose credibility on every readout that follows
- Claiming feature credit without isolation: if you shipped 3 things in the same week as a retention lift, attribute carefully; “associated with” beats “caused by”
- Calling a 2-point swing “improvement”: anything inside typical cohort-to-cohort noise is not a finding; it is a coin flip
- Pasting only the table without events or cohort sizes: the model will pattern-match the numbers but miss the actual story
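The small-n point from the list above can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, which is one reasonable choice for proportions on small samples and not something the rest of this guide prescribes; the function name and the retained-user counts (14 of 30, 900 of 3000) are assumptions chosen to match the 47% and 30% figures.

```python
import math

def wilson_interval(retained, n, z=1.96):
    """Approximate 95% Wilson score interval for a retention rate."""
    p = retained / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# 47% W1 on a 30-user cohort (14 retained)
lo_small, hi_small = wilson_interval(14, 30)
# 30% W1 on a 3000-user cohort (900 retained)
lo_large, hi_large = wilson_interval(900, 3000)

print(f"30 users:   {lo_small:.2f} to {hi_small:.2f}")
print(f"3000 users: {lo_large:.2f} to {hi_large:.2f}")
```

The 30-user interval spans roughly 30% to 64%, while the 3000-user interval spans roughly 28% to 32%. The two overlap, so the small cohort's 47% does not establish that it retains better.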
FAQ
- Should I share the full cohort table with leadership? For a technical audience (PMs, eng leads, data team), yes. For execs and the board, share the 3-sentence readout plus one visual (usually a small-multiples chart of the curve shapes, or one heat map with the trailing 6 cohorts only). The full grid trains them to skim.
- How many cohorts do I need before trends are real? As a rule of thumb, for weekly cohorts: 6+ before you trust direction, 12+ before you trust long-tail shape changes. For monthly: 3+ for direction, 6+ for long-tail. Below those thresholds, report as “directional, within noise.”
- Why does the W1-to-W8 gap matter more than W1? W1 reflects onboarding and first-week value. The W1-to-W8 gap reflects whether the product creates a habit. You can move W1 with a better welcome email; narrowing the W1-to-W8 gap requires actual product fit.
- The model says retention improved but my gut says no. What’s going on? Usually the model is comparing the first cohort with the last and missing the noisy middle. Add: “compare trailing 6 cohorts against the prior 6 cohorts and call out variance, not just direction.”
- What if I have multiple acquisition sources mixed in? Split the cohort by source before asking for a readout. A mixed-source cohort table can hide what is really happening; the underlying sources often move in opposite directions and cancel out.
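The cancellation effect in the last answer is simple arithmetic, because a blended cohort's retention is the user-weighted average of its sources. The sketch below shows it with invented numbers; the function name, source names, cohort sizes, and rates are all hypothetical.

```python
def blended_rate(segments):
    """User-weighted retention across (users, retention_pct) segments."""
    total_users = sum(users for users, _ in segments)
    return sum(users * rate for users, rate in segments) / total_users

# Hypothetical W1 by source for two cohorts: (users, W1 %)
earlier = {"organic": (500, 30), "paid": (500, 20)}
later = {"organic": (500, 34), "paid": (500, 16)}

blended_earlier = blended_rate(list(earlier.values()))
blended_later = blended_rate(list(later.values()))

print(blended_earlier, blended_later)
```

Here organic W1 rises 4 points and paid W1 falls 4 points, yet both blended cohorts read 25%. A table built on the blended rate would show no change at all, which is why the split has to happen before the readout.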