Write the Narrative Around Your KPI Movement With AI

Move from 'activation is up 4 points' to 'here is what likely caused it, here is what is still unknown, and here is what data would resolve the ambiguity' — without overclaiming.

The task

Monday morning standup. Activation jumped 4 points week-over-week (12% → 16%) and the CEO drops into the #growth Slack channel: “why?” You have three candidate drivers: variant B of the onboarding A/B test rolled out to 100% on Tuesday, marketing’s pricing-page rewrite went live Wednesday, and there is a known seasonal lift for your category in early March. You also have a fourth thing you do not want to surface yet: a competitor had an outage Thursday that may have pushed traffic to you. You need a narrative by 11am that gives the CEO a usable answer (the most likely cause, the alternatives still in play, and the data that would resolve the ambiguity) without claiming causation you cannot defend.

Where AI helps — and where it does not

AI is genuinely good at structuring a calibrated narrative — naming the most-likely cause with a confidence level, listing the alternative explanations not yet ruled out, and proposing the follow-up data that would tip the balance. It also disciplines the language away from “X drove Y” toward “X is consistent with Y, with these caveats.” Where AI fails: actually proving causation. It cannot run a regression for you, cannot pull segment data, and cannot tell you that the competitor’s outage matters unless you tell it the outage happened. Feed it every candidate driver and every piece of counter-evidence you already know about; the more you feed, the less it overclaims.

A common failure mode: the model picks one cause confidently and writes the narrative as if it is settled. That is the political error that lets your team take a victory lap for the A/B test when the lift was actually the pricing page. Force the prompt to require at least 2 alternative explanations and at least 1 confidence-lowering note.

What to feed the AI

  • The KPI before/after numbers with the exact time window — week-over-week, month-over-month, year-over-year are very different stories
  • All candidate drivers with their dates — every launch, campaign, feature, copy change, ad spend shift, external event, holiday, seasonality
  • Counter-evidence you already know exists — segments where the lift did not show up, the cohort that should have moved but did not, the metric that should have correlated but did not
  • The audience for the narrative — leadership, peer team, board; calibration shifts with audience
  • The decision the narrative supports — “should we accelerate the A/B test rollout” or “should we double ad spend” produces different framing
  • Your prior belief — what you would have bet caused the move before doing the analysis (so the model can call out your own confirmation bias)
  • The honest “what we don’t know” list — segments you have not pulled, time windows you have not compared, sources you have not checked
  • A “do not claim” list — things you suspect but cannot defend (the competitor outage, the bot traffic, the dashboarding bug)

Copy-ready prompt

Write a calibrated KPI movement narrative.

KPI + time window: {before, after, exact dates}
Candidate drivers with their dates: {paste all — launches, campaigns, features, ad spend, external events, seasonality}
Known counter-evidence: {paste any segment / cohort / correlation that does not fit the obvious story}
Audience for this narrative: {leadership / peer team / board}
Decision the narrative supports: {what we are trying to decide}
My prior belief: {what I would have bet caused the move}
What we do not yet know: {segments / windows / sources not yet checked}
Do-not-claim list: {things suspected but not defensible — competitor outage, bot traffic, dashboard bug}

Return:
1) One-line headline — what moved, by how much, in what window. Number-first.
2) Most-likely cause with a confidence level (low / medium / high) and a one-sentence explanation of why this confidence level, not higher and not lower.
3) At least 2 alternative explanations not yet ruled out — for each, the data that would rule it in or out.
4) The follow-up data I should pull next, ranked by which would most reduce ambiguity. Be specific (the exact segment, time window, metric to compare).
5) Recommended action with the time horizon: invest more now / wait one more week for confirmation / dig deeper before deciding.
6) The "what we are NOT claiming" list — items from my do-not-claim list, framed as honest uncertainty, not omission.

Tone: calibrated, plain, no marketing words ("significant," "phenomenal," "alarming"). Use "is consistent with" not "caused"; use "tracks with" not "drove." If confidence is low, the headline should say so. Include at least one confidence-lowering note even on a clean story.
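
If you reuse the prompt every week, a small helper can stop you from sending it with a blank input. The following is a minimal sketch, not part of the prompt itself: the field names (`kpi_window`, `do_not_claim`, and so on) and the `build_prompt` function are hypothetical, chosen to mirror the placeholders above.

```python
from string import Template

# Hypothetical field names, one per placeholder in the prompt above.
REQUIRED_FIELDS = (
    "kpi_window", "drivers", "counter_evidence", "audience",
    "decision", "prior_belief", "unknowns", "do_not_claim",
)

# Only the input half of the prompt; append the "Return" and "Tone"
# sections from the prompt above unchanged.
PROMPT_INPUTS = Template(
    "Write a calibrated KPI movement narrative.\n\n"
    "KPI + time window: $kpi_window\n"
    "Candidate drivers with their dates: $drivers\n"
    "Known counter-evidence: $counter_evidence\n"
    "Audience for this narrative: $audience\n"
    "Decision the narrative supports: $decision\n"
    "My prior belief: $prior_belief\n"
    "What we do not yet know: $unknowns\n"
    "Do-not-claim list: $do_not_claim\n"
)


def build_prompt(fields: dict[str, str]) -> str:
    """Fill the template, refusing to run if any input is blank."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name, "").strip()]
    if missing:
        raise ValueError("fill these before prompting: " + ", ".join(missing))
    return PROMPT_INPUTS.substitute(fields)
```

Raising on a blank field is a deliberate choice here: an empty do-not-claim list or counter-evidence field is exactly the gap that lets the model overclaim.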

Shorter variant — single-question audit

A teammate's narrative claim: {paste claim}.
Underlying data: {paste relevant numbers}.
Audit:
1) What confidence level does the data actually support?
2) Name 2 alternative explanations the claim does not address.
3) What follow-up data would either confirm or kill the claim?
4) Rewrite the claim with calibrated language.

Sample output

A calibrated headline: “Activation up 4pp WoW (12% → 16%), week of Mar 4. Medium confidence that most of the lift is consistent with the onboarding A/B variant B rollout.”

A useful confidence rationale: “Confidence is medium, not high, because three things moved in the same week: the A/B test rollout (Tue), the pricing page rewrite (Wed), and a seasonal early-March lift we have seen in 2024 and 2025 at +1.5pp. The A/B variant B’s lift in the test phase (with 50% of signups held out) was 3.2pp, which matches most of the observed 4pp move, but the pricing page may also account for part of it.”
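
The arithmetic behind that rationale is worth checking before you paste numbers into a prompt. Below is a minimal sketch using only the figures from the sample rationale; the `lift_summary` function and its key names are hypothetical, and the calculation is simple bookkeeping, not a causal attribution.

```python
def lift_summary(before_pct: float, after_pct: float,
                 estimates_pp: dict[str, float]) -> dict[str, float]:
    """Compare an observed move with per-driver estimates (percentage points)."""
    observed_pp = after_pct - before_pct
    if observed_pp == 0 or before_pct == 0:
        raise ValueError("need a nonzero baseline and a nonzero move")
    total_estimated = sum(estimates_pp.values())
    return {
        "observed_pp": observed_pp,
        "relative_change_pct": observed_pp / before_pct * 100,
        "estimates_sum_pp": total_estimated,
        # Negative means the estimates overlap: they cannot all apply in full.
        "unexplained_pp": observed_pp - total_estimated,
    }


# Numbers taken from the sample rationale above.
summary = lift_summary(12.0, 16.0, {"ab_variant_b": 3.2, "seasonal": 1.5})
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With the sample numbers, the test-phase lift of 3.2pp is 80% of the 4pp move, and the two estimates together (4.7pp) exceed the observed move. That overlap is one concrete reason the confidence level stays at medium.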

A useful alternative-not-ruled-out: “Alternative still in play: the pricing-page rewrite (Wed) may have raised the quality of incoming signups, not the activation step. We would see this in the trial-to-paid conversion 7 days out, not in the activation number. Pull Mar 11 trial-to-paid data on Tuesday to disambiguate.”

A useful “not claiming” line: “We are not claiming the competitor’s Thursday outage drove signup quality up; we noticed it but the timing (Thu late afternoon) does not align cleanly with the Tue rollout, and we have not pulled traffic-source data to confirm.”

A useful follow-up rank: “Highest value to pull next: (1) Activation by traffic-source segment: did the lift come from paid or organic? This separates the A/B test (which affects all signups equally) from the pricing page (which mostly affects organic). (2) Trial-to-paid on the Mar 4 cohort at the 7-day mark. (3) Activation by device: mobile vs desktop tells us whether the mobile fix in the v2 onboarding mattered.”
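
The first follow-up in that ranking, activation by traffic source, can be sketched in a few lines once you have signup and activation counts per segment. This is an illustrative sketch only: the counts in `ROWS` are invented to reproduce a 12% → 16% overall move, and the function names are hypothetical.

```python
from collections import defaultdict

# Invented counts for illustration only:
# (week, traffic_source, signups, activated).
ROWS = [
    ("before", "paid", 1000, 120),
    ("before", "organic", 1000, 120),
    ("after", "paid", 1000, 150),
    ("after", "organic", 1000, 170),
]


def activation_rates(rows):
    """Return {(week, segment): activation rate in percent}."""
    totals = defaultdict(lambda: [0, 0])
    for week, segment, signups, activated in rows:
        totals[(week, segment)][0] += signups
        totals[(week, segment)][1] += activated
    return {key: 100 * act / sign for key, (sign, act) in totals.items() if sign}


def segment_deltas(rates, before="before", after="after"):
    """Percentage-point change for each segment present in both weeks."""
    segments = {seg for _, seg in rates}
    return {
        seg: rates[(after, seg)] - rates[(before, seg)]
        for seg in sorted(segments)
        if (before, seg) in rates and (after, seg) in rates
    }


deltas = segment_deltas(activation_rates(ROWS))
for segment, delta in deltas.items():
    print(f"{segment}: {delta:+.1f}pp")
```

A roughly equal delta across segments would be consistent with the A/B reading; an uneven one, as in these invented counts, is the kind of counter-evidence to paste into the prompt rather than leave out.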

How to refine

  • If the narrative confidently picks one cause: “Name 2 reasons your top-pick driver might be wrong. Add them as ‘confidence-lowering’ notes in the narrative. If you cannot name 2, the confidence level is overstated.”
  • If it dodges with ‘inconclusive’: “Force-rank the candidates by probability, even if uncertain. ‘Inconclusive’ is not a narrative; ‘A is the most likely but we cannot rule out B and C’ is.”
  • If the language overclaims causation: “Replace every ‘X caused Y,’ ‘X drove Y,’ ‘X is responsible for Y’ with ‘is consistent with,’ ‘tracks with,’ ‘aligns with.’ Causation requires either a controlled experiment or a regression we have not run.”
  • If the follow-up data is vague: “Each follow-up data ask must name the exact segment, time window, and metric to compare. ‘Pull more data’ is not a follow-up.”
  • If the ‘not claiming’ list is missing: “Add the honest uncertainty section. Things we suspect but cannot defend belong in the narrative as ‘not claiming,’ not omitted. Omission reads as cherry-picking when discovered later.”
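
The language checks in the list above can also be run mechanically before you post. Here is a minimal sketch of such a filter; the phrase lists come from this article, while the `flag_overclaims` function is a hypothetical helper, not a feature of any tool.

```python
import re

# Phrase lists taken from the tone rules in this article.
CAUSAL = ("caused", "drove", "is responsible for")
MARKETING = ("significant", "phenomenal", "alarming")


def flag_overclaims(text: str) -> list[tuple[str, str]]:
    """Return (kind, phrase) pairs for every listed phrase found in text."""
    hits = []
    for kind, phrases in (("causal", CAUSAL), ("marketing", MARKETING)):
        for phrase in phrases:
            pattern = rf"\b{re.escape(phrase)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                hits.append((kind, phrase))
    return hits


draft = "Variant B drove a significant lift in activation."
for kind, phrase in flag_overclaims(draft):
    print(f"{kind}: '{phrase}'")
```

This is a blunt filter: it also flags negated uses such as "we are not claiming the outage drove signups," so treat each hit as a prompt to reread the sentence, not as an automatic rewrite.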

Common mistakes

  • Reporting correlation as causation: the most common political error in KPI narratives; the A/B test “drove” the lift only if the held-out cohort did not also move.
  • Single-cause stories: real KPI movements usually have 2-4 drivers; the narrative that picks one and ignores the others is wrong half the time and indefensible the other half.
  • Skipping the “what would resolve this” section: leaves the team with a story but no next data step; the narrative without a follow-up plan is gossip.
  • Numeric confidence levels without a model: “37% confident” reads precise but is fictional unless you actually ran a probability calculation; low/medium/high is more honest.
  • Burying the alternative explanations at the bottom: most readers stop after the first few lines; alternatives belong directly after the most-likely cause, not in paragraph 4.
  • Using marketing words: “significant,” “phenomenal,” “alarming” all signal you are managing the audience’s emotions rather than reporting; calibrated language is more credible.
  • Not looping in the owner of each candidate driver before publishing: surprising the marketing team with “your pricing page may have caused the lift” in a CEO Slack thread is the wrong order; share with owners first.
  • Forgetting the segment cut: almost every KPI movement has a segment story underneath; a narrative without segment exploration reads as the average story that hides the real one.
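
The held-out comparison mentioned in the first bullet is something you run yourself, not something the model can do for you. One common way to do it is a two-proportion z-test on activation counts for the variant and the held-out group over the same window. The sketch below assumes that approach; the counts are invented (chosen so the gap is 3.2pp, as in the sample rationale) and `two_proportion_z` is a hypothetical helper.

```python
from math import erfc, sqrt


def two_proportion_z(success_a: int, n_a: int,
                     success_b: int, n_b: int) -> tuple[float, float]:
    """Pooled two-proportion z-test. Returns (z, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("no variation in outcomes; the test is undefined")
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Invented counts: variant B 760/5000 (15.2%), held-out 600/5000 (12.0%).
z, p = two_proportion_z(760, 5000, 600, 5000)
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

A small p-value supports a difference between variant and holdout during the test phase only. It says nothing about the pricing page or seasonality, which changed after the test, so the narrative still needs its alternatives.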

FAQ

  • How specific should confidence levels be? Low / medium / high is the right grain for narratives without a formal model. Numeric confidence (37%) signals false precision; reserve numeric confidence for narratives backed by a regression or simulation.
  • Should I share the narrative widely? Share with the team owners of each candidate driver first; they can confirm or kill alternatives faster than the broader audience. Once their inputs are in, share the consolidated narrative.
  • What if the data genuinely is inconclusive? Write the inconclusive narrative honestly: “We do not yet know what caused the move. Here are the 3 candidates, here is the data we are pulling next; expect an update by Friday.” Inconclusive done well is more credible than confident done wrong.
  • How long should the narrative be? A Slack post: 4-6 lines. A weekly memo: 200-300 words. A board-deck section: one slide with 5 bullets. The shape changes; the structure (headline / cause + confidence / alternatives / follow-up data / recommendation) stays.
  • Should I revisit the narrative once the follow-up data lands? Yes, publicly and with the same audience. Updating a narrative with new data builds long-term credibility; ignoring follow-up data destroys it.
