Most pricing A/B tests are run with broken math. Revenue per user is censored at the end of the test window, refund and churn windows extend past it, the “winning” price ships, and three months later the team realizes it shipped the price with worse 90-day revenue. AI cannot fix bad data, but it can write a brief that names every place the math will get lossy before you run the test.
The task
Produce a one-page pricing-experiment brief: hypothesis, price options, primary metric (revenue per new install on a clean window), unit-economics check, LTV sensitivity, and the exact way you will avoid lossy A/B math.
When this is the right job for AI
- You already have CAC, current ARPU, and a rough 90-day LTV number.
- You are testing 2-3 price points, not 7. (7 is a fishing trip, not an experiment.)
- You can describe your refund and churn windows in days.
- You have a non-tiny audience — at least 5k new conversions per arm over the window.
- You will resist the urge to ship the “winner” before the LTV horizon closes.
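The 5k-conversions bar translates into a traffic requirement that is worth computing before anything else. A minimal sketch, assuming the 3.1% free-to-paid baseline and the 8-week horizon from the example prompt later in this article apply to every arm (the function name and the three-arm split are illustrative):

```python
from math import ceil

def installs_needed(target_conversions, conversion_rate):
    """Installs required in one arm to expect the target number of conversions."""
    return ceil(target_conversions / conversion_rate)

TARGET_CONVERSIONS = 5_000   # per arm, from the checklist above
BASELINE_RATE = 0.031        # free-to-paid baseline from the example prompt
ARMS = 3                     # control plus two test prices
HORIZON_DAYS = 8 * 7         # 8-week decision horizon

per_arm = installs_needed(TARGET_CONVERSIONS, BASELINE_RATE)
total = per_arm * ARMS
per_day = ceil(total / HORIZON_DAYS)

print(f"installs per arm: {per_arm:,}")
print(f"installs across {ARMS} arms: {total:,}")
print(f"installs per day over {HORIZON_DAYS} days: {per_day:,}")
```

If the daily figure is far above your real install volume, the honest options are fewer arms or a longer window.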
What to feed the AI
- Current price + 2-3 candidate prices (and the strategy behind each: penetration, premium, anchor)
- Current CAC by channel
- Current ARPU and 90-day LTV (or “we don’t know LTV yet” — AI handles both)
- Refund window (90 days for Apple; grace-period length is your decision)
- Conversion baseline: free-to-paid rate
- Cohort size you can realistically allocate
- The decision horizon (“we ship in 8 weeks regardless”)
Copy-ready prompt
You are a pricing analyst writing a one-page experiment brief.
Current state:
- App: a one-tap habit tracker (consumer iOS, subscription).
- Current price: $4.99/mo, $39.99/yr.
- ARPU: $11.40 trailing 90 days.
- Estimated 90-day LTV: $14.60 (we have 14 months of data).
- CAC: $4.20 from organic, $9.80 from paid social.
- Free-to-paid: 3.1%, measured as trial-to-paid on a 7-day trial.
Test design:
- Test prices: $4.99 (control), $6.99 (premium), $3.99 (penetration).
- Audience: new iOS installs only, en-US; ja-JP excluded (different pricing psychology).
- Decision horizon: 8 weeks.
- Refund/cancellation window: Apple's 90 days.
Output the brief in this exact order:
1. Hypothesis (per arm). Form: "At price X, free-to-paid will move from 3.1% to Y; ARPU on the conversion will move by Z; net revenue per new install will change by W."
2. Primary metric: revenue per new install (RPNI) measured on a fixed 28-day window from install. State why we use RPNI rather than conversion rate alone.
3. The four ways this experiment can produce lossy math, and how we will avoid each:
a) Refund window outlasts test window
b) Annual vs monthly mix differences between arms
c) Free-trial conversion timing differences
d) Selection bias from price-sensitive audiences
4. Unit-economics check. For each arm, the breakeven CAC. Mark the arm where paid social goes underwater.
5. LTV sensitivity. If LTV is actually 20% lower than estimate, which arm is still positive?
6. The result we would NOT ship even if it "wins" — and why.
Rules:
- No "consider." Each arm gets a clear go/no-go criterion.
- No invented numbers. Anything I did not provide, mark [need from finance].
- Call out underpowered cohorts in plain language.
- Max one page.
Sample output structure
Hypothesis (per arm). Premium $6.99: free-to-paid drops to 2.3% (-0.8 pp), ARPU on conversion rises 30%, RPNI moves about -4% vs control. Penetration $3.99: free-to-paid rises to 4.0% (+0.9 pp), ARPU on conversion falls 18%, RPNI moves about +6%. Control $4.99: baseline.
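Because RPNI is conversion multiplied by revenue per converter, the three numbers in each hypothesis can be cross-checked before the test runs. A minimal sketch of that check, assuming "ARPU on conversion" means revenue per converting user and nothing else moves (the function name is illustrative):

```python
def implied_rpni_change(base_conv, new_conv, arpu_change):
    """Relative RPNI change implied by a conversion move and a
    revenue-per-converter move, assuming RPNI = conversion * revenue per converter."""
    return (new_conv / base_conv) * (1 + arpu_change) - 1

# Inputs taken from the sample hypotheses above.
premium = implied_rpni_change(0.031, 0.023, 0.30)
penetration = implied_rpni_change(0.031, 0.040, -0.18)

print(f"premium implied RPNI change: {premium:+.1%}")
print(f"penetration implied RPNI change: {penetration:+.1%}")
```

With the sample inputs this gives roughly -3.5% for the premium arm and +5.8% for the penetration arm. If a hypothesis states an RPNI figure that differs from this product, the brief should name the extra assumption behind it, such as a shift in plan mix.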
Primary metric. RPNI on a 28-day install-aligned window. Conversion rate alone is misleading — penetration can win conversion and lose revenue. RPNI captures both at the install level. [need from finance: confirm 28-day RPNI is the org-wide standard].
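An install-aligned window means every user contributes exactly 28 days of revenue, and users whose window has not closed are left out rather than counted with partial revenue. The sketch below is one possible way to compute that; the data shapes, user IDs, dates, and amounts are all hypothetical.

```python
from datetime import date, timedelta

WINDOW_DAYS = 28

def rpni(installs, payments, as_of, window_days=WINDOW_DAYS):
    """Revenue per new install on a fixed install-aligned window.

    installs: {user_id: (arm, install_date)}
    payments: [(user_id, payment_date, amount)], refunds as negative amounts
    as_of:    date of the readout
    """
    eligible = {}
    totals = {}  # arm -> [revenue, install_count]
    for user, (arm, installed) in installs.items():
        # Skip users whose full window has not elapsed yet (censored).
        if installed + timedelta(days=window_days) <= as_of:
            eligible[user] = (arm, installed)
            totals.setdefault(arm, [0.0, 0])[1] += 1
    for user, paid_on, amount in payments:
        if user in eligible:
            arm, installed = eligible[user]
            # Count only payments inside the user's own window.
            if 0 <= (paid_on - installed).days < window_days:
                totals[arm][0] += amount
    return {arm: revenue / count for arm, (revenue, count) in totals.items()}

# Hypothetical data for illustration only.
installs = {
    "u1": ("control", date(2024, 1, 1)),
    "u2": ("control", date(2024, 1, 2)),
    "u3": ("premium", date(2024, 1, 1)),
    "u4": ("premium", date(2024, 2, 20)),  # window still open at readout
}
payments = [
    ("u1", date(2024, 1, 8), 4.99),
    ("u1", date(2024, 2, 7), 4.99),   # day 37, outside the window
    ("u3", date(2024, 1, 8), 6.99),
    ("u4", date(2024, 2, 27), 6.99),  # excluded with its censored user
]
result = rpni(installs, payments, as_of=date(2024, 3, 1))
print(result)
```

Non-paying installs stay in the denominator, which is what separates RPNI from ARPU on conversion.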
Lossy-math defenses. (a) Refunds: we hold the test analytics gate open for 90 days post-install; do not call the test on day-28 RPNI alone, and confirm at day 60 before shipping. (b) Annual/monthly mix: report RPNI weighted by mix; if the premium arm pushes annual share by more than 10 pp, treat as a confound. (c) Trial timing: align comparison at trial-end day 7, not install day 0. (d) Selection bias: randomize arm assignment within each acquisition channel so no arm is over-allocated organic-only traffic.
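The 10 pp rule in defense (b) is easy to automate. A minimal sketch, assuming per-arm subscriber counts by plan are available; the counts below are hypothetical, and flagging shifts in either direction rather than only upward is an added assumption:

```python
def annual_share(plan_counts):
    """Share of paying users on the annual plan."""
    total = plan_counts["annual"] + plan_counts["monthly"]
    return plan_counts["annual"] / total if total else 0.0

def mix_confounds(arms, control="control", threshold_pp=10.0):
    """Arms whose annual share differs from control by more than threshold_pp.

    Returns {arm: difference in percentage points}. Shifts in either
    direction are flagged, which is an assumption of this sketch.
    """
    base = annual_share(arms[control])
    flagged = {}
    for name, counts in arms.items():
        if name == control:
            continue
        diff_pp = (annual_share(counts) - base) * 100
        if abs(diff_pp) > threshold_pp:
            flagged[name] = round(diff_pp, 1)
    return flagged

# Hypothetical subscriber counts per arm.
arms = {
    "control": {"annual": 200, "monthly": 800},
    "premium": {"annual": 330, "monthly": 670},
    "penetration": {"annual": 180, "monthly": 820},
}
flagged = mix_confounds(arms)
print(flagged)
```

A flagged arm is not disqualified; its RPNI lift just needs to be read alongside the plan split before anyone calls it a win.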
Unit economics. Breakeven CAC: $3.99 arm = $4.10 (paid social $9.80 is underwater — do not run paid into this arm). $4.99 = $5.20 (paid social marginal). $6.99 = $6.95 (paid social healthy).
LTV sensitivity. If true LTV is $11.70 (20% lower), only the $6.99 arm stays positive on paid social. The $3.99 arm is negative even on organic if LTV slips 30%.
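The sensitivity check itself is simple arithmetic: shock the LTV, subtract channel CAC, and see which sign survives. The sketch below uses only the control-arm LTV and the channel CACs from the prompt, since per-arm LTVs were not provided. The source does not say whether the $14.60 LTV is gross or net of store commission, so `commission_rate` is a parameter, and the 0.30 used here is purely an illustrative assumption.

```python
def channel_margin(ltv, cac_by_channel, ltv_shock=0.0, commission_rate=0.0):
    """Per-user margin by channel after an LTV haircut.

    ltv_shock:       fractional reduction in LTV (0.20 means 20% lower)
    commission_rate: assumed store commission taken from LTV (hypothetical)
    """
    net_ltv = ltv * (1 - ltv_shock) * (1 - commission_rate)
    return {channel: round(net_ltv - cac, 2) for channel, cac in cac_by_channel.items()}

CAC = {"organic": 4.20, "paid_social": 9.80}  # from the prompt
CONTROL_LTV = 14.60                           # 90-day estimate from the prompt

scenarios = {
    shock: channel_margin(CONTROL_LTV, CAC, ltv_shock=shock, commission_rate=0.30)
    for shock in (0.0, 0.20, 0.30)
}
for shock, margins in scenarios.items():
    underwater = [channel for channel, margin in margins.items() if margin < 0]
    print(f"LTV -{shock:.0%}: {margins} underwater: {underwater}")
```

Under these assumptions the control arm is barely positive on paid social at the estimated LTV and turns negative at -20%. Rerun with your own commission rate and per-arm LTVs from finance before marking any arm go or no-go.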
The result we would not ship. If $3.99 wins on free-to-paid but loses on RPNI by more than 8%, we do not ship it — even though “more paid users” looks great in a roadmap deck.
How to refine
- Hypothesis stops at conversion → require “each arm names RPNI direction, not just conversion.”
- LTV sensitivity skipped → demand “model -20% LTV, name which arm goes negative.”
- AI hand-waves refund window → require “explicit 90-day analytics gate.”
- “Winner” framing without horizon → enforce “no ship before day 60 confirmation.”
- Underpowered cohorts hidden → demand “name any arm with less than 5k conversions in the window.”
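To make "underpowered" concrete, a standard two-proportion sample-size approximation gives the installs per arm needed to detect a given conversion move. This is a sketch under stated assumptions: a two-sided test at alpha 0.05, 80% power, and equal arm sizes. It covers conversion only; revenue metrics such as RPNI are noisier, so treat the result as a floor rather than a target.

```python
from math import ceil, sqrt
from statistics import NormalDist

def installs_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate installs per arm to detect p_control vs p_variant
    with a two-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_variant) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_control - p_variant) ** 2)

# Conversion moves taken from the sample hypotheses.
n_premium = installs_per_arm(0.031, 0.023)
n_penetration = installs_per_arm(0.031, 0.040)
print(f"premium vs control: {n_premium:,} installs per arm")
print(f"penetration vs control: {n_penetration:,} installs per arm")
```

Smaller expected moves push the requirement up quickly, because the sample size scales with the inverse square of the difference.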
Common mistakes
- Calling the test at day 14 because the numbers look great. Day 14 is before the refund storm.
- Ignoring mix shifts. The premium arm often pushes annual share, which looks like RPNI lift but is just a timing shift.
- Running paid traffic against a penetration arm that is unit-negative on paid CAC.
- Comparing arms with different free-trial designs; only change price, not trial length.
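The timing shift behind the mix mistake shows up in a few lines of arithmetic. A minimal sketch using the prices from the prompt, assuming the first charge lands at the end of the 7-day trial, a 30-day billing month, and no churn or refunds (all simplifications for illustration):

```python
def charges_in_window(trial_days, period_days, window_days):
    """Number of billing events at day offsets trial_days, trial_days + period_days, ...
    that fall before window_days (counted from install)."""
    if window_days <= trial_days:
        return 0
    return (window_days - 1 - trial_days) // period_days + 1

TRIAL_DAYS = 7
MONTHLY_PRICE, ANNUAL_PRICE = 4.99, 39.99  # current prices from the prompt

def windowed_revenue(window_days):
    """Cash collected per subscriber inside the window, by plan."""
    monthly = MONTHLY_PRICE * charges_in_window(TRIAL_DAYS, 30, window_days)
    annual = ANNUAL_PRICE * charges_in_window(TRIAL_DAYS, 365, window_days)
    return round(monthly, 2), round(annual, 2)

for window in (28, 90):
    monthly, annual = windowed_revenue(window)
    print(f"{window}-day window: monthly ${monthly}, annual ${annual}, "
          f"ratio {annual / monthly:.1f}x")
```

In the 28-day window an annual subscriber looks about eight times as valuable as a monthly one; by day 90 the gap is under three times even with zero churn. An arm that merely moves users to annual gets credit early for revenue that the monthly users would have paid later.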
FAQ
- Should I test annual price separately from monthly? Yes. Annual and monthly behave like different products; one A/B at a time.
- Can I just look at conversion rate? No. Conversion can rise while RPNI falls. RPNI is the floor metric.
- What if my data is too thin for RPNI? Use a leading proxy (trial-to-paid + ARPU on conversion) but commit to a 90-day RPNI readout before locking the price.
- Do I need a holdout? For pricing, yes — at least 10% on the current price for 6 months, so you can detect drift.
Related
- AI A/B Test Summary
- AI Pricing Hypothesis
- AI Pricing Page Copy
- AI App Experiment Design
- AI Retention Cohort Readout