AI Market Sizing Tutorial: TAM/SAM/SOM From Top-Down + Bottom-Up

Build a defensible TAM/SAM/SOM with AI doing the legwork — and a triangulation step that catches the made-up numbers.

Every pitch deck has a TAM slide. Most of them are wrong. Asking an AI for “the TAM of X” gives you a confident eight-figure number that traces back to a 2022 Statista blurb cited by a 2024 Medium post. A useful market-sizing workflow uses AI to pull both top-down and bottom-up estimates, then triangulates them, and the gap between the two is where the real argument lives. This tutorial walks through the loop investors actually want to see.

What this covers

A two-pass sizing workflow: top-down (start from a large public number and narrow with reasoned filters) and bottom-up (start from unit economics and multiply up). Both with AI. Then a triangulation step that compares the two, names the gap, and forces you to pick the model you can defend. Output is a one-slide TAM / SAM / SOM with the assumption table beneath it.

Who this is for

Founders sizing a market for a seed deck, PMs writing a “should we build this” memo, strategy teams scoping a new vertical, and investors who want to sanity-check a deck before a meeting. Not for: pure academic market reports — those need primary survey data, not AI synthesis.

When to reach for it

When you have a defined product or category, a target geography, and 90 minutes to produce a credible sizing slide. When the market is so emergent that there are no public numbers, the two-pass workflow does not apply: you are bottom-up only, and the AI does much less. For broader category context, run an AI industry research pass first.

Before you start

  • Define the unit you are sizing. “Annual revenue from X tool sold to Y persona in Z geography.” Vague unit, vague number.
  • Decide TAM versus SAM versus SOM upfront. TAM = total demand for the category in the unit you defined, SAM = the subset you can reach, SOM = the share you can realistically win in 3-5 years. Most decks conflate them.
  • Pick your engines: Perplexity for cited public numbers, ChatGPT or Claude Deep Research for synthesis, a spreadsheet for the actual math. AI does not do arithmetic — you do.
  • Have one bookmarked credible source for the broad category. You will use it as the anchor for top-down.

Step by step

  1. Top-down: pull the anchor number with Perplexity. “Global market size for [category] in 2025-2026 — list 3-5 estimates from named sources with publication dates.” Read the spread. If the high and low estimates differ by 5x or more, the category is poorly measured. Note that on the slide; do not paper over it.
  2. Apply reasoned filters to get from the anchor to your TAM. Geography (US only? Multiply by a published US share — never guess). Segment (SMB only? Filter by employee bucket). Use case (paid users only? Filter by conversion benchmarks). Each filter is a row in your assumption table with a source link.
  3. Bottom-up: start from unit economics. Average revenue per user (ARPU) multiplied by addressable user count multiplied by adoption rate. Use Perplexity to pull each input separately: “Typical ARPU for [SaaS category] SMB tier in 2026.” “Number of SMBs in [geography] with [employee range].” “Adoption rate of [category tools] in [segment].”
  4. Do the arithmetic in a sheet, not in the chat. AI math fails silently on multi-step multiplication. Put every input in its own cell with a source URL in the next column. Total at the bottom.
  5. Triangulate. Compare top-down TAM and bottom-up TAM. If they are within 2x, you have a defensible range. If the gap is wider than that, name it explicitly; at 10x or more, one of the models is wrong, usually because a filter is over-claimed or an ARPU is stale. For deeper synthesis on this kind of analytic crosscheck, see the ChatGPT research tutorial.
  6. Write the SAM and SOM as derivations of the triangulated TAM. SAM = TAM filtered by your actual go-to-market (channel reachable, language, regulation). SOM = SAM multiplied by a defensible 3-year share assumption (cite a comparable company’s share trajectory).
  7. One slide, assumption table beneath. TAM / SAM / SOM as three boxes, the assumption table as a small font table beneath with every input, its value, its source, and the year of the source.
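The arithmetic in steps 1 through 6 can be sketched in a few lines of Python as a cross-check on the spreadsheet. This is a minimal sketch, not a prescribed implementation: the `Assumption` class, the function names, every number, and the example.com URLs are illustrative placeholders rather than market data. Choosing the bottom-up figure as the TAM to carry forward is also just an assumption for the example.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Assumption:
    """One row of the assumption table: input, value, source, source year."""
    name: str
    value: float
    source_url: str
    source_year: int


def spread(estimates):
    """Step 1: ratio between the highest and lowest anchor estimates."""
    return max(estimates) / min(estimates)


def top_down(anchor, filters):
    """Step 2: the anchor narrowed by each filter share (values in 0..1)."""
    return anchor.value * prod(f.value for f in filters)


def bottom_up(arpu, users, adoption):
    """Step 3: ARPU x addressable user count x adoption rate."""
    return arpu.value * users.value * adoption.value


def gap_ratio(a, b):
    """Step 5: how many times larger the bigger estimate is."""
    return max(a, b) / min(a, b)


# Every number and URL below is a made-up placeholder, not market data.
SRC = "https://example.com/replace-with-real-source"
anchor_estimates = [9e9, 12e9, 15e9]  # the 3-5 named estimates from step 1
anchor = Assumption("Global category revenue (USD)", 12e9, SRC, 2025)
filters = [
    Assumption("US share of category", 0.40, SRC, 2025),
    Assumption("SMB share of US spend", 0.25, SRC, 2024),
    Assumption("Paid-user share", 0.50, SRC, 2025),
]
arpu = Assumption("Annual ARPU, SMB tier (USD)", 1200, SRC, 2025)
users = Assumption("SMBs in target employee range", 2_000_000, SRC, 2024)
adoption = Assumption("Adoption rate in segment", 0.15, SRC, 2025)

td = top_down(anchor, filters)
bu = bottom_up(arpu, users, adoption)
ratio = gap_ratio(td, bu)

if spread(anchor_estimates) >= 5:
    print("Anchor estimates differ by 5x or more: category is poorly measured")
print(f"Top-down TAM:  ${td / 1e6:,.0f}M")
print(f"Bottom-up TAM: ${bu / 1e6:,.0f}M")
print(f"Gap: {ratio:.2f}x ->", "defensible range" if ratio <= 2 else "name the gap")

# Step 6: SAM and SOM derived from the TAM you chose to defend.
tam = bu          # assumption for this example: the bottom-up model is defended
sam = tam * 0.60  # placeholder go-to-market reachability share
som = sam * 0.05  # placeholder share assumption; cite a comparable in practice
print(f"SAM: ${sam / 1e6:,.0f}M   SOM: ${som / 1e6:,.1f}M")
```

Keeping each input as its own `Assumption` mirrors the one-input-per-cell rule from step 4, so the same rows can be pasted under the slide as the assumption table.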

First-run exercise

Pick a market you have first-hand knowledge of — your current employer’s, or a hobby industry where you know real ARPU. Run the workflow end-to-end. Compare the AI-derived number to your gut. Most first runs come back 2-5x too high — usually because the top-down anchor includes adjacent categories. That is the calibration lesson; you will tighten the unit definition on every future run.

Quality check

  • Top-down and bottom-up numbers are within 2x. If not, name the gap explicitly on the slide.
  • Every input cell has a source URL with a 2024-2026 publication date. Older sources need a footnote justifying why they still apply.
  • SAM is meaningfully smaller than TAM. If SAM equals TAM, you have not actually filtered for go-to-market reachability.
  • SOM has a named comparable. “We will reach 5 percent” with no comparable is not a defense.
  • The assumption table fits on one slide. If it does not, the model has more inputs than you can defend.
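The checklist above is mechanical enough to run as a script before the slide goes out. The sketch below is one hypothetical way to do it: the `Row` fields, the `quality_issues` function, and the `max_rows=12` stand-in for "fits on one slide" are all assumptions, and the inputs are placeholders. Whether SAM is "meaningfully" smaller than TAM remains a judgment call; the code only catches the case where it is not smaller at all.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One assumption-table row (field names are illustrative)."""
    name: str
    value: float
    source_url: str
    source_year: int
    footnote: str = ""  # why an older source still applies, if it does


def quality_issues(td_tam, bu_tam, tam, sam, som_comparable, rows,
                   min_year=2024, max_rows=12):
    """Return a list of human-readable problems; empty means all checks pass."""
    issues = []
    if max(td_tam, bu_tam) / min(td_tam, bu_tam) > 2:
        issues.append("Top-down and bottom-up differ by more than 2x: name the gap")
    for row in rows:
        if not row.source_url:
            issues.append(f"{row.name}: missing source URL")
        if row.source_year < min_year and not row.footnote:
            issues.append(f"{row.name}: {row.source_year} source needs a footnote")
    if sam >= tam:
        issues.append("SAM is not smaller than TAM: filter for reachability")
    if not som_comparable:
        issues.append("SOM has no named comparable")
    if len(rows) > max_rows:  # max_rows is a guess at what fits on one slide
        issues.append(f"{len(rows)} rows may not fit on one slide")
    return issues


# Placeholder inputs, not market data.
rows = [
    Row("Annual ARPU, SMB tier (USD)", 1200, "https://example.com/arpu", 2025),
    Row("SMB count in geography", 2_000_000, "https://example.com/smb", 2022),
]
issues = quality_issues(td_tam=600e6, bu_tam=360e6, tam=360e6, sam=216e6,
                        som_comparable="", rows=rows)
for issue in issues:
    print("-", issue)
```

With these placeholder inputs the script flags the 2022 source that lacks a footnote and the missing SOM comparable; the other checks pass.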

How to reuse this workflow

  • Save the top-down filter list and bottom-up input list as a template. New product, new numbers, same skeleton.
  • Build a personal database of recurring inputs: ARPU benchmarks per category, SMB counts per geography, adoption rates per segment. Each one with its source. Update quarterly.
  • Re-run the sizing every six months on the same template. Sizing that does not update goes stale by the next funding round.
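If the input list lives outside the spreadsheet, the template and the quarterly refresh can be sketched as below. The dictionary keys, the `to_template` and `stale_inputs` helpers, and the 90-day cutoff (standing in for "update quarterly") are assumptions made for illustration, and the values are placeholders.

```python
import json
from datetime import date


def to_template(inputs):
    """Keep the skeleton (names and roles), blank out product-specific data."""
    return [{"name": i["name"], "role": i["role"], "value": None,
             "source_url": None, "checked": None} for i in inputs]


def stale_inputs(inputs, today, max_age_days=90):
    """Names of inputs last checked more than max_age_days ago."""
    return [i["name"] for i in inputs
            if (today - date.fromisoformat(i["checked"])).days > max_age_days]


# Placeholder rows, not market data.
inputs = [
    {"name": "US share of category", "role": "top-down filter", "value": 0.40,
     "source_url": "https://example.com/us-share", "checked": "2025-01-10"},
    {"name": "Annual ARPU, SMB tier (USD)", "role": "bottom-up input", "value": 1200,
     "source_url": "https://example.com/arpu", "checked": "2025-05-20"},
]

template_json = json.dumps(to_template(inputs), indent=2)
overdue = stale_inputs(inputs, today=date(2025, 6, 1))
print(template_json)
print("Refresh before reuse:", overdue)
```

Passing `today` explicitly keeps the staleness check reproducible; in regular use you would pass `date.today()`.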

The whole loop in one line: unit definition → top-down anchor → reasoned filters → bottom-up inputs → spreadsheet math → triangulation → SAM and SOM as derivations → one-slide output with assumption table. Plan 90 minutes for the first pass, 30 minutes for refreshes.

Common mistakes

  • Asking the AI directly for “the TAM of X.” You get a number, not an argument. Investors want the argument.
  • Doing the math in chat. Multi-step arithmetic with AI is unreliable; use a sheet.
  • Skipping the triangulation step. Top-down alone tends to overstate and bottom-up alone tends to understate; the truth lives in the gap.
  • Citing stale sources without a footnote. A 2022 number used unmodified in a 2026 deck is a red flag.
  • Conflating TAM and SAM. Investors notice immediately.
  • Hiding the assumption table because it makes the model look “less clean.” The table is the model’s credibility.

FAQ

  • What if there is no public number for my category? Bottom-up only. Be explicit about it on the slide. Investors prefer “this is bottom-up only because the category is too new” over “TAM is $10B with no traceable source.”
  • Which AI engine for sizing? Perplexity for input retrieval, Claude or ChatGPT for synthesis, a spreadsheet for math. Do not collapse the steps.
  • How fresh do sources need to be? 2024 or later for input numbers, 2025-2026 preferred for ARPU and adoption rates.
  • Should I include growth projections? Yes, as a separate row, with a named source for the growth rate. Do not bake projected growth into the TAM number itself.
  • What is a “defensible” share assumption for SOM? One backed by a comparable company’s actual share trajectory at the equivalent stage. Without a comparable, your SOM is a wish.
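The growth-projection answer can be made concrete with a small sketch that keeps today's TAM untouched and derives the projection as its own row. It assumes simple annual compounding, which is an illustrative choice rather than something the workflow prescribes; the `project` helper, the 12 percent rate, and the TAM figure are placeholders, and a real growth rate needs a named source.

```python
def project(tam_today, annual_growth, years):
    """TAM for year 0..years under simple annual compounding."""
    return [tam_today * (1 + annual_growth) ** y for y in range(years + 1)]


# Placeholder values, not market data.
tam_today = 360e6       # stays on the slide as the TAM number
growth_rate = 0.12      # must come from a named source in a real deck
projection = project(tam_today, growth_rate, years=3)

print(f"TAM (today):      ${tam_today / 1e6:,.0f}M")
for year, value in enumerate(projection[1:], start=1):
    print(f"Projected year {year}: ${value / 1e6:,.1f}M  (growth row, separate from TAM)")
```

Because `projection[0]` equals `tam_today`, the headline number never silently absorbs the growth assumption.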
