The task
You’re holding your phone, ring light on, B-roll edited and ready, and you’re about to record the talking-head portion of a TikTok about cutting your AI subscription stack. You know the first second of the final video is where viewers decide: TikTok’s auto-skip data shows about 35-40% of viewers swipe within that window, and another 25% by second 3. Your last 8 TikToks averaged 28% 3-second retention, which points to the hook, not the body, as the problem. You want 6 hook + first-cut combinations to record against the same B-roll, tagged by pattern family so you can A/B which pattern actually stops the scroll in your niche.
Where AI helps — and where it does not
AI knows hook pattern families — counter-intuitive, stake, number, admission, open-loop, mid-action — and can produce variants that fit TikTok’s specific 1-second cognitive load (visual cue + 4-7 words of text + verbal opener landing simultaneously). It can also avoid the patterns that have become invisible on TikTok in 2026: “wait until you see this,” “POV:”, “tell me you don’t [X] without telling me,” “this is your sign.”
What AI cannot do: know which pattern is fatigued in your specific niche. The same counter-intuitive opener that works on finance TikTok dies on beauty TikTok. Pull the 3-second retention numbers from your last 20 videos, identify the weakest-performing hook style, and tell the model to rotate away from it. AI also cannot judge whether your hook over-promises against your actual video: if 3-second retention is high but 15-second retention crashes, the hook promised something the body did not deliver, and viewers left.
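Finding the weakest hook style can be sketched in a few lines. This is a minimal illustration, assuming you have hand-tagged each past video with a pattern family and copied its 3-second retention out of your analytics; the `videos` list and its numbers are made-up placeholders, not real data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical export: (hook pattern tag, 3-second retention %) per video.
videos = [
    ("counter-intuitive", 31.0),
    ("number", 34.5),
    ("open-loop", 22.0),
    ("counter-intuitive", 29.0),
    ("open-loop", 24.0),
    ("number", 30.5),
]

def mean_retention_by_pattern(rows):
    """Average 3-second retention for each hook pattern family."""
    grouped = defaultdict(list)
    for pattern, retention in rows:
        grouped[pattern].append(retention)
    return {pattern: mean(values) for pattern, values in grouped.items()}

def weakest_pattern(rows):
    """Pattern family with the lowest average 3-second retention."""
    averages = mean_retention_by_pattern(rows)
    return min(averages, key=averages.get)

print(weakest_pattern(videos))
```

With only a handful of videos per pattern the averages are noisy, so treat the result as a hint about what to rotate away from, not a verdict.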
A specific failure mode: AI gravitates to verbal hooks alone and leaves the on-screen text and first B-roll cut as afterthoughts. On TikTok, all three signals must land in the first second. Sound-off viewers (about 50%) need the on-screen text to do the work, and the B-roll is what stops the thumb mid-swipe.
What to feed the AI
- The video’s actual payoff in one sentence (what the viewer walks away with, concretely)
- Your niche + the one hook style you’ve noticed is fatigued there (e.g., “POV: you walk into a Sephora” on beauty TikTok)
- Format — on-camera (your face in the first frame) or voiceover-only (B-roll only)
- Your 3-second and 15-second retention from your last 5-10 videos (the gap between them tells you whether your hook over-promises)
- Your top 2 best-performing hooks ever (the model will pattern-match the structure)
- The 3-second retention floor you’re trying to beat (your average) and the ceiling worth aiming for (your top 10%)
- The B-roll you’ve already edited (so the model can match the first cut to footage you actually have)
- Whether captions are auto-generated or you burn them in (burned-in earns more retention; auto runs lighter and faster)
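The retention floor and ceiling from the list above are simple arithmetic. Here is a rough sketch, assuming "floor" means your plain average and "ceiling" means the average of your best 10% of videos (at least one video); the function name and the sample numbers are illustrative only.

```python
from statistics import mean

def retention_targets(three_sec_retention):
    """Return the floor (average) and ceiling (mean of the top 10%) in percent."""
    ordered = sorted(three_sec_retention, reverse=True)
    # Ceiling division so that 8 videos still yield a top slice of 1 video.
    top_n = max(1, (len(ordered) + 9) // 10)
    return {"floor": mean(ordered), "ceiling": mean(ordered[:top_n])}

# Placeholder numbers for 8 recent videos, not real data.
recent = [28.0, 31.0, 24.0, 35.0, 26.0, 22.0, 30.0, 28.0]
print(retention_targets(recent))
```

With fewer than ten videos the ceiling is just your single best video, so it says more about one good day than about a repeatable target.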
Copy-ready prompt
Write 6 TikTok hooks designed for the 1-second decision window.
Video payoff (concrete takeaway in one sentence): {paste}
Niche + the one hook style fatigued there: {paste}
Format: {on-camera / voiceover only}
My 3s retention recent average: {%}. My 15s retention: {%}.
Top 2 best-performing hooks I've ever posted: {paste}
B-roll I've already shot/edited: {brief description}
For each of 6 hooks, return:
1) On-screen text (4-7 words max; sound-off viewers will only see this).
2) First spoken line (≤10 words — this lands by second 1).
3) First B-roll cut idea — a specific frame that pairs with the words.
4) Pattern tag — counter-intuitive / stake / number / admission / open-loop / mid-action.
Rules:
- All three signals (text + voice + cut) must land within 1 second.
- Avoid the fatigued style I specified, and avoid: "wait until you see," "POV:", "tell me you [X] without telling me," "this is your sign," any superlative ("incredible," "shocking," "you won't believe").
- For each hook, write a second 5-word "test variant" — same pattern, different sentence — so I can record two takes against the same B-roll and A/B.
- Confirm the on-screen text fits comfortably in the top third of the frame (TikTok's comment overlay covers the bottom).
End with a one-line note: which of the 6 hooks is most likely to beat my {%} 3-second retention floor and why.
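If you reuse the prompt often, filling it from a script keeps you from sending it with a blank field. The sketch below uses Python's `string.Template` on an abbreviated stand-in for the prompt above; the field names (`payoff`, `niche_and_fatigued_style`, and so on) are hypothetical, and you would paste the full prompt text into the template yourself.

```python
from string import Template

# Abbreviated stand-in for the full prompt; field names are hypothetical.
HOOK_PROMPT = Template(
    "Write 6 TikTok hooks designed for the 1-second decision window.\n"
    "Video payoff (concrete takeaway in one sentence): $payoff\n"
    "Niche + the one hook style fatigued there: $niche_and_fatigued_style\n"
    "Format: $video_format\n"
    "My 3s retention recent average: $retention_3s%. "
    "My 15s retention: $retention_15s%.\n"
)

def build_prompt(fields):
    """Fill the template, refusing to build if any supplied field is blank."""
    blank = [name for name, value in fields.items() if not str(value).strip()]
    if blank:
        raise ValueError("Fill in before sending: " + ", ".join(blank))
    # substitute() raises KeyError if a template field is missing entirely.
    return HOOK_PROMPT.substitute(fields)

example_fields = {
    "payoff": "Which AI subscriptions overlap and can be cancelled",
    "niche_and_fatigued_style": "AI productivity; 'POV:' openers",
    "video_format": "on-camera",
    "retention_3s": 28,
    "retention_15s": 14,
}
print(build_prompt(example_fields))
```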
Shorter variant — single hook rapid iteration
Below is a hook that hit {x}% 3s retention. Rewrite into 5 variants that fix what's likely making it weak. Each: 5-word on-screen text + 10-word VO + 1-frame B-roll cut. Pattern: {one specific pattern family}.
Original hook: {paste}
Sample output
A useful 3-signal combo (counter-intuitive pattern):
- On-screen text: “You are paying $20 too much.”
- First spoken line: “Paying for ChatGPT Plus and one of these? Stop.”
- First B-roll cut: hand canceling a subscription on a phone screen, close-up.
- Pattern tag: counter-intuitive + stake.
- Test variant: “$20/month you can stop paying.”
A useful stake-pattern hook:
- On-screen text: “Most AI tools double-bill you.”
- First spoken line: “Your AI stack’s probably doubled up. Check in 90 seconds.”
- First B-roll cut: side-by-side of two app icons with overlapping feature lists highlighted.
- Pattern tag: stake.
- Test variant: “You’re paying twice for this.”
A useful “predicted winner” note: “Hook 1 most likely beats your 28% retention floor. The on-screen text contains a number and a stake, both proven retainers in your niche; the B-roll matches a tactile action (canceling) that holds the thumb. Hook 4 (admission pattern) is the riskier bet — high upside if your audience hasn’t seen admission-pattern hooks recently, lower if they have.”
How to refine
- Cap on-screen text at 5 words: “If any hook’s on-screen text is over 5 words, rewrite. Sound-off viewers have to read it in under 1 second; 5 words is the realistic ceiling. 4 is better.”
- All three signals land at once: “Re-check each combo: on-screen text, spoken line, and first B-roll cut should all communicate the same direction within the first second. If the spoken line takes 3 seconds to land but the on-screen text is fast, the slower one is the bottleneck.”
- Match cut to footage I have: “Re-read the B-roll cuts. If any cut requires footage I don’t have, replace with one from my actual edited footage. I won’t reshoot for a hook variant.”
- Drop fatigued patterns: “Re-check against the fatigued styles I specified, plus the generic 2026 dead patterns. If any variant uses ‘wait until you see,’ ‘POV:’, or any superlative, rewrite.”
- Predict the winner with reasoning: “End the output with a 1-line prediction: which of the 6 hooks most likely beats my retention floor, citing the pattern + the niche signal. Without a prediction, A/B testing is uncalibrated.”
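The mechanical checks in the refinements above (word caps and banned phrases) can also be run locally before you record anything. This is a rough sketch: the 5-word and 10-word limits and the banned list come from this guide, the function name `check_hook` is made up, and the substring matching is deliberately crude.

```python
import re

BANNED_PHRASES = (
    "wait until you see",
    "pov:",
    "this is your sign",
    "incredible",
    "shocking",
    "you won't believe",
)
# Matches the "tell me you [X] without telling me" template.
TELL_ME = re.compile(r"tell me you .*without telling me")

def check_hook(on_screen, spoken, max_text_words=5, max_spoken_words=10):
    """Return a list of rule violations for one hook (empty list means it passes)."""
    problems = []
    if len(on_screen.split()) > max_text_words:
        problems.append(f"on-screen text over {max_text_words} words")
    if len(spoken.split()) > max_spoken_words:
        problems.append(f"spoken line over {max_spoken_words} words")
    # Lowercase and normalize curly apostrophes before matching.
    combined = f"{on_screen} {spoken}".lower().replace("\u2019", "'")
    for phrase in BANNED_PHRASES:
        if phrase in combined:
            problems.append(f"banned phrase: {phrase}")
    if TELL_ME.search(combined):
        problems.append("banned phrase: tell me you [X] without telling me")
    return problems

print(check_hook("Stop paying $20 extra", "Paying for two AI tools? Stop."))
print(check_hook("POV: you overpay for AI tools monthly", "Wait until you see this bill"))
```

A check like this only catches the countable rules. Whether the three signals point the same direction, or whether the cut matches footage you have, still needs your own read.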
Common mistakes
- “Wait until you see this,” “POV:”, “tell me you [X] without telling me”: universally fatigued on TikTok in 2026; viewers recognize them on sight and scroll before the hook lands
- Hooks that take 3 seconds to read on-screen: your text must fit in the 1-second decision window; 4-7 words max, 4 is better
- No B-roll plan paired with the hook: voice alone doesn’t stop scrolls on TikTok; the visual is what holds the thumb
- Hook on-screen text covered by the comment overlay: TikTok’s bottom UI covers the lower 20% of the frame; put hook text in the top third
- Promising what the video doesn’t deliver: high 3s retention with crashing 15s retention is the signature; viewers leave and the algorithm punishes the next post
- Same hook pattern on every video: the algorithm and your audience both read repetition as stale; rotate pattern families across posts even if one pattern is currently working
- Recording only one take of the hook: always batch-record 2 hook variants for the same body edit; the A/B comparison reveals what your audience actually rewards
- Ignoring sound-off viewers: about 50% of TikTok views happen with the sound off; the on-screen text must carry the hook on its own
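The over-promise signature (high 3s retention, crashing 15s retention) can be spotted with a small script. This sketch assumes "high" means at or above your own 3-second floor and "crashing" means fewer than half of the 3-second viewers are still there at 15 seconds; that 0.5 cutoff is an arbitrary assumption to tune, and the sample videos are placeholders.

```python
def over_promise_flags(videos, floor_3s, min_hold_ratio=0.5):
    """Flag videos whose hook held viewers at 3s but lost them by 15s.

    videos: iterable of (title, 3s retention %, 15s retention %).
    Returns (title, hold ratio) pairs, where hold ratio = 15s / 3s retention.
    """
    flagged = []
    for title, r3, r15 in videos:
        if r3 <= 0:
            continue  # no 3-second viewers, nothing to compare
        hold = r15 / r3
        if r3 >= floor_3s and hold < min_hold_ratio:
            flagged.append((title, round(hold, 2)))
    return flagged

# Placeholder numbers, not real data.
sample = [
    ("stack audit", 41.0, 12.0),
    ("tool review", 30.0, 21.0),
    ("price rant", 22.0, 8.0),
]
print(over_promise_flags(sample, floor_3s=28.0))
```

Here only the first video is flagged: it beat the floor at 3 seconds but kept under a third of those viewers to 15 seconds, which is the pattern worth rewriting the hook for.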
FAQ
- Should I always front-load the punchline on TikTok?: Yes. TikTok rewards watch-time and completion; viewers who don’t get value in the first 3 seconds don’t return. Save slow-burn open loops for longer formats (YouTube, podcasts) where you’ve already earned attention; an open-loop hook on TikTok still has to signal the payoff in the first second.
- How many hooks should I A/B in a week?: 2 different pattern families per content theme. Fewer gives you noise (no signal in 2 weeks). More causes fatigue (your audience pattern-matches you across all hook styles). Run for 2 weeks, then rotate.
- What if my 3-second retention is high but 15-second crashes?: Your hook over-promised. Rewrite for honesty — the hook should set up something the body actually delivers. AI is good at this; tell it “the hook must be true to the video’s payoff; do not write a hook that sets up something the video doesn’t deliver.”
- Burned-in captions vs. auto-generated?: Burned-in captions earn measurably more retention (custom font, timing, emphasis) but cost ~10 minutes per video to make. For posts you care about, burn them in. For daily volume, auto-generated is enough.
- The model keeps suggesting fatigued patterns — what changes?: Add: “These patterns are 2026-dead and forbidden: ‘wait until you see,’ ‘POV:’, ‘tell me you [X] without telling me,’ ‘this is your sign,’ any superlative (‘incredible,’ ‘shocking’). If any variant uses these, rewrite. The goal is plain but unexpected — not clickbait register.”
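When you compare two hook takes, it helps to check whether the retention gap is bigger than chance. The sketch below uses a standard two-proportion z-test, which is a technique swapped in here for illustration and not something TikTok provides. It assumes you can estimate how many viewers stayed past 3 seconds (views × 3-second retention); the function name and numbers are hypothetical. Two posts are not a randomized split, so read the result as a rough guide.

```python
import math

def compare_retention(kept_a, views_a, kept_b, views_b):
    """Two-proportion z-test on 3-second retention for two hook variants.

    kept_*: viewers still watching at 3 seconds; views_*: total views.
    Returns (z, two-sided p-value).
    """
    rate_a = kept_a / views_a
    rate_b = kept_b / views_b
    pooled = (kept_a + kept_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        return 0.0, 1.0  # every viewer stayed or every viewer left
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Placeholder counts: variant A kept 340 of 1000 viewers, variant B kept 280 of 1000.
z, p = compare_retention(340, 1000, 280, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the gap is unlikely to be noise at these view counts. With only a few hundred views per take, even a gap of several points can easily be chance.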