Write TikTok Hooks That Stop the Scroll With AI

Generate 6 TikTok hook combos — 4-7 word on-screen text + 10-word spoken line + first B-roll cut — built for the 3-second decision window and tagged by pattern family for A/B testing.

Published: May 17, 2026. Updated: Jun 09, 2026. Author: AI Productivity Guide Team

TL;DR

A TikTok hook is not one line — it’s three signals firing inside the first 3 seconds: on-screen text (4-7 words), the first spoken line (≤10 words), and the first B-roll cut. About 71% of viewers decide whether to keep watching inside those 3 seconds, and roughly 80% scroll with sound off, so the text and the cut have to carry the hook on their own. This page gives you a copy-ready prompt that produces 6 hook combos tagged by pattern family (counter-intuitive, stake, number, admission, open-loop, mid-action) so you can A/B which pattern stops the scroll in your niche — plus what to feed the model and how to refine the output. Works with ChatGPT (GPT-5.5), Claude (Sonnet 4.6), or Gemini 3.1 Pro.

The task

You’re holding your phone, ring light on, B-roll edited and ready, and you’re about to record the talking-head portion of a TikTok about cutting your AI subscription stack. You know the first 3 seconds of the final video decide everything: as of mid-2026, about 71% of viewers make the stay-or-scroll call inside that window. Videos that hold 70-85% of viewers through those 3 seconds earn roughly 2.2x more total views than videos that fall below 60%, and the rough floor for the algorithm to push a video at all sits near 65-70% 3-second retention. Your last 8 TikToks averaged 28% 3-second retention; the data screams the issue is the hook, not the body. You want 6 hook + first-cut combinations to record against the same B-roll, tagged by pattern family so you can A/B which pattern actually stops the scroll in your niche.

Where AI helps — and where it does not

AI knows the hook pattern families (counter-intuitive, stake, number, admission, open-loop, mid-action) and can produce variants that fit TikTok’s cognitive load in the opening seconds: a visual cue, 4-7 words of text, and a verbal opener landing together. It can also avoid the openers that have gone invisible on TikTok by mid-2026: “wait until you see this,” “POV:,” “tell me you [X] without telling me,” “this is your sign,” “have you ever wondered,” and creator-first greetings like “hey guys, welcome back,” which tend to test near the bottom of hook rankings because they are generic, slow, and about the creator instead of the viewer.

What AI cannot do: know which pattern is fatigued in your specific niche. The same counter-intuitive opener that works on finance TikTok dies on beauty TikTok. Pull the 3-second retention numbers from your last 20 videos, identify the weakest-performing hook style, and tell the model to rotate away from it. AI also cannot judge whether your hook over-promises against your actual video. If 3-second retention is high but 15-second retention crashes, the hook promised something the body did not deliver, and viewers left.

A specific failure mode: AI gravitates to verbal hooks alone and leaves the on-screen text and first B-roll cut as afterthoughts. On TikTok all three signals must land in the first 3 seconds — and because roughly 80% of social-video views happen with sound off, the on-screen text and B-roll have to carry the hook standalone. The B-roll is what stops the thumb mid-swipe; the text is what the muted majority actually reads.

3-second retention: what good looks like

The whole exercise is calibrated against one number — your 3-second retention. Use these mid-2026 benchmarks to set your floor and ceiling before you write a single hook.

3-second retention	What it means	Algorithm signal
Below 60%	Weak hook; most thumbs already gone	Suppressed; little to no push
65-70%	Minimum viable	The rough floor for any organic distribution
70-85%	Strong hook	~2.2x more total views than sub-60% videos
85%+	Elite hook	~2.8x more total views than sub-60% videos

Your “floor to beat” is your own recent average; your “ceiling worth aiming for” is the top of this band. Feed both to the model.
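
To make the table easier to apply to your own numbers, here is a minimal Python sketch that maps a 3-second retention percentage to the band names above. The function name is hypothetical, and two details are assumptions rather than anything the benchmarks state: a value of exactly 70% is counted as "strong," and the 60-65% range, which the table does not label, is reported as unlabeled.

```python
def retention_band(pct: float) -> str:
    """Map a 3-second retention percentage to the band names in the table."""
    if pct >= 85:
        return "elite"
    if pct >= 70:  # assumption: exactly 70% counts as strong
        return "strong"
    if pct >= 65:
        return "minimum viable"
    if pct < 60:
        return "weak"
    # The table does not label 60-65%, so this sketch does not guess.
    return "unlabeled (60-65%)"


# The 28% average from the scenario in this guide.
print(retention_band(28))
```

Run it on your own recent average first, then on the retention of your best video, to see how far apart your floor and ceiling sit.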

What to feed the AI

The video’s actual payoff in one sentence (what the viewer walks away with, concretely)
Your niche + the one hook style you’ve noticed is fatigued there (e.g., “POV: you walk into a Sephora” on beauty TikTok)
Format — on-camera (your face in the first frame) or voiceover-only (B-roll only)
Your 3-second and 15-second retention from your last 5-10 videos (the gap between them tells you whether your hook over-promises)
Your top 2 best-performing hooks ever (the model will pattern-match their structure)
The 3-second retention floor you’re trying to beat (your average) and the ceiling worth aiming for (your top 10%)
The B-roll you’ve already edited (so the model can match the first cut to footage you actually have)
Whether captions are auto-generated or you burn them in (burned-in earns more retention; auto runs lighter and faster)
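
The retention inputs in the list above can be prepared with a few lines of Python. This is a rough sketch, not a prescribed method: the function name and the sample numbers are made up, and with only 5-10 videos it uses your single best video as a stand-in for "your top 10%." It does not decide when a gap counts as a crash, because this guide gives no threshold for that.

```python
from statistics import mean


def retention_summary(videos):
    """Summarize (retention_3s, retention_15s) percentage pairs from recent videos."""
    three_s = [v[0] for v in videos]
    fifteen_s = [v[1] for v in videos]
    avg_3s = mean(three_s)
    avg_15s = mean(fifteen_s)
    return {
        "floor_3s": round(avg_3s, 1),       # the average you are trying to beat
        "best_3s": max(three_s),            # stand-in for "top 10%" on a small sample
        "avg_15s": round(avg_15s, 1),
        "gap_3s_to_15s": round(avg_3s - avg_15s, 1),  # a large gap hints at over-promising
    }


recent = [(30, 20), (26, 18), (28, 16)]  # made-up example numbers
print(retention_summary(recent))
```

Paste the resulting floor, best value, and 15-second average into the matching fields of the prompt.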

Copy-ready prompt

Write 6 TikTok hooks designed for the first-3-second decision window.
Video payoff (concrete takeaway in one sentence): [paste]
Niche + the one hook style fatigued there: [paste]
Format: [on-camera / voiceover only]
My 3s retention recent average: [%]. My 15s retention: [%].
Top 2 best-performing hooks I've ever posted: [paste]
B-roll I've already shot/edited: [brief description]

For each of 6 hooks, return:
1) On-screen text (4-7 words max — sound-off viewers will only see this).
2) First spoken line (10 words max — this lands by second 1).
3) First B-roll cut idea — a specific frame that pairs with the words.
4) Pattern tag — counter-intuitive / stake / number / admission / open-loop / mid-action.

Rules:
- All three signals (text + voice + cut) must land inside the first 3 seconds.
- Avoid the fatigued style I specified, and avoid: "wait until you see," "POV:," "tell me you [X] without telling me," "this is your sign," "have you ever wondered," "hey guys welcome back," any superlative ("incredible," "shocking," "you won't believe").
- For each hook, write a second 5-word "test variant" — same pattern, different sentence — so I can record two takes against the same B-roll and A/B.
- Keep the on-screen text in the top third of the frame. TikTok's bottom UI (caption, sound, like/comment/share) covers roughly the lower 320px of a 1920px-tall frame, and the right ~120px is the button column — so center the text high and away from the right edge.

End with a one-line note: which of the 6 hooks is most likely to beat my [%] 3-second retention floor and why.

Shorter variant — single hook rapid iteration

Below is a hook that hit [x]% 3s retention. Rewrite into 5 variants that fix what's likely making it weak. Each: 5-word on-screen text + 10-word VO + 1-frame B-roll cut. Pattern: [one specific pattern family].

Original hook: [paste]

Sample output

A useful 3-signal combo (counter-intuitive pattern):

On-screen text: “You are paying $20 too much.”
First spoken line: “Paying for ChatGPT Plus and one of these tools? Stop.”
First B-roll cut: hand canceling a subscription on a phone screen, close-up.
Pattern tag: counter-intuitive + stake.
Test variant: “$20/month you can stop paying.”

A useful stake-pattern hook:

On-screen text: “Most AI tools double-bill you.”
First spoken line: “Your AI stack is doubled up. Check in 90 seconds.”
First B-roll cut: side-by-side of two app icons with overlapping feature lists highlighted.
Pattern tag: stake.
Test variant: “You’re paying for it twice.”

A useful “predicted winner” note: “Hook 1 most likely beats your 28% 3-second retention floor and has the best shot at the 65-70% minimum-viable band. The on-screen text contains a number and a stake, both proven retainers in your niche; the B-roll matches a tactile action (canceling) that holds the thumb. Hook 4 (admission pattern) is the riskier bet — high upside if your audience hasn’t seen admission-pattern hooks recently, lower if they have.”

How to refine

Cap on-screen text at 5 words: “If any hook’s on-screen text is over 5 words, rewrite. Sound-off viewers have to read it in about a second; 5 words is the realistic ceiling. 4 is better.”
All three signals land at once: “Re-check each combo: on-screen text, spoken line, and first B-roll cut should all communicate the same direction inside the first 3 seconds. If the spoken line takes the full 3 seconds to land but the on-screen text is fast, the slower one is the bottleneck.”
Match cut to footage I have: “Re-read the B-roll cuts. If any cut requires footage I don’t have, replace with one from my actual edited footage. I won’t reshoot for a hook variant.”
Drop fatigued patterns: “Re-check against the fatigued styles I specified, plus the generic 2026 dead patterns. If any variant uses ‘wait until you see,’ ‘POV:’, or any superlative, rewrite.”
Predict the winner with reasoning: “End the output with a 1-line prediction: which of the 6 hooks most likely beats my retention floor, citing the pattern + the niche signal. Without a prediction, A/B testing is uncalibrated.”
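
Some of these checks are mechanical enough to run yourself before sending a refinement prompt. The sketch below is one possible way to do it in Python, under stated assumptions: the function names are hypothetical, words are counted by splitting on whitespace, and the banned list is copied from the prompt rules in this guide. It checks the prompt's 4-7 word limit for on-screen text and 10-word limit for the spoken line. It cannot judge tone or honesty.

```python
import re

# Fatigued openers and superlatives taken from the prompt rules in this guide.
BANNED_PATTERNS = [
    r"wait until you see",
    r"\bpov:",
    r"tell me you .* without telling me",
    r"this is your sign",
    r"have you ever wondered",
    r"hey guys,? welcome back",
    r"incredible",
    r"shocking",
    r"you won['’]t believe",
]


def word_count(text: str) -> int:
    """Count whitespace-separated words (a simple approximation)."""
    return len(text.split())


def check_hook(on_screen: str, spoken: str) -> list:
    """Return a list of problems; an empty list means the hook passes these checks."""
    problems = []
    n = word_count(on_screen)
    if not 4 <= n <= 7:
        problems.append(f"on-screen text has {n} words (want 4-7)")
    m = word_count(spoken)
    if m > 10:
        problems.append(f"spoken line has {m} words (max 10)")
    combined = f"{on_screen} {spoken}".lower()
    for pattern in BANNED_PATTERNS:
        if re.search(pattern, combined):
            problems.append(f"uses fatigued pattern: {pattern}")
    return problems


print(check_hook("POV: you overpay", "Hey guys, welcome back to my channel where we talk about AI tools"))
```

Anything the checker flags can go straight into the matching refinement prompt above. To enforce the stricter 5-word cap, change the upper bound in `check_hook`.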

Common mistakes

“Wait until you see this,” “POV:,” “tell me you [X] without telling me,” “have you ever wondered” — universally fatigued on TikTok by 2026; viewers pattern-match and pre-scroll
Hooks that take longer than a beat to read on-screen — your text must register inside the 3-second decision window; 4-7 words max, 4 is better
No B-roll plan paired with the hook — voice alone doesn’t stop scrolls on TikTok; the visual is what holds the thumb
Hook on-screen text covered by the bottom UI — TikTok’s caption, sound and engagement buttons sit in the lower ~320px of a 1920px frame (and the right ~120px is the button column); put hook text in the top third
Promising what the video doesn’t deliver — high 3s retention with crashing 15s retention is the signature; viewers leave and the algorithm punishes the next post
Same hook pattern on every video — the algorithm reads signal staleness; rotate pattern families across posts even if one pattern is currently working
Recording only one take of the hook — always batch-record 2 hook variants for the same body edit; A/B reveals what your audience actually rewards
Ignoring sound-off viewers — roughly 80% of social-video views happen muted; the on-screen text must carry the hook standalone
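
The placement rule in the list above (top third of the frame, clear of the lower ~320px and the right ~120px) reduces to simple arithmetic. Here is a small Python sketch of it. The 1080px frame width is an assumption, since this guide only gives the 1920px height, and both function names are hypothetical.

```python
def text_safe_zone(width=1080, height=1920, bottom_ui=320, right_ui=120):
    """Return the pixel box where hook text avoids the UI areas named in this guide.

    width=1080 is an assumed frame width; the guide only states the 1920px height.
    """
    return {
        "x_min": 0,
        "x_max": width - right_ui,  # stay left of the button column
        "y_min": 0,
        # Top third of the frame, which also clears the bottom UI.
        "y_max": min(height // 3, height - bottom_ui),
    }


def fits_safe_zone(x, y, w, h, zone):
    """True if a text box with top-left corner (x, y) and size w x h stays inside the zone."""
    return (
        x >= zone["x_min"]
        and y >= zone["y_min"]
        and x + w <= zone["x_max"]
        and y + h <= zone["y_max"]
    )


zone = text_safe_zone()
print(zone)
print(fits_safe_zone(80, 200, 800, 300, zone))
```

With the default values, the safe box runs from x = 0 to 960 and from y = 0 to 640, so the top-third rule is the tighter vertical limit, not the bottom UI.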

FAQ

Should I always front-load the punchline on TikTok? Yes. About 71% of viewers decide whether to keep watching inside the first 3 seconds, and TikTok rewards watch-time and completion. Viewers who don’t get value up front don’t return. Save curiosity-driven open loops for longer formats (YouTube, podcasts) where you’ve already earned attention.
What 3-second retention should I aim for? Below 60% gets suppressed; 65-70% is the rough minimum for any organic push; 70-85% is a strong hook (~2.2x the views of a sub-60% video) and 85%+ is elite (~2.8x). Beat your own recent average first, then chase the top of that band.
How many hooks should I A/B in a week? Two different pattern families per content theme. Fewer gives you noise (no signal in 2 weeks); more causes fatigue (your audience pattern-matches you across all hook styles). Run for 2 weeks, then rotate.
What if my 3-second retention is high but 15-second crashes? Your hook over-promised. Rewrite for honesty: the hook should set up something the body actually delivers. AI is good at this; tell it “the hook must be true to the video’s payoff; do not write a hook that sets up something the video doesn’t deliver.”
Which model should I use to write hooks? Any frontier model handles this; the prompt does the heavy lifting. ChatGPT (GPT-5.5) and Claude (Sonnet 4.6) both produce tight, on-brief variants; Gemini 3.1 Pro is fine too. Free tiers are enough for one batch a day; the constraint is your retention data, not the model.
Burned-in captions vs. auto-generated? Burned-in captions earn measurably more retention (custom font, timing, emphasis), but cost ~10 minutes per video to make. For posts you care about, burn in. For daily volume, auto-generated is enough. Either way, keep text in the top third, because TikTok’s bottom UI eats the lower ~320px of the frame.
The model keeps suggesting fatigued patterns. What changes? Add: “These patterns are 2026-dead and forbidden: ‘wait until you see,’ ‘POV:,’ ‘tell me you [X] without telling me,’ ‘this is your sign,’ ‘have you ever wondered,’ ‘hey guys welcome back,’ any superlative (‘incredible,’ ‘shocking’). If any variant uses these, rewrite. The goal is plain but unexpected, not clickbait register.”
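
For the A/B question above, reading the result after the 2-week run can be as simple as averaging 3-second retention per pattern tag. The Python sketch below shows one way to do that. The function name and the sample numbers are invented for illustration, and with only a handful of videos per pattern the comparison is a rough signal, not a statistical test.

```python
from collections import defaultdict
from statistics import mean


def compare_patterns(results):
    """results: list of (pattern_tag, retention_3s_pct) pairs, one per posted video."""
    by_tag = defaultdict(list)
    for tag, pct in results:
        by_tag[tag].append(pct)
    averages = {tag: round(mean(vals), 1) for tag, vals in by_tag.items()}
    winner = max(averages, key=averages.get)  # tag with the highest average
    return averages, winner


# Made-up numbers for two pattern families tested on one content theme.
two_weeks = [("stake", 40), ("stake", 44), ("number", 35), ("number", 37)]
print(compare_patterns(two_weeks))
```

Keep the winning family in rotation and swap the other for a new pattern in the next 2-week run.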

External: TikTok Creator Academy for the platform’s own retention and hook guidance.

Tags: #AI writing #Social media #Workflow #TikTok
