A trailer that does not build tension reads as a montage. Sora, Veo, and Kling will gladly hand you 30 cool clips with nothing connecting them; that is the AI default. This tutorial walks through building a 45-second trailer with a real arc: a setup that poses a question, an escalation that raises the stakes, and a button that makes the viewer want to know more. The structural decisions matter more than the prompt craft. Get the arc right and even mediocre shots feel cinematic; get the arc wrong and the best AI footage still feels random.
What this covers
A 3-act trailer structure tuned for AI clip lengths: 15s setup, 25s escalation, 5s button. Shot-budgeting per act, motion grammar per act, sound design per act, and a final color pass that locks the world together. Tools: Sora and Veo for narrative shots, Kling for high-motion or stylized cuts, any editor with audio.
Who this is for
Indie filmmakers proving a concept before pitching, founders making teaser videos for a launch, content creators building IP arcs across short formats, and ad teams making a 45-second hero piece for a campaign.
When to reach for it
Concept trailers and pitch reels, product launch teasers, short-film festival entries, video companions to an IP bible, recruiting teasers for a company brand, and seasonal campaign teasers.
Before you start
- Write the one-sentence question the trailer poses. If the viewer cannot say “I want to know what happens”, the structure is broken.
- Decide tone first. Thriller, hopeful, comedic, mysterious — different tones use different motion grammar and different cut pacing.
- Pick the visual world: time period, palette, location family. Trailers that change worlds across cuts feel like reels, not trailers.
- Decide the button. The last shot has to imply more: a door opening, a face turning, a line of dialogue cut off. Without a button, the trailer simply runs out; it does not leave the viewer wanting more.
Step by step
- Lay the 45 seconds as three acts on the timeline before generating anything: 0-15s setup, 15-40s escalation, 40-45s button. Label them.
- Storyboard shot counts per act: setup uses 3-4 longer shots (3-5s each, slow camera movement), escalation uses 8-12 faster shots (1.5-3s each, rising motion), and the button uses 1-2 shots (2-3s each, suspended motion).
- Write prompts with motion energy in mind, not just composition. Setup shots: “slow dolly, long lens, sustained motion”. Escalation shots: “handheld energy, faster moves, dynamic cuts”. Button: “stillness, locked-off frame, one moving element”.
- Generate per act, not per shot. Doing all setup shots in one session keeps the visual world coherent; jumping between acts mid-generation invites style drift.
- Sound carries half the tension. Use a single drone/pulse layer through setup, percussive hits during escalation, and a hard silence + breath into the button. Music alone is not enough — sound design is.
- Color-grade the trailer as one piece, not per clip. Setup slightly desaturated and cool, escalation richer and warmer, button stripped back. Color is part of the arc.
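The act lengths and shot ranges in the steps above can be sanity-checked with a little arithmetic before any generation starts. The following is a minimal sketch, not part of any tool's API: the `Act` class and `feasible_counts` method are hypothetical names, and the check assumes every shot stays strictly inside its stated length range and that shots fill the act with no gaps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    name: str
    start: float                    # seconds on the 45 s timeline
    end: float
    shot_count: tuple[int, int]     # (min, max) shots from the storyboard step
    shot_len: tuple[float, float]   # (min, max) seconds per shot

    @property
    def duration(self) -> float:
        return self.end - self.start

    def feasible_counts(self) -> list[int]:
        """Shot counts that can fill the act exactly with every shot inside shot_len."""
        lo, hi = self.shot_len
        return [
            n
            for n in range(self.shot_count[0], self.shot_count[1] + 1)
            if n * lo <= self.duration <= n * hi
        ]


# Values taken from the 15/25/5 structure and the per-act shot ranges above.
ACTS = [
    Act("setup", 0, 15, (3, 4), (3.0, 5.0)),
    Act("escalation", 15, 40, (8, 12), (1.5, 3.0)),
    Act("button", 40, 45, (1, 2), (2.0, 3.0)),
]

for act in ACTS:
    print(f"{act.name:<10} {act.duration:>4.0f}s  feasible shot counts: {act.feasible_counts()}")
```

Under a strict reading of those ranges, the sketch reports 3 or 4 setup shots, 9 to 12 escalation shots, and 2 button shots. Eight escalation shots at 3s each cover only 24s, and a single 2-3s button shot leaves part of the last five seconds unfilled. Title cards, held black, or a longer hold are not modeled here and could absorb that slack.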
First-run exercise
- Pick an existing IP you know well (a book, a podcast concept, a personal project). Write the one-sentence question and the button shot.
- Storyboard 8 shots minimum across the three acts. Sketch on paper; do not start generating yet.
- Generate the setup act first (3 shots). If the world does not feel like one place across all three, regenerate before moving on.
- Add sound design before color. Bare timing edit, then drone + hits + silence, then color. Each layer reveals what the previous layer was missing.
Quality check
- The viewer can name the question after one watch. If they cannot, the setup is too vague or too cryptic.
- Escalation actually escalates. Cut pace shortens, motion energy rises, sound thickens. If the middle plays at the same energy as the setup, the arc is flat.
- The button works as a deliberate stop, not just an end. It closes on stillness, a turn, or a held breath: anything that signals there is more.
- Visual world is one place, one time. Cross-cuts to other worlds must be obviously intentional (flashback grading, different palette).
- Sound design is layered, not just music. Drone, foley hits, breath, silence. Music alone is a music video.
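The cut-pace part of the escalation check can be measured from the edit itself. The sketch below is one possible way to do it, assuming you can export or type out a list of shot lengths per act; `CUTS`, `shot_lengths`, and `pace_escalates` are hypothetical names and the cut list is made-up example data. It only covers cut pace; motion energy and sound density still need eyes and ears.

```python
from statistics import mean

# Hypothetical cut list: (act, shot length in seconds), in timeline order.
CUTS = [
    ("setup", 5.0), ("setup", 5.0), ("setup", 5.0),
    ("escalation", 3.0), ("escalation", 3.0), ("escalation", 3.0), ("escalation", 3.0),
    ("escalation", 2.5), ("escalation", 2.5),
    ("escalation", 2.0), ("escalation", 2.0), ("escalation", 2.0), ("escalation", 2.0),
    ("button", 2.5), ("button", 2.5),
]


def shot_lengths(cuts, act):
    """Collect the shot lengths that belong to one act, keeping timeline order."""
    return [length for name, length in cuts if name == act]


def pace_escalates(cuts):
    """True if escalation cuts faster than setup and its second half cuts faster than its first."""
    setup = shot_lengths(cuts, "setup")
    esc = shot_lengths(cuts, "escalation")
    if not setup or len(esc) < 2:
        raise ValueError("need at least one setup shot and two escalation shots")
    half = len(esc) // 2
    faster_than_setup = mean(esc) < mean(setup)
    accelerates = mean(esc[half:]) < mean(esc[:half])
    return faster_than_setup and accelerates


print("pace escalates:", pace_escalates(CUTS))
```

A timeline where the escalation shots run as long as the setup shots fails this check, which matches the flat-arc symptom described in the list above.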
How to reuse this workflow
- Save the 3-act timeline template (15/25/5 with empty placeholders). Each new project drops its shots into the same skeleton.
- Build a sound-design library: 3-5 drones, 5-10 percussive hits, 2-3 breaths, one strong silence-cue. Reusable across trailers.
- Keep a per-tone color preset: thriller LUT, hopeful LUT, comedic LUT. Apply once across the timeline, never per clip.
- Re-test the model lineup every 4-6 weeks. Sora handles narrative drama better some weeks; Veo handles human shots better others; Kling pushes stylized motion. Trailer quality benefits most from picking per shot.
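If your editor has no convenient project template, the 15/25/5 skeleton can also live as a small data file next to the project. This is a sketch under stated assumptions: the field names (`prompt`, `tool`, `clip_path`) and the `blank_template` helper are illustrative and do not correspond to any editor's project format.

```python
import json

ACT_BOUNDS = [("setup", 0, 15), ("escalation", 15, 40), ("button", 40, 45)]


def blank_template(slots_per_act):
    """Build an empty 15/25/5 timeline with placeholder shot slots for each act."""
    return {
        "total_seconds": 45,
        "acts": [
            {
                "name": name,
                "start": start,
                "end": end,
                # Field names are illustrative, not any editor's project format.
                "shots": [
                    {"prompt": "", "tool": "", "clip_path": None}
                    for _ in range(slots_per_act[name])
                ],
            }
            for name, start, end in ACT_BOUNDS
        ],
    }


template = blank_template({"setup": 3, "escalation": 9, "button": 2})
print(json.dumps(template, indent=2))
```

Saving the printed JSON once and copying it per project keeps the act boundaries fixed while the prompts, tools, and clips change.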
Recommended workflow
One-sentence question + tone → 3-act timeline laid down → 12-18 shot storyboard sized per act (8 is enough for a first run) → generate by act in single sessions for world coherence → assemble timing first, no audio → layer sound design (drone, hits, silence) → grade as one piece → final mix → export 16:9 hero + 9:16 social cut.
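For the last step, it helps to know how much of a 16:9 frame survives a 9:16 crop before deciding whether to crop or regenerate. The sketch below computes a centered full-height crop window. The 1920x1080 frame size is an assumed example, `center_crop_9x16` is a hypothetical helper, and rounding to an even width is an assumption made for encoder friendliness.

```python
def center_crop_9x16(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (x, y, w, h) of the largest centered 9:16 window in a landscape frame."""
    crop_w = height * 9 // 16
    crop_w -= crop_w % 2  # round down to an even width (assumption: encoder-friendly)
    if crop_w > width:
        raise ValueError("frame is too narrow for a full-height 9:16 crop")
    x = (width - crop_w) // 2
    return x, 0, crop_w, height


x, y, w, h = center_crop_9x16(1920, 1080)
print(f"crop {w}x{h} at x={x}, keeps {w / 1920:.0%} of the frame width")
```

For a 1920x1080 frame this keeps a 606-pixel-wide strip, roughly a third of the width, so a crop only works comfortably when the subject of each shot sits near the center of the frame.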
Common mistakes
- Generating all the cool shots first, then trying to find a structure. The structure has to exist before the shots, or it never will.
- Same cut pace across the whole trailer. Escalation requires accelerating cuts; a constant pace reads as a montage.
- Music doing the work alone. Music is the floor; sound design and silence are the ceiling.
- No button. Trailers that end on a cool shot feel finished — trailers that end on a button feel ongoing.
- Mixing visual worlds without intent. Random shifts break trust; deliberate shifts (grading, palette, time) build it.
- Skipping the per-act generation discipline. Generating shot 1, then shot 7, then shot 3 invites style drift between cuts that have to feel connected.
FAQ
- Sora, Veo, or Kling, which one?: Pick per shot. Sora and Veo handle narrative shots better; Kling handles fast motion and stylized worlds. Generate each act in one tool when you can, but mixing tools is fine.
- How long should generating take?: Setup 30 min, escalation 60-90 min (it has the most shots), button 30 min, plus 60 min for sound and color. That adds up to 3 to 3.5 hours, so budget a half day for a serious 45-second trailer.
- Music or original score?: Suno gets you about 80% of the way to a score for free. For a published trailer, license a track or hire a composer; the difference reads.
- Aspect ratio?: 16:9 for film festivals and YouTube, 9:16 for social teasers. Generate once at 16:9 if you can crop comfortably; otherwise generate twice.
- Can I tell a story in 45 seconds?: Pose a story; do not tell one. The trailer’s job is to make the viewer want the full version.