What this covers
Every image-to-video tool (Runway, Kling, Pika, Luma) will happily turn your hero shot into a 5-second clip - and just as happily morph the product, replace the face, or invent a third arm by second three. This guide is a battle-tested workflow for getting motion out of a still while keeping the reference recognizable: how to prep the input, how to phrase motion, how long to render, and how to stitch.
Who this is for
Anyone with a single image who needs motion: e-commerce hero shots that need to move, illustrator portraits that need a head turn, product photographers building 15-second ad cuts from one studio frame. No editing-suite experience required, but you should be able to crop and color-correct.
When to reach for it
When the image is the brief and motion is the deliverable. Not for: scenes that need to cut between multiple subjects (use text-to-video), abstract aesthetic clips (text-to-video again), or any shot where the product orientation must be pixel-accurate (use a 3D render or a real camera).
Before you start
- Upscale the reference to at least 1536px on the long edge. Models hallucinate detail on small inputs, and the hallucinations are where drift starts.
- Clean the background. A messy background pulls focal attention and pushes the model to “interpret” rather than “preserve.”
- Decide your motion class up front: camera move, subject move, environment move (wind, water), or VFX (glow, particles). Don’t mix on the first attempt.
- Lock aspect ratio in the reference itself - don’t ask the tool to crop or extend, you’ll lose subject framing.
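The 1536px floor is easy to check before you upload. Here is a minimal Python sketch, assuming you already know the pixel dimensions of the reference. The function name `upscale_target` and its default are illustrative, not part of any tool's API. It only computes the size to upscale to, with aspect ratio preserved; the upscaling itself is left to whatever tool you use.

```python
def upscale_target(width, height, min_long_edge=1536):
    """Return the (width, height) to upscale to so the long edge reaches min_long_edge.

    Aspect ratio is preserved. Images that are already large enough are
    returned unchanged, since this helper never downsizes.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    long_edge = max(width, height)
    if long_edge >= min_long_edge:
        return width, height
    scale = min_long_edge / long_edge
    return round(width * scale), round(height * scale)


print(upscale_target(1024, 768))   # long edge is below the floor, so it is scaled up
print(upscale_target(2048, 1365))  # already large enough, returned unchanged
```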
Step by step
- High-res reference. 1536-2048px long edge, sharp, no compression artifacts. JPEG quality 90+ or PNG.
- Conservative motion strength. Most tools have a 0-10 dial. Start at 3-4 for products and faces, 5-6 for environments, 7+ only for stylized motion graphics.
- One-sentence motion description. “Subtle camera push-in, product remains centered, slight steam rising from the cup.” Specify both what moves and what stays.
- Short clips (2-4s). Drift compounds with duration, so build a 12-second cut from four 3-second clips rather than one 12-second render.
- Stitch and color-grade for unity. Bring the clips into DaVinci Resolve / CapCut. Grade the strongest clip first, then match the others to it.
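The strength ranges and the short-clip rule from the steps above can be written down as a small planning helper. This is a sketch only: the 0-10 scale and the ranges come from the text, but the category keys, the `STRENGTH_RANGES` table, and both function names are hypothetical and do not correspond to any tool's settings.

```python
# Starting ranges on a 0-10 motion dial, taken from the steps above.
# The category keys are illustrative labels, not real tool settings.
STRENGTH_RANGES = {
    "product": (3, 4),
    "face": (3, 4),
    "environment": (5, 6),
    "motion_graphics": (7, 10),
}


def starting_strength(subject):
    """Return the low end of the suggested range for a subject category."""
    low, _high = STRENGTH_RANGES[subject]
    return low


def plan_segments(total_seconds, clip_seconds=3):
    """Split a target runtime into clips of at most clip_seconds each."""
    if total_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    full, remainder = divmod(total_seconds, clip_seconds)
    segments = [clip_seconds] * int(full)
    if remainder:
        # Leftover runtime becomes one shorter final clip.
        segments.append(remainder)
    return segments


print(starting_strength("face"))
print(plan_segments(12))
```

A leftover segment shorter than 2 seconds (for example the last clip of a 10-second plan) may be better absorbed by trimming the cut than by rendering it separately.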
Prompt template that ships
[Reference attached] Subtle [camera/subject/env] motion: [describe motion in 6-10 words].
Keep subject identity, scale, and framing identical to reference.
No new objects, no morphing, no parallax background.
Duration: 3s. Style: photographic, neutral grade.
The phrase “no new objects, no morphing” is doing real work - it steers the model away from creative reinterpretation. If there is a person in frame, also add “no third hand, no extra finger.”
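If you render many takes, filling the template by hand gets error-prone. The sketch below builds the prompt text from the template above. `build_prompt` and its parameters are hypothetical names, and the 6-10 word check simply enforces the template's own guidance. It produces a string only and makes no call to any tool.

```python
MOTION_CLASSES = ("camera", "subject", "env")


def build_prompt(motion_class, motion, duration_s=3, has_person=False):
    """Fill the prompt template. `motion` should be a 6-10 word description."""
    if motion_class not in MOTION_CLASSES:
        raise ValueError(f"motion_class must be one of {MOTION_CLASSES}")
    motion = motion.strip().rstrip(".")
    word_count = len(motion.split())
    if not 6 <= word_count <= 10:
        raise ValueError(f"motion description has {word_count} words; aim for 6-10")
    negatives = "No new objects, no morphing, no parallax background."
    if has_person:
        # Extra negatives for people in frame, as suggested in the text.
        negatives += " No third hand, no extra finger."
    return "\n".join([
        f"[Reference attached] Subtle {motion_class} motion: {motion}.",
        "Keep subject identity, scale, and framing identical to reference.",
        negatives,
        f"Duration: {duration_s}s. Style: photographic, neutral grade.",
    ])


print(build_prompt("camera", "slow push-in with steam rising from the cup"))
```

Rejecting short descriptions up front catches the “make it move” style of prompt before it costs a render.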
When to give up and reshoot
Some images simply will not animate cleanly. The fast indicators:
- Subject occluded by their own arm or hair in the reference - the model will rebuild what it can’t see and get it wrong.
- Multiple people in frame - identity drift on the secondary face is nearly guaranteed.
- Text on a label or signage - it will warp, almost always.
- Reflections (mirrors, glass, water) - they re-render and decouple from the subject.
If you see two of these on a single reference, render a 1-second test before committing budget.
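The two-indicator rule can be expressed as a quick triage check. This is a sketch under stated assumptions: the flag names are made up for illustration, you set the flags by eye (nothing here inspects the image), and “two of these” is read as “two or more.”

```python
# The four indicators from the list above; the keys are illustrative names.
RISK_FLAGS = {
    "self_occlusion": "subject occluded by their own arm or hair",
    "multiple_people": "more than one person in frame",
    "text": "text on a label or signage",
    "reflections": "mirrors, glass, or water",
}


def triage(flags):
    """Return a recommendation from the risk flags spotted in a reference."""
    seen = set(flags)
    unknown = seen - set(RISK_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    if len(seen) >= 2:
        return "render a 1-second test before committing budget"
    if seen:
        return "proceed, but watch the flagged area in the first take"
    return "proceed"


print(triage(["text", "reflections"]))
```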
Recommended workflow
reference (upscale + clean) -> choose motion class -> conservative strength -> 3s clip -> render 4 takes -> pick best -> repeat for each clip -> stitch -> color-match. Budget about 15-20 minutes per usable 10-second output. If you’re over 40 minutes, the reference is the problem, not the prompt.
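For planning, the budget figures above can be turned into rough arithmetic. The sketch assumes two things the text does not state outright: that time scales linearly with output length, and that every 3-second clip in the cut gets its own 4 takes. All function names are hypothetical.

```python
def budget_minutes(output_seconds, per_10s=(15, 20)):
    """Scale the 15-20 minutes per 10 seconds guideline to a target runtime."""
    low, high = per_10s
    factor = output_seconds / 10
    return low * factor, high * factor


def render_count(output_seconds, clip_seconds=3, takes=4):
    """Total renders if every clip in the cut gets `takes` attempts."""
    clips = -(-output_seconds // clip_seconds)  # ceiling division
    return clips * takes


def reference_is_suspect(minutes_spent, output_seconds=10, limit_per_10s=40):
    """True once time spent passes the 40-minute mark, scaled to runtime."""
    return minutes_spent > limit_per_10s * output_seconds / 10


print(budget_minutes(15))
print(render_count(10))
print(reference_is_suspect(45))
```

The render count is the number worth checking against your credit balance before you start, since it grows with both runtime and takes.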
FAQ
- Why does my product change shape? - Motion strength too high, or the reference is too small. Drop strength by 2, upscale, retry.
- Can I do 10s in one render? - Possible in newer Kling / Runway modes, but quality almost always degrades after 5s. Stitch shorter clips for cleaner results.
- What FPS should I render at? - 24fps for cinematic feel, 30fps for ad cuts that intercut with phone footage. Most tools default to 24.
- Do seeds help? - Yes - if your tool exposes seed, lock it once you find a take that nearly works, then iterate prompt only.
- How do I get a head turn without face drift? - Use a tool with an explicit motion brush (Runway, Kling) and constrain the head only; leave the body unmasked or set to zero motion.
- Can I extend a good clip? - Most tools offer extend - but use it once. Two consecutive extensions usually break identity.
Common mistakes
- High motion on a long clip - drift compounds; render shorter or lower the strength.
- Vague motion (“make it move”) - the model picks the easiest motion, which is usually a slow zoom that flattens the subject.
- Mixed motion classes - asking for camera push, particle effects, and a head turn at once gives you all three poorly.
- Skipping the upscale step - the model hallucinates detail on low-res references, and that is where drift starts.
- Color-grading per clip instead of as a sequence - the cut will feel disjointed even when each shot is good.
- Trusting the first take - render 4, choose 1.
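Most of the mistakes above can be caught before you spend a render. Here is a hedged sketch of a preflight check. The numeric thresholds for “high motion” (7 or more) and “long clip” (over 4 seconds) are assumptions drawn from the ranges earlier in this guide, the function and parameter names are hypothetical, and the color-grading mistake is left out because it cannot be checked from settings alone.

```python
def preflight(strength, duration_s, motion_text, motion_classes, long_edge_px, takes):
    """Return a list of warnings; an empty list means no check fired."""
    warnings = []
    if strength >= 7 and duration_s > 4:
        warnings.append("high motion on a long clip: shorten the clip or lower strength")
    if len(motion_text.split()) < 6:
        warnings.append("vague motion description: say what moves and what stays")
    if len(set(motion_classes)) > 1:
        warnings.append("mixed motion classes: pick one per render")
    if long_edge_px < 1536:
        warnings.append("reference below 1536px on the long edge: upscale first")
    if takes < 4:
        warnings.append("fewer than 4 takes: render 4, choose 1")
    return warnings


for warning in preflight(8, 10, "make it move", ["camera", "vfx"], 1024, 1):
    print(warning)
```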