You fed Runway / Kling / Pika a clean reference image — your character, your product, your scene — and the first frame of the generated clip looks great. By frame 30 the face has shifted, the outfit color has drifted, the product silhouette has changed. By frame 120 you are looking at a different person or product entirely. Image-to-video drift is the single most reported issue in 2025-2026 video generation. Fix it with the right combination of motion strength, clip length, and explicit identity anchors.
Common causes
Ordered by what causes drift most often.
1. Motion strength too high
Every image-to-video model has a knob that controls how much movement to add. Runway calls it “Motion Brush strength” or “Camera Motion” intensity; Pika has a 0-4 motion slider; Kling has “subtle/medium/intense” presets. Set too high, the model invents motion that requires inventing new geometry, and identity collapses.
How to spot it: Re-run at the lowest motion setting. If the drift drops dramatically, motion strength was the culprit.
2. Clip longer than identity coherence window
Each model has a “coherence window” — how many frames it can hold the subject before identity drifts. For 2025-2026 models:
- Runway Gen-3 Alpha: ~80 frames (~3.3s at 24fps) before noticeable drift
- Kling 1.6: ~96 frames (~4s) with high subject coherence mode
- Pika 1.5: ~72 frames (~3s) without identity anchor
- Sora: ~120 frames (~5s) for tight close-ups, less for full-body
Request a 10s clip (240 frames at 24fps) and you exceed every window above by a factor of two or more.
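The frame arithmetic behind that claim can be sketched in a few lines. This is a minimal check, assuming 24fps throughout and treating the window figures above as this guide's rough estimates rather than vendor-published limits; the function and table names are made up for illustration.

```python
# Approximate coherence windows in frames, copied from the list above.
# These are estimates from this guide, not official vendor limits.
COHERENCE_WINDOW_FRAMES = {
    "Runway Gen-3 Alpha": 80,
    "Kling 1.6": 96,
    "Pika 1.5": 72,
    "Sora": 120,
}


def frames_requested(seconds: float, fps: int = 24) -> int:
    """Convert a requested clip length to a frame count (24fps assumed)."""
    return round(seconds * fps)


def exceeds_window(model: str, seconds: float, fps: int = 24) -> bool:
    """True if the requested clip is longer than the model's estimated window."""
    return frames_requested(seconds, fps) > COHERENCE_WINDOW_FRAMES[model]


if __name__ == "__main__":
    for model, window in COHERENCE_WINDOW_FRAMES.items():
        requested = frames_requested(10)
        print(f"{model}: 10s = {requested} frames, window = {window}, "
              f"exceeds = {exceeds_window(model, 10)}")
```

A 3s request on Pika 1.5 lands exactly on the 72-frame estimate, so it sits at the edge of the window rather than past it.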
3. Reference image too low resolution
If the reference image is 512x512 or has heavy JPEG compression, the model interprets blurry edges as semantic ambiguity (“is that a collar or a scarf?”) and resolves them differently every frame. The result reads as drift.
How to spot it: Open the reference image at 100%. Are edges crisp? Any compression artifacts? File size under 500KB for a 1024px image suggests heavy compression.
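If you would rather screen references before opening them, the same heuristics can be written down as a small check. This is a sketch only: the function name is hypothetical, the width, height, file size, and format are assumed to come from your image tool of choice, and the 1024px and 500KB thresholds are the rough figures used in this guide, not hard limits.

```python
def check_reference(width: int, height: int, file_size_bytes: int, fmt: str) -> list[str]:
    """Return human-readable warnings for a reference image (heuristic only)."""
    warnings = []
    short_side = min(width, height)

    # Low resolution: blurry edges get resolved differently frame to frame.
    if short_side < 1024:
        warnings.append(f"short side is {short_side}px, below 1024px")

    # JPEG is lossy, so compression artifacts are possible.
    if fmt.upper() in {"JPEG", "JPG"}:
        warnings.append("JPEG source: inspect edges at 100% for artifacts")

    # A very small file at 1024px or more hints at heavy compression.
    # 500KB is taken as 500 * 1024 bytes here (an assumption).
    if short_side >= 1024 and file_size_bytes < 500 * 1024:
        warnings.append("file under 500KB at this resolution: possible heavy compression")

    return warnings


if __name__ == "__main__":
    for line in check_reference(512, 512, 80_000, "JPEG"):
        print("warning:", line)
```

An empty list does not prove the image is crisp; it only means none of these cheap signals fired, so the 100% zoom check still applies.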
4. Prompt contradicts reference
Reference image shows a blonde woman; prompt says “young woman with auburn hair.” The model has two conflicting signals and resolves them inconsistently across frames.
How to spot it: Read your prompt next to the reference. Any attribute named in the prompt that does not match the image? That is a fight.
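For prompts with many attributes, the side-by-side read can be made mechanical. The sketch below assumes you write down the reference image's attributes and the prompt's attributes by hand as simple key/value pairs; nothing here inspects the image or parses the prompt, and the function name is illustrative.

```python
def find_conflicts(reference_attrs: dict[str, str],
                   prompt_attrs: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return attributes where the prompt names a value the reference does not show.

    Each result maps attribute -> (value in reference, value in prompt).
    """
    return {
        key: (reference_attrs[key], value)
        for key, value in prompt_attrs.items()
        if key in reference_attrs and reference_attrs[key].lower() != value.lower()
    }


if __name__ == "__main__":
    reference = {"hair": "blonde", "jacket": "red leather"}
    prompt = {"hair": "auburn", "jacket": "red leather"}
    print(find_conflicts(reference, prompt))
```

Attributes that appear only in the prompt are ignored here, since they add detail rather than contradict the image.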
5. Subject too small in reference
If the subject occupies less than 30% of the reference image, the model has limited identity anchor data to work from and drifts faster.
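One rough way to estimate that share is to draw a bounding box around the subject and compare areas. This sketch assumes a rectangular bounding box is a fair proxy for how much of the frame the subject "occupies"; the 30% threshold is the figure from this guide and the function names are hypothetical.

```python
def subject_fraction(image_w: int, image_h: int, box_w: int, box_h: int) -> float:
    """Fraction of the image area covered by the subject's bounding box."""
    return (box_w * box_h) / (image_w * image_h)


def is_subject_too_small(image_w: int, image_h: int, box_w: int, box_h: int,
                         threshold: float = 0.30) -> bool:
    """True if the subject covers less than the threshold share of the image."""
    return subject_fraction(image_w, image_h, box_w, box_h) < threshold


if __name__ == "__main__":
    # A 400x600 subject inside a 1024x1024 reference.
    share = subject_fraction(1024, 1024, 400, 600)
    print(f"subject covers {share:.0%} of the frame")
    print("too small:", is_subject_too_small(1024, 1024, 400, 600))
```

If the check fails, cropping the reference tighter around the subject raises the share without needing a new image.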
6. Multiple subjects in reference
With two or more people or objects in the reference, the model can swap which one it tracks across frames. Group reference images are the highest-risk case.
Before you change anything
- Save the reference image, full prompt, motion settings, and the drifting output clip.
- Note which model and tier you are on (Pika 1.5 vs 1.0, Runway Gen-3 Alpha vs Turbo).
- Decide your target clip length and how much identity drift is acceptable for the use case (B-roll tolerates more than hero shots).
- Confirm the reference image is at least 1024px on the short side and crisp.
- Commit or back up the current reference image and prompt before changing them.
Information to collect
- Reference image at native resolution, full prompt, motion strength, clip length.
- Model name and version.
- A side-by-side of the first frame vs the drifted frame to quantify the gap.
- Whether the same reference produces drift on a different model.
- Final-cut requirement: hero, B-roll, or background — different tolerances apply.
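Keeping these details in one structured record makes runs easier to compare and to share later. The sketch below is one possible layout, not a required format; every field name is an assumption chosen to mirror the checklist above.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DriftReport:
    """One record per drifting generation (field names are illustrative)."""
    model: str
    model_version: str
    prompt: str
    motion_strength: str
    clip_seconds: float
    reference_image: str   # path to the reference at native resolution
    first_frame: str       # path to the exported first frame
    drifted_frame: str     # path to the frame where drift is visible
    final_cut_role: str    # "hero", "b-roll", or "background"
    drifts_on_other_model: Optional[bool] = None  # None = not tested yet

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    report = DriftReport(
        model="Kling", model_version="1.6",
        prompt="the same red ceramic mug from the reference, rotating slowly",
        motion_strength="subtle", clip_seconds=3.0,
        reference_image="mug.png", first_frame="frame_001.png",
        drifted_frame="frame_072.png", final_cut_role="hero",
    )
    print(report.to_json())
```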
Shortest path to fix
Step 1: Re-export the reference at native resolution
Make sure the reference is at least 1024px on the shortest side, saved as a PNG (not JPEG), with the subject centered and clearly visible. Crop out background clutter, watermarks, or text overlays. The reference is the most important variable; under-investing here makes every other step harder.
For people: head and shoulders or chest-up framing, neutral pose. For products: clean background, single object, no reflections from other objects.
Step 2: Set motion strength to the lowest preset
- Runway: Motion Brush strength 1-2, Camera Motion “static” or “slow”
- Pika: motion slider at 0.3-0.5, not 1.5+ (stay near the bottom of the 0-4 range)
- Kling: “subtle” preset
- Sora: shortest duration
Then regenerate. If identity holds, dial up gradually. Most drift cases are solved here.
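The "start low, dial up gradually" loop can be sketched as a simple walk over the available settings. The `holds_identity` callback is a placeholder for whatever you do in practice (generate a clip at that level, then review it by eye); it is not a real API of any of these tools, and the preset names in the usage example are the Kling ones mentioned earlier.

```python
from typing import Callable, Optional, Sequence


def highest_safe_motion(levels: Sequence[str],
                        holds_identity: Callable[[str], bool]) -> Optional[str]:
    """Walk levels from lowest to highest motion.

    Returns the last level at which identity held, or None if even the
    lowest level drifted. Stops at the first failure.
    """
    safe = None
    for level in levels:
        if not holds_identity(level):
            break
        safe = level
    return safe


if __name__ == "__main__":
    # Pretend review results: identity held at "subtle" and "medium" only.
    reviewed = {"subtle": True, "medium": True, "intense": False}
    print(highest_safe_motion(["subtle", "medium", "intense"], reviewed.__getitem__))
```

A result of `None` means motion strength is not the main cause, and the later steps (clip length, reference quality) are the next place to look.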
Step 3: Cap clip length at 3 seconds
Generate 3-second clips, then concatenate. Each segment after the first uses the previous segment’s last frame as its reference image, which carries identity forward across the full sequence.
Clip A: image-to-video (reference = original image, 3s)
Export last frame of Clip A as image
Clip B: image-to-video (reference = last frame of A, 3s)
Concatenate in CapCut / Premiere
This “chained reference” workflow gets you to 10-20s of coherent output, which single-shot generation cannot deliver.
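For longer sequences it helps to plan the chain before generating anything. The sketch below lays out the segments and builds (but does not run) an ffmpeg command for exporting a clip's last frame. Using ffmpeg is a swap-in for exporting the frame by hand in your editor; the file names are hypothetical, and the flags should be checked against your installed ffmpeg version.

```python
import math


def plan_chain(total_seconds: float, segment_seconds: float = 3.0) -> list[dict]:
    """Split a target duration into chained segments.

    Segment 0 uses the original reference; each later segment uses the
    previous segment's exported last frame.
    """
    count = math.ceil(total_seconds / segment_seconds)
    plan = []
    for i in range(count):
        remaining = total_seconds - i * segment_seconds
        reference = "original.png" if i == 0 else f"clip_{i - 1:02d}_last.png"
        plan.append({
            "clip": f"clip_{i:02d}.mp4",
            "reference": reference,
            "seconds": min(segment_seconds, remaining),
        })
    return plan


def last_frame_command(clip: str, out_png: str) -> list[str]:
    """Build an ffmpeg command that keeps only the final frame of a clip.

    -sseof -1 starts decoding one second before the end; -update 1 keeps
    overwriting a single output image, so the last frame is what remains.
    """
    return ["ffmpeg", "-y", "-sseof", "-1", "-i", clip, "-update", "1", out_png]


if __name__ == "__main__":
    for step in plan_chain(10):
        print(step)
    print(" ".join(last_frame_command("clip_00.mp4", "clip_00_last.png")))
```

Because each segment inherits whatever small drift the previous one ended with, it is worth comparing every exported last frame against the original before using it as the next reference.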
Step 4: Add explicit identity description to the prompt
Even with a reference image, add a text description that names the subject:
the same blonde woman from the reference image, red leather jacket,
slight head turn, no camera movement, identity preserved across frames
For products:
the same red ceramic mug from the reference, rotating slowly on its axis,
shape and color preserved, no morphing
This dual-anchor approach (image + text) significantly reduces drift.
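If you generate many clips, a small template keeps the identity wording consistent between runs. This is a sketch of one way to assemble such prompts; the function and its parameters are illustrative, and the phrasing simply mirrors the examples above rather than any model's documented prompt syntax.

```python
def identity_prompt(subject: str, attributes: list[str],
                    motion: str, guards: list[str]) -> str:
    """Compose a prompt that names the subject, its key attributes,
    the intended motion, and explicit preservation phrases."""
    parts = [f"the same {subject} from the reference image", *attributes, motion, *guards]
    # Skip empty strings so optional pieces can be left blank.
    return ", ".join(part for part in parts if part)


if __name__ == "__main__":
    print(identity_prompt(
        subject="blonde woman",
        attributes=["red leather jacket"],
        motion="slight head turn",
        guards=["no camera movement", "identity preserved across frames"],
    ))
```

Only list attributes that are actually visible in the reference; naming anything else reintroduces the prompt-versus-reference conflict described earlier.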
Step 5: Switch to a model with stronger identity preservation
If drift persists at lowest motion + shortest clip + sharpened reference, the model itself is the bottleneck. As of 2025-2026:
- For human identity: Kling 1.6 “high subject coherence” mode
- For product identity: Runway Gen-3 with Motion Brush locked to background only
- For full-scene preservation: try Sora at shortest tier
Step 6: Use Runway Motion Brush or Kling reference lock
Both Runway and Kling expose a “lock subject” or “motion brush” feature that restricts motion to a region you choose. With a motion brush, you paint the area that is allowed to move, and everything outside it stays still. For talking-head shots, paint only the head so the body stays locked.
How to confirm the fix
- Compare frame 1 and the last frame side-by-side. The subject should be recognizably the same.
- Watch the clip at 25% speed. Any frame-to-frame jumps in face, color, or shape are drift.
- Three clips generated at the same settings should all hold identity, not just one lucky output.
- A teammate seeing only the final clip (no reference) should be able to match it back to the reference image.
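The side-by-side comparison can be backed by a crude number when you need to compare several runs. The sketch below measures only average color shift between the first and last frame, which is a proxy for outfit or product color drift and says nothing about face identity or shape. Frames are assumed to be already loaded as lists of RGB tuples, and the threshold in the usage example is an arbitrary placeholder to tune per project.

```python
Pixel = tuple[int, int, int]


def mean_color(pixels: list[Pixel]) -> tuple[float, ...]:
    """Average R, G, B over all pixels of a frame (or a cropped subject region)."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_drift(first: list[Pixel], last: list[Pixel]) -> float:
    """Largest per-channel difference between the two frames' mean colors."""
    a, b = mean_color(first), mean_color(last)
    return max(abs(x - y) for x, y in zip(a, b))


def all_hold(drifts: list[float], threshold: float) -> bool:
    """True only if every run stays within the threshold, not just one lucky one."""
    return all(d <= threshold for d in drifts)


if __name__ == "__main__":
    first = [(200, 30, 30)] * 4   # red jacket region, frame 1
    last = [(180, 30, 30)] * 4    # same region, last frame
    print("drift:", color_drift(first, last))
    print("all three runs hold:", all_hold([2.0, 3.5, 20.0], threshold=10.0))
```

Cropping both frames to the subject before measuring keeps background changes from masking or exaggerating the shift.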
If it still fails
- Reduce the clip to 2 seconds and re-run at the lowest motion setting. If 2s still drifts, the reference image itself is the problem.
- Try a much more constrained prompt, and dial out all camera moves:
static shot, minimal motion, identity preserved
- Use a different reference image of the same subject; sometimes a different angle or framing produces dramatically better coherence.
- Switch to a fundamentally different model.
- Package the reference, prompt, motion settings, and the drifted clip before posting to community channels.
Prevention
- Always start at the strictest motion setting and loosen up only after you verify identity holds.
- Standardize reference image format: 1024-1536px, PNG, neutral background, single subject.
- For any clip over 3s, plan as a chain of 3s segments, not one long generation.
- For brand or product video, lock identity with both reference image AND text description naming key attributes.
- Maintain a per-model “coherence window” doc so you do not request longer clips than the model can hold.
Related
- AI video motion jitter
- AI video subject morphing
- AI image to video not following
- ChatGPT prompt improvement
- Refactor prompts