AI Video Face Changes Mid-Video

The face is fine at frame 1 and visibly different by frame 90. Most models hold facial identity for only about 3-4 seconds, so anchor with a reference image, shorten clips, and reduce motion.

You generate a 6-second clip of a person speaking at 30 fps. Frame 1 is the character you wanted. By frame 90 (~3 seconds in), it’s subtly someone else: same hair, similar clothing, but the eyes, nose, and jawline are different. By frame 150 (5 seconds), it’s clearly a different person.

This is identity drift concentrated in the face, the region viewers look at first and the one where small changes are most noticeable. The model nominally carries identity forward from earlier frames, but small per-frame errors accumulate, and on a face a slight shift in the eyes, nose, or jawline is enough to read as a different person.

Common causes

Ordered from most to least common.

1. No identity reference image

Pure text-to-video has no image to anchor identity, so the model partly reinvents the face in every frame.

How to spot it: you’re using text-to-video without a start frame.

2. Clip exceeds the model’s identity-coherence window

Most video models maintain face identity well for ~3-4s, then drift compounds rapidly. 6s+ is asking for trouble.

How to spot it: face is fine at 2s, drifts at 5s+. Coherence window is exceeded.
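
One way to confirm where the window ends on your own clips is to compare a face embedding for each frame against frame 1 and note when similarity first drops. The sketch below assumes you already have one embedding vector per frame from some face-recognition model (not shown here). The function names, the 0.8 threshold, and the 30 fps default are illustrative assumptions, not values from any specific tool.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_drift_time(embeddings, fps=30.0, threshold=0.8):
    """Return the time (seconds) of the first frame whose similarity to
    frame 0 falls below `threshold`, or None if no frame does.

    `embeddings` is a list of per-frame face vectors (hypothetical input).
    """
    if not embeddings:
        return None
    reference = embeddings[0]
    for index, embedding in enumerate(embeddings):
        if cosine_similarity(reference, embedding) < threshold:
            return index / fps
    return None
```

If the returned time sits consistently around 3-4 seconds across several generations, clip length is the likely cause rather than the prompt or the reference.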

3. Camera moves too much

Heavy camera motion (zoom in/out, fast pan) forces the model to re-derive the face from new angles each frame. Each re-derivation adds error.

How to spot it: clip has zooms, pans, dollies, or dutch angle changes.

4. Subject in motion (turning head, walking past camera)

Same as above but with subject motion. Profile shots and motion-heavy subjects drift fast.

How to spot it: subject turns head, walks across frame, or has fast emotional changes.

5. Low-resolution reference image

If the reference is only 512×512, the model has limited identity info to work from. Larger ref = more info to preserve.

How to spot it: your reference image is <1024×1024 or low quality.
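
A quick resolution check can be done without any image library, since a PNG stores its width and height in the first 24 bytes of the file. This is a minimal sketch that assumes the reference is a PNG; the helper names are hypothetical, and the 1024-pixel minimum is the threshold from the text above.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 1024  # minimum side length suggested in the text


def png_size(data):
    """Read (width, height) from the IHDR chunk at the start of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    # Width and height are two big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def reference_is_large_enough(data, min_side=MIN_SIDE):
    """True if both sides of the PNG are at least `min_side` pixels."""
    width, height = png_size(data)
    return min(width, height) >= min_side
```

To check a file on disk, read it in binary mode first, for example `reference_is_large_enough(open("character_REFERENCE.png", "rb").read())`. Pixel dimensions say nothing about sharpness or compression, so a large but blurry reference still needs a visual check.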

6. Multiple subjects compete for identity attention

Two people in frame = model has to track both. Resources split, identity drifts on the secondary character.

How to spot it: clip has multiple characters; secondary one drifts more.

Shortest path to fix

Step 1: Generate ONE canonical reference and reuse it

# Spec for a good reference
- Front-facing or slight three-quarter
- Neutral expression (no exaggerated smile or frown)
- Even daylight lighting, no dramatic shadows
- ≥1024×1024 PNG
- Save as character_REFERENCE.png — DO NOT regenerate

Reuse for every clip; never replace mid-project.

Step 2: Set it as identity anchor in the tool

# Runway Gen-3 Alpha
- Image to video → upload start frame
- Optional: also upload end frame for stronger lock

# Kling 1.6
- "Image to video" mode → reference image
- Enable "Character coherence" if available

# Pika 2.0
- "Image input" slot → reference
- Enable "Lock identity" if your version offers it

# Hailuo / Luma
- "Reference image" upload
- Highest weight setting

Step 3: Cap clip length at 3 seconds

# Override defaults
- Runway: 4s → 3s (or shortest available)
- Kling: 5s → 3s if possible, or stop at 3s manually
- Pika: 3s base; don't extend

# Strategy for longer shots
1. Storyboard the action into 3s beats
2. Generate each 3s separately using the same reference
3. Stitch with matched end-start frames in editor
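
The storyboard split above can be sketched as a small helper that cuts a shot into consecutive segments of at most 3 seconds. The function name and the tuple output format are assumptions for illustration; the generation and stitching steps still happen in your video tool and editor.

```python
def plan_beats(total_seconds, beat_seconds=3.0):
    """Split a shot into consecutive (start, end) segments, each no longer
    than `beat_seconds`. The final segment may be shorter.
    """
    total_seconds = float(total_seconds)
    if total_seconds <= 0 or beat_seconds <= 0:
        raise ValueError("durations must be positive")
    beats = []
    start = 0.0
    while start < total_seconds:
        end = min(start + beat_seconds, total_seconds)
        beats.append((start, end))
        start = end
    return beats
```

For a 10-second shot this yields three 3-second beats plus a 1-second remainder. In practice you may prefer to fold a very short final beat into the storyboard differently, since it leaves little room for action.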

Step 4: Drop motion strength

# Runway: motion 5 → 3
# Pika: 0.6 → 0.4
# Kling: "intense" → "smooth"
# Luma: high → medium

Lower motion = slower identity drift.

Step 5: Frame face large + centered

# Best for identity stability
- Half-body or medium close-up
- Face takes up >25% of vertical frame
- Face mostly facing camera (no full profile)
- Avoid extreme angles
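
The 25% guideline can be checked numerically on the start frame. This sketch assumes you have the top and bottom pixel coordinates of the face bounding box from any face detector (not shown); the function names are hypothetical.

```python
def face_height_fraction(face_top, face_bottom, frame_height):
    """Fraction of the frame's vertical extent covered by the face box."""
    if frame_height <= 0:
        raise ValueError("frame_height must be positive")
    if face_bottom < face_top:
        raise ValueError("face_bottom must not be above face_top")
    return (face_bottom - face_top) / frame_height


def framing_ok(face_top, face_bottom, frame_height, min_fraction=0.25):
    """True if the face covers more than `min_fraction` of the frame height."""
    return face_height_fraction(face_top, face_bottom, frame_height) > min_fraction
```

For example, a face spanning rows 100 to 460 in a 1080-pixel-tall frame covers one third of the height and passes, while one spanning rows 100 to 300 does not.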

Step 6: Upscale + face-restore in post if drift is mild

If drift is only slight and you’ve already invested in the clip:

# Face-restore tools
- Topaz Video AI: Face Recovery mode
- GFPGAN / CodeFormer (open-source, runs locally)
- Use original reference image as identity target if tool supports it

# Process
1. Run the clip through Topaz Video AI with Face Recovery enabled
2. If the tool accepts one, supply the reference image as the target identity
3. The tool repaints the face frame by frame, matching the reference where one was supplied

Prevention

  • Always start a character project with one high-quality reference image
  • Default to 3s clips for face-critical shots; longer = drift
  • Plan camera moves to be subtle; reserve dramatic camera for non-face shots
  • Run a final face-restore pass in Topaz on hero shots before delivery

Tags: #Video generation #Debug #Troubleshooting