You’ve written the same character description ten ways — 28-year-old woman with shoulder-length dark hair, brown eyes, small nose, light freckles — and you’ve gotten ten different women. None of them are the same person. Same prompt, same model, different face every run.
Text alone cannot describe one specific person uniquely. There are millions of women who fit that description. To lock identity, you need a visual anchor (reference image, LoRA, or character embedding) plus deterministic settings (fixed seed + same model + same sampler).
Common causes
Ordered by hit rate, highest first.
1. Text-only description is too low-bandwidth
A 30-word description maps to billions of possible faces. The model picks one that matches the description loosely, with everything else (jaw shape, eye spacing, mouth width, ear position) randomized per seed.
How to spot it: your prompt has only text and no reference image / LoRA / --cref / IP-Adapter. Identity will drift, period.
2. Different seed each generation
If you don’t fix the seed, every run starts from different random noise. Even with the same prompt, different noise produces different faces. The seed is the single biggest lever for repeatability.
How to spot it: in your tool’s UI, seed is set to “random,” “−1,” or “auto.”
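A toy sketch of why an unfixed seed changes the result. This is not a diffusion model: it only stands in for the starting noise a sampler draws before denoising. `initial_noise` is a hypothetical helper written for this illustration, and the seed values are arbitrary.

```python
import random


def initial_noise(seed, n=8):
    """Return n Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


fixed_a = initial_noise(12345)
fixed_b = initial_noise(12345)
other = initial_noise(12346)

print(fixed_a == fixed_b)  # same seed: the starting noise is identical
print(fixed_a == other)    # different seed: the starting noise differs
```

Real tools behave the same way in principle: with the seed pinned, the starting noise is reproducible, so any remaining difference comes from the prompt, model, or sampler.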
3. Different model / version / sampler
Switching from SDXL to Flux, or from DPM++ 2M Karras to Euler a, changes the face even at the same seed. Identity drift between sessions often traces to a silently updated tool default.
How to spot it: same seed, same prompt, different face — check whether the tool autoupdated the model or sampler since last session.
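One way to run that check is to save your settings each session and diff them. The sketch below assumes you record settings as a plain dictionary. `diff_settings` and the example values (including the model name) are illustrative, not the output of any particular tool.

```python
def diff_settings(previous, current):
    """Return {key: (old, new)} for every setting that differs between sessions."""
    keys = sorted(set(previous) | set(current))
    return {
        key: (previous.get(key), current.get(key))
        for key in keys
        if previous.get(key) != current.get(key)
    }


# Hypothetical settings captured from two sessions.
last_session = {"model": "sdxl-base-1.0", "sampler": "DPM++ 2M Karras", "seed": 12345, "steps": 30}
this_session = {"model": "sdxl-base-1.0", "sampler": "Euler a", "seed": 12345, "steps": 30}

for key, (old, new) in diff_settings(last_session, this_session).items():
    print(f"{key}: {old!r} -> {new!r}")
```

An empty result means the recorded settings match, so the drift is coming from something you are not recording yet.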
4. Description focuses on adjectives, not identifying features
"brown eyes, dark hair, small nose" describes 30% of humans. "single mole below right eye, slightly crooked front tooth, narrow chin, gold septum ring" describes one. The more uniquely identifying markers you include, the more tightly identity locks.
How to spot it: read your description back. Does it identify one person or a category?
5. No character LoRA / embedding when you need recurring use
For a character that appears 50+ times across a project, hand-tweaking each generation is hopeless. You need a LoRA or embedding trained on 15-30 images of that character.
How to spot it: you’re producing a comic, series, or storybook and the character must stay consistent across many images.
Shortest path to fix
Ordered by leverage. Steps 1 and 2 alone fix roughly 80% of cases.
Step 1: Pick one canonical reference image
Generate (or pick) one image of the character you love. This is your “model sheet.” Use it as the anchor for every downstream generation.
Best practices for the model sheet:
- Front-facing, neutral expression, three-quarter portrait
- Plain background (so the face dominates the embedding signal)
- Distinctive identity markers visible (scar, hairstyle, accessory)
- High resolution (1024×1024+ ideal)
Step 2: Feed the reference into every generation
Each platform has its own mechanism:
# Midjourney v6 / v6.1 (in V7, Omni Reference --oref replaces --cref)
"a [character] sitting on a bench, [scene]" --cref [URL of model sheet] --cw 100
# Stable Diffusion / SDXL via Forge / ComfyUI
- Load IP-Adapter Plus Face
- Connect model sheet to IP-Adapter
- Weight: 0.8-1.0 for identity, lower for soft influence
# Flux dev
- Use Flux Redux or PuLID-Flux
- Reference strength 0.7-0.85
# DALL-E / Bing
- No reference image support — use ChatGPT canvas memory or
text-only with strong identity markers (workaround only)
Step 3: Lock the seed within a session
When generating multiple poses of the same character in one session:
- Midjourney: append --seed 12345 to every prompt
- SDXL / Flux UI: set seed to a fixed integer instead of -1
- ComfyUI: pin the KSampler’s seed and uncheck “randomize”
Note: seed alone won’t lock identity across different prompts — but it stabilizes the noise so the identity drift is smaller per change.
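If you write many Midjourney prompts for one character, a small helper keeps the reference and seed flags identical every time. This is a minimal sketch: `build_mj_prompt` is a hypothetical function, the URL is a placeholder, and it only assembles the prompt string using the `--cref`, `--cw`, and `--seed` flags shown above. It does not call Midjourney.

```python
def build_mj_prompt(scene, cref_url, cw=100, seed=12345):
    """Append the character reference, its weight, and a fixed seed to a scene prompt."""
    if not 0 <= cw <= 100:
        raise ValueError("--cw is expected to be between 0 and 100")
    return f"{scene} --cref {cref_url} --cw {cw} --seed {seed}"


MODEL_SHEET_URL = "https://example.com/model-sheet.png"  # placeholder, not a real image

for scene in ("a woman sitting on a bench, autumn park", "a woman reading in a cafe"):
    print(build_mj_prompt(scene, MODEL_SHEET_URL))
```

Keeping the flags in one place means a change to the model sheet URL or the seed carries over to every prompt in the session.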
Step 4: Use identifying markers in the prompt
Bad description (too generic):
"28-year-old woman with shoulder-length dark hair, brown eyes, small nose, light freckles"
Good description (identity-locking):
"28-year-old woman, asymmetric chin-length bob with hair tucked behind left ear,
sharp jawline, single small mole below right eye, narrow nose with slight bump,
warm brown eyes with green flecks, three small ear piercings on the upper left ear"
Each unique marker shrinks the space of matching faces, by roughly an order of magnitude as a rule of thumb. Three or four markers usually narrow it enough that the character stays recognizable across generations.
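The order-of-magnitude rule of thumb can be worked through as simple arithmetic. The starting pool of one million matching faces and the factor of 10 per marker are assumptions for illustration, not measurements. `remaining_candidates` is a hypothetical helper.

```python
def remaining_candidates(pool_size, markers, factor_per_marker=10):
    """Estimate how many faces still match after adding `markers` distinctive markers."""
    return max(1, pool_size // factor_per_marker ** markers)


pool = 1_000_000  # assumed number of faces matching the generic description
for count in range(5):
    print(count, "markers ->", remaining_candidates(pool, count), "matching faces")
```

Under these assumptions, three markers cut a million candidates down to about a thousand, which is why a handful of specific markers matters more than a long list of generic adjectives.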
Step 5: Train a LoRA for production use
If the character will appear across many images / scenes / sessions:
# Quick recipe (Kohya / SimpleTuner)
1. Generate 15-30 reference images of the character (different poses, expressions, outfits)
2. Caption each one — include a unique trigger token like "sks_alice"
3. Train a LoRA: 1500-3000 steps, learning rate 1e-4, batch 1
4. Inference: load LoRA, use trigger token in prompt
Online services like Civitai, Replicate, or Astria offer one-click LoRA training if you don’t want local setup.
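Recipe step 2 (captioning) is easy to script. The sketch below assumes your trainer reads a sidecar `.txt` caption with the same base name as each image, which is a common convention. Check your trainer's configured caption extension before relying on it. `write_captions` is a hypothetical helper, and it writes one shared description. In practice you would vary the caption per image to describe pose and outfit.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_captions(image_dir, trigger, description):
    """Write '<trigger>, <description>' to a .txt file beside each image.

    Returns the list of caption paths written.
    """
    written = []
    # Snapshot the directory first so newly written captions are not re-scanned.
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            caption_path = image.with_suffix(".txt")
            caption_path.write_text(f"{trigger}, {description}\n", encoding="utf-8")
            written.append(caption_path)
    return written
```

A call such as `write_captions("dataset/alice", "sks_alice", "woman with a chin-length bob")` would then place the trigger token at the start of every caption. The directory name here is only an example.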
Step 6: Use ChatGPT/DALL-E recurring character feature
For DALL-E (ChatGPT), generate the character once and then say “use the same character” in subsequent turns within the same chat. The model has limited memory but will preserve broad identity for follow-up images.
Prevention
- Always start a character project by generating + saving one canonical model sheet
- Build a character spec file: model, sampler, seed, reference URL, trigger token (if LoRA), distinguishing markers
- For projects with 30+ images of one character, invest in LoRA training upfront — it pays back fast
- Keep all generated images in a folder organized by character, so you always have backup references
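The character spec file from the list above can be a small JSON document. This is one possible layout, not a standard format: the `CharacterSpec` class, its field names, and the example values (including the placeholder URL) are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class CharacterSpec:
    name: str
    model: str
    sampler: str
    seed: int
    reference_url: str
    trigger_token: Optional[str] = None  # only set when a LoRA is in use
    markers: List[str] = field(default_factory=list)


def save_spec(spec, path):
    """Write the spec to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(asdict(spec), handle, indent=2)


def load_spec(path):
    """Read a spec previously written by save_spec."""
    with open(path, encoding="utf-8") as handle:
        return CharacterSpec(**json.load(handle))


spec = CharacterSpec(
    name="alice",
    model="SDXL",
    sampler="DPM++ 2M Karras",
    seed=12345,
    reference_url="https://example.com/model-sheet.png",  # placeholder
    trigger_token="sks_alice",
    markers=["asymmetric chin-length bob", "single small mole below right eye"],
)
print(json.dumps(asdict(spec), indent=2))
```

Saving the spec next to the character's image folder lets you reload the exact model, sampler, and seed at the start of each session.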