You locked the seed to 42, copied the prompt verbatim, and ran it again next week. The new image is close but not identical — different hair flyaways, a different earring, a slightly shifted horizon. You assumed seed alone was a hash of the output, so this feels like a bug. It is not. A diffusion image is a deterministic function of seed PLUS sampler PLUS scheduler PLUS model weights PLUS CFG PLUS steps PLUS resolution PLUS every LoRA / ControlNet / refiner pass. Change any one of those — including silent provider updates to the model file — and you get a different image. Reproducibility is a discipline, not a feature.
Common causes
Ordered by how often each one is the actual culprit when “same seed” fails.
1. Model version silently rolled forward
Hosted providers (Midjourney, Firefly, DALL-E, hosted SDXL) update the base model on their own cadence. The seed still produces the same starting noise, but the new weights denoise it into a different image. You did nothing wrong; the back-end shifted under you.
How to spot it: Check the provider’s release notes / version dropdown. If the version string today is different from when you saved the original, the model changed.
2. Sampler or scheduler swapped
DPM++ 2M Karras and Euler a with the same seed produce different images. Some UIs default to “auto” which picks a sampler based on resolution or model — meaning the sampler may have flipped without you noticing.
How to spot it: Open the saved PNG metadata (or the API request log) and compare the sampler name to today’s request. Mismatched samplers = different image.
3. Steps, CFG, or resolution changed by one tick
A CFG of 7.0 vs 7.5, or 28 vs 30 steps, or 1024x1024 vs 1024x1080, all produce visibly different outputs from the same seed. UI sliders often round invisibly; copy-paste from a numeric field is safer.
How to spot it: Diff the full parameter dump byte-for-byte, not by eye.
4. LoRA / ControlNet / refiner pass added or weighted differently
A LoRA at strength 0.6 vs 0.7 makes a noticeable shift. A refiner pass at switch fraction 0.8 vs 0.75 changes high-frequency detail. ControlNet preprocessing (canny edge, depth map) is itself seed-sensitive in some pipelines.
How to spot it: List every adapter, LoRA, embedding, and post-process pass with exact weights. If any is missing or differs, that is the cause.
5. CUDA / hardware nondeterminism
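PyTorch’s defaults favor speed over determinism: torch.backends.cudnn.deterministic is False by default, which allows cuDNN to use nondeterministic kernels, and with cudnn.benchmark enabled the library may pick a different algorithm from run to run. The resulting tiny floating-point differences compound across diffusion steps, so the same seed on the same GPU can produce slightly different outputs.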
How to spot it: Run locally with torch.use_deterministic_algorithms(True) and CUBLAS_WORKSPACE_CONFIG=:4096:8. If reproducibility improves, this was a contributor.
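The mechanism behind this cause is that floating-point addition is not associative, so two kernels that add the same numbers in a different order can disagree in the last bits. The sketch below shows the effect in plain Python; it illustrates the arithmetic only and does not model any particular GPU kernel. The helper name sum_in_order is made up for this example.

```python
def sum_in_order(values, order):
    """Add values[i] for each i in order, using ordinary float addition."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Same three numbers, two groupings.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False: the grouping changes the last bit

# Same three numbers, two reduction orders, as two different kernels might use.
values = [1e16, 1.0, -1e16]
a = sum_in_order(values, [0, 1, 2])  # (1e16 + 1.0) - 1e16 -> 0.0, the 1.0 is rounded away
b = sum_in_order(values, [0, 2, 1])  # (1e16 - 1e16) + 1.0 -> 1.0
print(a, b)
```

A single diffusion step performs millions of such reductions, and each step feeds the next, which is why a last-bit difference can grow into visibly different fine detail.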
6. Prompt was edited without you realizing
A trailing space, a curly quote vs straight quote, an em-dash vs hyphen — tokenizers see these as different tokens. Copy-pasting from rich-text apps (Notion, Google Docs) often silently rewrites punctuation.
How to spot it: Diff the prompt strings with diff or cmp. Visual inspection is unreliable.
7. Provider applies hidden post-processing
Some hosted APIs run a face enhancer, upscaler, or safety filter on the way out. These passes are not seeded and add randomness. You get a deterministic core image plus a non-deterministic polish.
How to spot it: Request the raw / unprocessed image if the API supports it (enhance: false, upscale: 1x). If raw is stable and processed is not, the post-pass is the culprit.
Before you start
- Save the original “good” image and its full metadata (PNG parameters chunk, or the API response JSON) before doing any new runs.
- Note exactly which provider, model name, and version string produced the original.
- Capture the current provider version string today, before you re-run.
- Decide your target: bit-identical reproduction is rare on hosted APIs; “visually indistinguishable” is the realistic goal.
Information to collect
- Seed value (integer; confirm it is being passed as int, not a string "42").
- Sampler name and scheduler name, exact string.
- Steps, CFG, resolution width and height.
- Full prompt string and negative prompt string, byte-for-byte.
- Every LoRA, embedding, ControlNet, IP-Adapter with weight value.
- Model file name + hash (for local) or version string (for hosted).
- Hardware: GPU model, CUDA version, torch version (for local).
Step-by-step fix
Ordered from least to most invasive.
Step 1: Diff the parameter dumps
Save both metadata sets to text files and diff them:
exiftool -PNG:parameters good.png > good.txt
exiftool -PNG:parameters bad.png > bad.txt
diff good.txt bad.txt
Any line that differs is a suspect. If even one parameter mismatches, fix that first before moving on.
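If your metadata is JSON (for example a saved API response) rather than a PNG text chunk, a key-by-key comparison is easier to read than a line diff. The following is a minimal sketch; the function name diff_params and the sample values are illustrative, and the keys simply mirror the request fields used elsewhere in this guide.

```python
import json

_MISSING = "<missing>"  # placeholder shown when a key exists on one side only

def diff_params(good, new):
    """Return {key: (good_value, new_value)} for every key whose value differs."""
    changed = {}
    for key in sorted(set(good) | set(new)):
        a = good.get(key, _MISSING)
        b = new.get(key, _MISSING)
        if a != b:  # note: 42 != "42", so a seed passed as a string is caught
            changed[key] = (a, b)
    return changed

good = json.loads('{"seed": 42, "sampler": "DPM++ 2M Karras", "steps": 30, "cfg_scale": 7.0}')
new = json.loads('{"seed": "42", "sampler": "Euler a", "steps": 30, "cfg_scale": 7.0, "refiner": "on"}')

for key, (a, b) in diff_params(good, new).items():
    print(f"{key}: {a!r} -> {b!r}")
```

Here the report lists refiner, sampler, and seed, while steps and cfg_scale are silent because they match.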
Step 2: Pin the model version explicitly
In hosted APIs, pass the version ID, not just the model name:
{
  "model": "stable-diffusion-xl-base-1.0",
  "model_version": "39ed52f2-a78c-4ce1-9c1d-aaaaaaaaaa",
  "seed": 42
}
If the provider has no version pin, switch to a provider that does (Replicate, RunPod, your own deployment).
Step 3: Lock the sampler and scheduler
Explicitly specify both. Do not let “auto” choose:
{
  "sampler": "DPM++ 2M Karras",
  "scheduler": "karras",
  "steps": 30,
  "cfg_scale": 7.0
}
If your UI hides these, switch to the API or a UI that exposes them (ComfyUI, InvokeAI, AUTOMATIC1111).
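One way to make Steps 2 and 3 stick is to check every request before it leaves your client. The sketch below assumes the request is a plain dict using the field names from the JSON examples above; real providers name these fields differently, and check_pinned is a hypothetical helper, not part of any SDK.

```python
REQUIRED_KEYS = ("model", "model_version", "seed", "sampler", "scheduler", "steps", "cfg_scale")

def check_pinned(request):
    """Return a list of problems; an empty list means every field is pinned."""
    problems = []
    for key in REQUIRED_KEYS:
        if request.get(key) is None:
            problems.append(f"{key} is not set")
    for key in ("sampler", "scheduler"):
        if str(request.get(key, "")).strip().lower() == "auto":
            problems.append(f'{key} is "auto"; name it explicitly')
    seed = request.get("seed")
    # bool is a subclass of int in Python, so reject it explicitly.
    if seed is not None and (isinstance(seed, bool) or not isinstance(seed, int)):
        problems.append(f"seed should be an int, got {type(seed).__name__}")
    return problems

request = {
    "model": "stable-diffusion-xl-base-1.0",
    "seed": "42",        # string instead of int
    "sampler": "auto",   # lets the back-end choose
    "scheduler": "karras",
    "steps": 30,
    "cfg_scale": 7.0,
}
for problem in check_pinned(request):
    print(problem)
```

Refusing to send a request while the list is non-empty turns a silent default into a visible error.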
Step 4: Strip silent randomness from the pipeline
Disable face restorers, upscalers, and refiners for the reproducibility test. Run the base model only. Once base is stable, re-add each post-process one at a time and verify each is also seedable.
{
  "enhance_face": false,
  "upscale": 1,
  "refiner": null
}
Step 5: Enforce torch determinism (local pipelines)
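import os
# Set this before the first cuBLAS call; doing it before importing torch is the safest order.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
import torch
# warn_only=True logs a warning for ops without a deterministic implementation instead of raising.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False  # disable autotuning, which can select different kernels per run
# Pass this generator to the pipeline call so the initial noise is seeded.
generator = torch.Generator(device="cuda").manual_seed(42)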
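This typically costs around 10-20 percent of inference speed. In exchange, outputs should be bit-identical on the same hardware with the same driver and library versions. Because warn_only=True only logs a warning when an operation has no deterministic implementation, check the log: any such warning means that operation can still vary between runs.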
Step 6: Treat the prompt as bytes, not text
Store the prompt in a text file alongside the parameter JSON. Never retype from a screenshot. Diff with cmp:
cmp good_prompt.txt new_prompt.txt
Differences here are usually invisible whitespace or punctuation substitution. See "Long prompt worse results" for prompt-fragility patterns.
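cmp tells you where two files diverge but not what the offending character is. A small sketch like the one below names it; first_difference is an illustrative helper built on the standard unicodedata module, and the two prompts are made-up examples.

```python
import unicodedata

def first_difference(a, b):
    """Return (index, description_in_a, description_in_b) for the first mismatch, or None."""
    def describe(text, i):
        if i >= len(text):
            return "<end of string>"
        ch = text[i]
        return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"

    for i in range(max(len(a), len(b))):
        if i >= len(a) or i >= len(b) or a[i] != b[i]:
            return i, describe(a, i), describe(b, i)
    return None

good = 'portrait, "golden hour" light'
new = 'portrait, \u201cgolden hour\u201d light'  # quotes rewritten by a rich-text app
print(first_difference(good, new))
```

The report names the straight quote on one side and the left curly quote on the other, which is the kind of substitution that is easy to miss by eye.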
Step 7: Snapshot model weights for true long-term reproducibility
For projects where year-over-year reproducibility matters, download the model file (.safetensors) and store its sha256. Hosted versions disappear; your local snapshot does not.
sha256sum sd_xl_base_1.0.safetensors > sd_xl_base_1.0.sha256
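If you want to re-check the snapshot from inside a Python pipeline rather than the shell, the same digest can be computed with hashlib. This is a sketch with illustrative function names; it reads the file in chunks because model files are several gigabytes.

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream the file through SHA-256 so it never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_snapshot(path, recorded_line):
    """Compare against a recorded line in sha256sum format ('<hex>  <filename>')."""
    expected = recorded_line.split()[0].lower()
    return sha256_of_file(path) == expected
```

Calling verify_snapshot with the model path and the contents of the .sha256 file before loading the weights confirms that the file on disk is still the one you recorded.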
Verify
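- Generate the same prompt + seed twice in a row and hash both files (shasum -a 256 a.png b.png). On a fully pinned pipeline the hashes match, provided the embedded metadata is identical too: shasum hashes the whole file, not just the pixel buffer, so a differing timestamp or parameters chunk changes the hash even when the pixels are equal.
- Generate the same prompt + seed across two sessions / two days. Hashes should still match unless the provider rolled the model.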
- Change one parameter intentionally (e.g. cfg 7.0 → 7.5) and confirm the output now differs. Proves your pipeline is responsive, not stuck on a cached image.
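When the file hashes differ only because of embedded metadata, it helps to hash the image data alone. The sketch below uses only the standard library: it walks the PNG chunks, keeps IHDR, PLTE, tRNS, and the decompressed IDAT stream, and ignores text chunks. One caveat: the IDAT stream holds filtered scanlines, so two different encoders can produce different hashes for identical pixels. For a strict pixel comparison across tools, decode with an image library instead. The function names are illustrative.

```python
import hashlib
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(png_bytes):
    """Yield (chunk_type, chunk_data) for each chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        yield chunk_type, png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 bytes length + 4 bytes type + data + 4 bytes CRC

def image_data_hash(png_bytes):
    """SHA-256 over header, palette, and decompressed image data; metadata is skipped."""
    parts = []
    idat = []
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type in (b"IHDR", b"PLTE", b"tRNS"):
            parts.append(chunk_type + data)
        elif chunk_type == b"IDAT":
            idat.append(data)
    # IDAT chunks form one zlib stream, so join them before decompressing.
    parts.append(zlib.decompress(b"".join(idat)))
    return hashlib.sha256(b"".join(parts)).hexdigest()
```

Read each file with open(path, "rb").read() and compare the two results; equal values mean the encoded image data matches even if the parameters text differs.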
Long-term prevention
- Treat the parameter set as a tuple (model_version, seed, sampler, scheduler, steps, cfg, width, height, prompt_sha, negprompt_sha, adapters_json). Log every tuple alongside the output.
- Store the full metadata JSON next to every saved image. Embed it in the PNG if your tool supports it.
- Pin the model version in your provider client; subscribe to the provider’s “model deprecated” mailing list so you know when a version is sunsetting.
- For client work that may need re-runs months later, snapshot the model file locally.
- Use a deterministic mode in your local pipeline by default; turn it off only when you need throughput.
- Diff prompts with cmp after any copy-paste; never trust visual inspection of long prompts.
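The parameter tuple from the first bullet can be sketched as a small frozen dataclass that serializes to one JSON line per run. The field names follow the tuple above; the class name RunRecord, the helper text_sha, and the example values are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

def text_sha(text):
    """SHA-256 of the UTF-8 bytes, so invisible character changes alter the hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

@dataclass(frozen=True)
class RunRecord:
    model_version: str
    seed: int
    sampler: str
    scheduler: str
    steps: int
    cfg: float
    width: int
    height: int
    prompt_sha: str
    negprompt_sha: str
    adapters_json: str  # JSON string listing every LoRA / ControlNet and its weight

    def to_json_line(self):
        # sort_keys gives a stable serialization, so equal records produce equal lines.
        return json.dumps(asdict(self), sort_keys=True)

    def fingerprint(self):
        """Short identifier suitable for a filename suffix or a log column."""
        return text_sha(self.to_json_line())[:16]

record = RunRecord(
    model_version="example-version-id",  # placeholder, not a real version string
    seed=42, sampler="DPM++ 2M Karras", scheduler="karras",
    steps=30, cfg=7.0, width=1024, height=1024,
    prompt_sha=text_sha("portrait, golden hour light"),
    negprompt_sha=text_sha("blurry"),
    adapters_json=json.dumps([{"name": "example_lora", "weight": 0.6}], sort_keys=True),
)
print(record.fingerprint(), record.to_json_line())
```

Appending the JSON line to a log and putting the fingerprint in the output filename makes it possible to tell at a glance whether two images came from the same configuration.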
Common pitfalls
- Assuming “seed locked” is sufficient. Seed is one of about ten parameters.
- Copy-pasting prompts through Slack / Notion, which silently rewrites quotes and dashes.
- Trusting that the hosted provider has not updated. Check the version string.
- Forgetting that an “auto” sampler in the UI is not deterministic across releases.
- Comparing two images by eye and declaring them “the same” when pixel diff says they are not — for true reproducibility you need pixel hash equality.
- Skipping the post-process strip. Face enhancers and upscalers are an easy-to-miss source of randomness because they run after the seeded core.
FAQ
Q: I pinned everything and the image still differs slightly. Why?
Likely cuDNN nondeterminism on the GPU. Enable torch.use_deterministic_algorithms(True) along with the other settings in Step 5. If you are running on a different GPU model than the original, you may never get bit-identical results; accept visual equivalence.
Q: Midjourney does not let me set seed. What can I do?
Midjourney has a --seed parameter, but the model is updated frequently and --seed is best-effort. For reproducibility-critical work use SDXL or Flux on a pinned host.
Q: Should I just save the output instead of trying to reproduce?
Yes — for finished work, save the PNG. Reproducibility matters when you want to iterate (change one knob and see the effect cleanly) or when a client asks for a new variant of a year-old image.
Q: Does increasing steps make output more deterministic?
No. More steps reduce noise but do not make the seed-to-image function more stable. Determinism is a property of the whole pipeline configuration, not the step count.
Related
- AI image not matching prompt
- AI image batch style drift
- AI image character consistency
- Long prompt worse results