AI Image Seed Not Reproducible Across Runs

Same seed and prompt produce different images run-to-run, usually because of model version drift, a sampler change, or hidden pipeline randomness. Pin every variable, not just the seed.

You locked the seed to 42, copied the prompt verbatim, and ran it again next week. The new image is close but not identical: different hair flyaways, a different earring, a slightly shifted horizon. You assumed the seed alone determined the output, so this feels like a bug. It is not. A diffusion image is a function of seed PLUS sampler PLUS scheduler PLUS model weights PLUS CFG PLUS steps PLUS resolution PLUS every LoRA / ControlNet / refiner pass, and it is deterministic only when the software and hardware underneath are too. Change any one of those, including a silent provider update to the model file, and you get a different image. Reproducibility is a discipline, not a feature.

Common causes

Ordered by how often each one is the actual culprit when “same seed” fails.

1. Model version silently rolled forward

Hosted providers (Midjourney, Firefly, DALL-E, hosted SDXL) update the base model on their own cadence. The seed still produces the same starting noise, but the new weights denoise it into a different image. You did nothing wrong; the back-end shifted under you.

How to spot it: Check the provider’s release notes / version dropdown. If the version string today is different from when you saved the original, the model changed.
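A toy sketch of why a weight update changes the result for a fixed seed: the seeded starting noise depends only on the seed, while the output depends on the noise and the weights together. The functions `initial_noise` and `toy_denoise` and both weight lists are invented for illustration; a real model is far more complex, but the dependency is the same.

```python
import random

def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Seeded starting noise; depends only on the seed and the size."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

def toy_denoise(noise: list[float], weights: list[float]) -> list[float]:
    """Stand-in for a model: the output depends on noise AND weights."""
    return [n * w for n, w in zip(noise, weights)]

weights_v1 = [0.5, 0.5, 0.5, 0.5]   # hypothetical "old" model
weights_v2 = [0.5, 0.5, 0.25, 0.5]  # hypothetical "updated" model

noise_a = initial_noise(42)
noise_b = initial_noise(42)

same_noise = noise_a == noise_b
same_image = toy_denoise(noise_a, weights_v1) == toy_denoise(noise_b, weights_v2)
print(same_noise, same_image)
```

The noise comparison holds and the image comparison does not, which is exactly the symptom of a silent model update.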

2. Sampler or scheduler swapped

DPM++ 2M Karras and Euler a with the same seed produce different images. Some UIs default to “auto” which picks a sampler based on resolution or model — meaning the sampler may have flipped without you noticing.

How to spot it: Open the saved PNG metadata (or the API request log) and compare the sampler name to today’s request. Mismatched samplers = different image.

3. Steps, CFG, or resolution changed by one tick

A CFG of 7.0 vs 7.5, or 28 vs 30 steps, or 1024x1024 vs 1024x1080, all produce visibly different outputs from the same seed. UI sliders often round invisibly; copy-paste from a numeric field is safer.

How to spot it: Diff the full parameter dump byte-for-byte, not by eye.
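To see why half a point of CFG matters, here is a minimal sketch of the standard classifier-free guidance combination, which scales the gap between the conditional and unconditional predictions. The scalar inputs are made-up stand-ins for one element of a noise prediction; in a real pipeline the same arithmetic runs over the whole tensor at every step.

```python
def guided_prediction(uncond: float, cond: float, cfg: float) -> float:
    """Classifier-free guidance: start from the unconditional prediction
    and move toward the conditional one, scaled by cfg."""
    return uncond + cfg * (cond - uncond)

# Toy values standing in for one element of a noise prediction.
uncond, cond = 0.25, 0.5

at_7_0 = guided_prediction(uncond, cond, 7.0)
at_7_5 = guided_prediction(uncond, cond, 7.5)
print(at_7_0, at_7_5)
```

The two results differ at every step, and each step's output feeds the next, so the gap grows over a 30-step run.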

4. LoRA / ControlNet / refiner pass added or weighted differently

A LoRA at strength 0.6 vs 0.7 makes a noticeable shift. A refiner pass at switch fraction 0.8 vs 0.75 changes high-frequency detail. ControlNet preprocessing (canny edge, depth map) can itself vary between runs in some pipelines, for example when the preprocessor version or its thresholds change.

How to spot it: List every adapter, LoRA, embedding, and post-process pass with exact weights. If any is missing or differs, that is the cause.

5. CUDA / hardware nondeterminism

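PyTorch’s torch.backends.cudnn.deterministic = False (the default) allows cuDNN to use nondeterministic algorithms, and when torch.backends.cudnn.benchmark is enabled, cuDNN may select different kernels for the same operation from run to run. Both produce tiny floating-point differences that compound across diffusion steps. The same seed on the same GPU can produce slightly different outputs.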

How to spot it: Run locally with torch.use_deterministic_algorithms(True) and CUBLAS_WORKSPACE_CONFIG=:4096:8. If reproducibility improves, this was a contributor.
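The underlying reason is that floating-point addition is not associative, so a kernel that sums the same numbers in a different order can return a slightly different result. A minimal illustration in plain Python (no GPU involved):

```python
values = [0.1, 0.2, 0.3]

# Same three numbers, two different summation orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

The difference is tiny, but a diffusion run repeats millions of such operations per step and feeds each step into the next.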

6. Prompt was edited without you realizing

A trailing space, a curly quote vs straight quote, an em-dash vs hyphen — tokenizers see these as different tokens. Copy-pasting from rich-text apps (Notion, Google Docs) often silently rewrites punctuation.

How to spot it: Diff the prompt strings with diff or cmp. Visual inspection is unreliable.
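When diff reports a difference you cannot see, it helps to list the characters by code point. The sketch below uses only the standard library; the function names and the sample prompt are illustrative.

```python
import unicodedata

def suspicious_chars(prompt: str) -> list[tuple[int, str, str]]:
    """List (index, code point, Unicode name) for characters that are easy
    to miss by eye: anything non-ASCII, plus tabs and other control codes."""
    hits = []
    for index, ch in enumerate(prompt):
        if ord(ch) > 126 or ord(ch) < 32:
            name = unicodedata.name(ch, "UNNAMED")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits

def has_edge_whitespace(prompt: str) -> bool:
    """True when the prompt starts or ends with whitespace."""
    return prompt != prompt.strip()

# A prompt as it might arrive after a round trip through a rich-text app:
# curly quotes, an em dash, and a trailing space.
pasted = "a \u201cred\u201d scarf \u2014 studio light "

for hit in suspicious_chars(pasted):
    print(hit)
print(has_edge_whitespace(pasted))
```

Legitimate non-ASCII text (names, non-English prompts) will also be listed, so treat the output as a prompt for inspection, not as an error list.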

7. Provider applies hidden post-processing

Some hosted APIs run a face enhancer, upscaler, or safety filter on the way out. These passes are not seeded and add randomness. You get a deterministic core image plus a non-deterministic polish.

How to spot it: Request the raw / unprocessed image if the API supports it (for example enhance: false or upscale: 1x; the exact flag names vary by provider). If raw is stable and processed is not, the post-pass is the culprit.

Before you start

  • Save the original “good” image and its full metadata (PNG parameters chunk, or the API response JSON) before doing any new runs.
  • Note exactly which provider, model name, and version string produced the original.
  • Capture the current provider version string today, before you re-run.
  • Decide your target: bit-identical reproduction is rare on hosted APIs; “visually indistinguishable” is the realistic goal.

Information to collect

  • Seed value (integer; confirm it is being passed as int, not a string "42").
  • Sampler name and scheduler name, exact string.
  • Steps, CFG, resolution width and height.
  • Full prompt string and negative prompt string, byte-for-byte.
  • Every LoRA, embedding, ControlNet, IP-Adapter with weight value.
  • Model file name + hash (for local) or version string (for hosted).
  • Hardware: GPU model, CUDA version, torch version (for local).
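One way to keep this list honest is to collect it into a single record that rejects wrong types on construction. This is a minimal sketch; the class name `RunRecord` and its field names are assumptions, not any tool's schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunRecord:
    """Settings for one generation. Field names are illustrative."""
    seed: int
    sampler: str
    scheduler: str
    steps: int
    cfg: float
    width: int
    height: int
    prompt: str
    negative_prompt: str
    model_id: str  # file hash (local) or version string (hosted)
    adapters: tuple[tuple[str, float], ...] = ()  # (name, weight) pairs

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it is rejected explicitly.
        for name in ("seed", "steps", "width", "height"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"{name} must be an int, got {type(value).__name__}: {value!r}"
                )

record = RunRecord(
    seed=42, sampler="DPM++ 2M Karras", scheduler="karras", steps=30,
    cfg=7.0, width=1024, height=1024, prompt="a red scarf",
    negative_prompt="", model_id="sd_xl_base_1.0",
    adapters=(("example_lora", 0.6),),
)
print(record.seed, record.adapters)
```

Constructing the record with seed="42" raises TypeError, which catches the string-seed mistake before the request is sent.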

Step-by-step fix

Ordered from least to most invasive.

Step 1: Diff the parameter dumps

Save both metadata sets to text files and diff them:

exiftool -PNG:parameters good.png > good.txt
exiftool -PNG:parameters bad.png > bad.txt
diff good.txt bad.txt

Any line that differs is a suspect. If even one parameter mismatches, fix that first before moving on.
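If the metadata is available as JSON (for example an API response) rather than a PNG text chunk, a key-by-key comparison is easier to read than a line diff. A minimal sketch, assuming both dumps are flat JSON objects; the key names in the sample data are illustrative.

```python
import json

def diff_params(good: dict, bad: dict) -> dict[str, tuple[object, object]]:
    """Map each differing key to (good value, bad value).
    A key present on only one side is reported with a "<missing>" marker."""
    missing = "<missing>"
    changed = {}
    for key in sorted(set(good) | set(bad)):
        a, b = good.get(key, missing), bad.get(key, missing)
        # Compare type as well as value so 42 and "42" count as different.
        # This also flags 7 vs 7.0, which is worth a look in any case.
        if type(a) is not type(b) or a != b:
            changed[key] = (a, b)
    return changed

good = json.loads(
    '{"seed": 42, "sampler": "DPM++ 2M Karras", "cfg_scale": 7.0, "steps": 30}'
)
bad = json.loads(
    '{"seed": "42", "sampler": "DPM++ 2M Karras", "cfg_scale": 7.5, "steps": 30}'
)

for key, (a, b) in diff_params(good, bad).items():
    print(f"{key}: {a!r} -> {b!r}")
```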

Step 2: Pin the model version explicitly

In hosted APIs, pass the version ID, not just the model name:

{
  "model": "stable-diffusion-xl-base-1.0",
  "model_version": "39ed52f2-a78c-4ce1-9c1d-aaaaaaaaaa",
  "seed": 42
}

If the provider has no version pin, switch to a provider that does (Replicate, RunPod, your own deployment).

Step 3: Lock the sampler and scheduler

Explicitly specify both. Do not let “auto” choose:

{
  "sampler": "DPM++ 2M Karras",
  "scheduler": "karras",
  "steps": 30,
  "cfg_scale": 7.0
}

If your UI hides these, switch to the API or a UI that exposes them (ComfyUI, InvokeAI, AUTOMATIC1111).
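A small guard can catch an unpinned request before it is sent. The sketch below reuses the key names from the JSON examples above; real providers name these fields differently, and the list of "floating" values is an assumption.

```python
REQUIRED_KEYS = (
    "model", "model_version", "seed",
    "sampler", "scheduler", "steps", "cfg_scale",
)
# Values that hand the decision back to the provider (assumed spellings).
FLOATING_VALUES = ("", "auto", "latest", "default")

def check_pinned(request: dict) -> list[str]:
    """Return a list of problems; an empty list means everything is pinned."""
    problems = []
    for key in REQUIRED_KEYS:
        value = request.get(key)
        if value is None:
            problems.append(f"{key} is missing")
        elif isinstance(value, str) and value.strip().lower() in FLOATING_VALUES:
            problems.append(f"{key} is set to {value!r}, which lets the back end choose")
    return problems

pinned = {
    "model": "stable-diffusion-xl-base-1.0",
    "model_version": "39ed52f2-a78c-4ce1-9c1d-aaaaaaaaaa",
    "seed": 42, "sampler": "DPM++ 2M Karras", "scheduler": "karras",
    "steps": 30, "cfg_scale": 7.0,
}
loose = {
    "model": "stable-diffusion-xl-base-1.0",
    "seed": 42, "sampler": "auto", "scheduler": "karras",
    "steps": 30, "cfg_scale": 7.0,
}

print(check_pinned(pinned))
print(check_pinned(loose))
```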

Step 4: Strip silent randomness from the pipeline

Disable face restorers, upscalers, and refiners for the reproducibility test. Run the base model only. Once base is stable, re-add each post-process one at a time and verify each is also seedable.

{
  "enhance_face": false,
  "upscale": 1,
  "refiner": null
}
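The "re-add one at a time" loop can be automated. In this sketch, `generate` is a hypothetical callable that takes a parameter dict and returns image bytes; `fake_generate` stands in for a real back end so the example runs on its own. For real use, return decoded pixel data instead of file bytes if your tool embeds timestamps.

```python
import hashlib
import json
import os
from typing import Callable, Optional

Generate = Callable[[dict], bytes]  # hypothetical: parameters in, image bytes out

def is_repeatable(generate: Generate, params: dict, runs: int = 3) -> bool:
    """True when every run with the same parameters returns identical bytes."""
    digests = {
        hashlib.sha256(generate(dict(params))).hexdigest() for _ in range(runs)
    }
    return len(digests) == 1

def first_unstable_stage(generate: Generate, base: dict, stages: dict) -> Optional[str]:
    """Check the base settings, then switch stages on one at a time (keeping
    earlier ones enabled). Return the first name that breaks repeatability,
    or None if every stage is stable."""
    params = dict(base)
    if not is_repeatable(generate, params):
        return "base"
    for name, overrides in stages.items():
        params.update(overrides)
        if not is_repeatable(generate, params):
            return name
    return None

def fake_generate(params: dict) -> bytes:
    """Stand-in back end: stable core plus an unseeded face-enhancer pass."""
    core = json.dumps(params, sort_keys=True).encode("utf-8")
    polish = os.urandom(8) if params.get("enhance_face") else b""
    return core + polish

base = {"seed": 42, "enhance_face": False, "upscale": 1, "refiner": None}
stages = {"upscale": {"upscale": 2}, "enhance_face": {"enhance_face": True}}

culprit = first_unstable_stage(fake_generate, base, stages)
print(culprit)
```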

Step 5: Enforce torch determinism (local pipelines)

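import os

# Must be set before the first cuBLAS call, so set it before importing torch.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

# warn_only=True logs a warning instead of raising when an op has no
# deterministic implementation; drop it to fail loudly instead.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True  # forbid nondeterministic cuDNN algorithms
torch.backends.cudnn.benchmark = False     # no per-run kernel autotuning

# Pass this generator to the pipeline call so the sampling noise is seeded.
generator = torch.Generator(device="cuda").manual_seed(42)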

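This can cost inference speed (roughly 10-20 percent) and should give bit-identical outputs on the same hardware and software stack. Because warn_only=True only warns when an operation has no deterministic implementation, check the log for such warnings before trusting the result.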

Step 6: Treat the prompt as bytes, not text

Store the prompt in a text file alongside the parameter JSON. Never retype from a screenshot. Diff with cmp:

cmp good_prompt.txt new_prompt.txt

Differences here are usually invisible whitespace or punctuation substitution. See "Long prompt worse results" for prompt-fragility patterns.

Step 7: Snapshot model weights for true long-term reproducibility

For projects where year-over-year reproducibility matters, download the model file (.safetensors) and store its sha256. Hosted versions disappear; your local snapshot does not.

sha256sum sd_xl_base_1.0.safetensors > sd_xl_base_1.0.sha256
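To check a snapshot later from a script, the file can be re-hashed and compared with the stored digest. A minimal sketch using only the standard library; the function names are illustrative, and the parser assumes the usual sha256sum layout of digest first, file name second.

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so multi-gigabyte weights never sit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def matches_recorded_hash(model_path: Path, sha_path: Path) -> bool:
    """Compare against a sha256sum-style file: '<hex digest>  <file name>'."""
    recorded = sha_path.read_text().split()[0].lower()
    return file_sha256(model_path) == recorded
```

For the files created above, the call would be matches_recorded_hash(Path("sd_xl_base_1.0.safetensors"), Path("sd_xl_base_1.0.sha256")). On the command line, sha256sum -c sd_xl_base_1.0.sha256 performs the same check.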

Verify

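  • Generate the same prompt + seed twice in a row and hash both files (shasum -a 256 a.png b.png). On a fully pinned pipeline, hashes match. Note that shasum hashes the whole file, embedded metadata included, so if the hashes differ only because of a timestamp or parameter chunk, compare the decoded pixel data instead.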
  • Generate the same prompt + seed across two sessions / two days. Hashes should still match unless the provider rolled the model.
  • Change one parameter intentionally (e.g. cfg 7.0 → 7.5) and confirm the output now differs. Proves your pipeline is responsive, not stuck on a cached image.
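A file hash covers PNG metadata as well as pixels. The sketch below hashes only the image data using the standard library: it keeps IHDR, PLTE, and tRNS, decompresses the IDAT stream, and skips text and time chunks. The function name is illustrative. One stated limitation: it hashes the filtered scanlines, not fully decoded pixels, so a match is conclusive but a mismatch between files written by different encoders is not. For that case, decode with an image library and hash the pixel array.

```python
import hashlib
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunks that affect how pixels decode; metadata chunks (tEXt, tIME, ...) are skipped.
PIXEL_CHUNKS = (b"IHDR", b"PLTE", b"tRNS")

def png_image_data_sha256(png_bytes: bytes) -> str:
    """Hash IHDR/PLTE/tRNS plus the decompressed IDAT stream."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    digest = hashlib.sha256()
    compressed = bytearray()
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[offset:offset + 4])
        chunk_type = png_bytes[offset + 4:offset + 8]
        data = png_bytes[offset + 8:offset + 8 + length]
        if chunk_type in PIXEL_CHUNKS:
            digest.update(chunk_type + data)
        elif chunk_type == b"IDAT":
            compressed += data  # IDAT chunks form one zlib stream
        elif chunk_type == b"IEND":
            break
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
    digest.update(zlib.decompress(bytes(compressed)))
    return digest.hexdigest()
```

Typical use is png_image_data_sha256(Path("a.png").read_bytes()) for each file, then comparing the two hex strings.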

Long-term prevention

  • Treat the parameter set as a tuple (model_version, seed, sampler, scheduler, steps, cfg, width, height, prompt_sha, negprompt_sha, adapters_json). Log every tuple alongside the output.
  • Store the full metadata JSON next to every saved image. Embed it in the PNG if your tool supports it.
  • Pin the model version in your provider client; subscribe to the provider’s “model deprecated” mailing list so you know when a version is sunsetting.
  • For client work that may need re-runs months later, snapshot the model file locally.
  • Use a deterministic mode in your local pipeline by default; turn it off only when you need throughput.
  • Diff prompts with cmp after any copy-paste; never trust visual inspection of long prompts.
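The parameter tuple from the first bullet can be reduced to a single fingerprint that is logged next to each output. A minimal sketch, assuming the parameters are JSON-serializable; the function names and sample values (including the "v1" version string) are placeholders.

```python
import hashlib
import json

def sha256_text(text: str) -> str:
    """Hex SHA-256 of the UTF-8 bytes of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def run_fingerprint(params: dict, prompt: str, negative_prompt: str) -> str:
    """One ID for the whole parameter set, independent of key order."""
    record = dict(params)
    record["prompt_sha"] = sha256_text(prompt)
    record["negprompt_sha"] = sha256_text(negative_prompt)
    # sort_keys and fixed separators give the same JSON text every time.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return sha256_text(canonical)

params = {
    "model_version": "v1", "seed": 42, "sampler": "DPM++ 2M Karras",
    "scheduler": "karras", "steps": 30, "cfg": 7.0,
    "width": 1024, "height": 1024, "adapters": [],
}
reordered = dict(reversed(list(params.items())))

original = run_fingerprint(params, "a red scarf", "")
same = run_fingerprint(reordered, "a red scarf", "")
trailing_space = run_fingerprint(params, "a red scarf ", "")
print(original == same, original == trailing_space)
```

Two runs with equal fingerprints were requested with identical settings, so any remaining difference in the output points at the model, the hardware, or a post-process pass.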

Common pitfalls

  • Assuming “seed locked” is sufficient. Seed is one of about ten parameters.
  • Copy-pasting prompts through Slack / Notion, which silently rewrites quotes and dashes.
  • Trusting that the hosted provider has not updated. Check the version string.
  • Forgetting that an “auto” sampler in the UI is not deterministic across releases.
  • Comparing two images by eye and declaring them “the same” when pixel diff says they are not — for true reproducibility you need pixel hash equality.
  • Skipping the post-process strip — face enhancers are the #1 hidden randomizer.

FAQ

Q: I pinned everything and the image still differs slightly. Why?

Likely cuDNN or cuBLAS nondeterminism on the GPU. Enable torch.use_deterministic_algorithms(True). If you are running on a different GPU model than the original, you may never get bit-identical results; accept visual equivalence.

Q: Midjourney does not let me set seed. What can I do?

Midjourney has a --seed parameter, but the model is updated frequently and --seed is best-effort. For reproducibility-critical work use SDXL or Flux on a pinned host.

Q: Should I just save the output instead of trying to reproduce?

Yes — for finished work, save the PNG. Reproducibility matters when you want to iterate (change one knob and see the effect cleanly) or when a client asks for a new variant of a year-old image.

Q: Does increasing steps make output more deterministic?

No. More steps reduce noise but do not make the seed-to-image function more stable. Determinism is a property of the whole pipeline configuration, not the step count.

Tags: #Troubleshooting #ai-image #seed #reproducibility #diffusion