AI Image Seed Not Reproducible Across Runs

Same seed and prompt, different image? Seed is only one of ~10 variables. Pin model version, sampler, scheduler, steps, CFG, and post-processing — not just the seed.

Published: May 24, 2026 Updated: Jun 17, 2026 Author: AI Productivity Guide Team

You locked the seed to 42, copied the prompt verbatim, and ran it again next week. The new image is close but not identical: different hair flyaways, a different earring, a slightly shifted horizon. You assumed the seed was a hash of the output, so this feels like a bug. It is not.

Fastest fix: a diffusion image is a deterministic function of seed PLUS sampler PLUS scheduler PLUS model weights PLUS CFG PLUS steps PLUS resolution PLUS every LoRA / ControlNet / refiner pass. Diff the saved metadata of the good image against your new request (exiftool or the PNG parameters chunk), find the one parameter that changed — most often a silently rolled-forward model version or an “auto” sampler — and pin it. Reproducibility is a discipline, not a single toggle.

TL;DR

Save the original image’s full metadata and diff it against the new request. The mismatched line is your culprit.
Pin the model version, not just the model name. Hosted models get rolled forward under you.
Set the sampler and scheduler explicitly. Never let the UI use “auto”.
Strip post-processing (face restore, upscale, refiner) for the test, then re-add one at a time.
For bit-identical local runs, enable torch.use_deterministic_algorithms(True).
Some providers (DALL-E / gpt-image, Midjourney across sessions) cannot guarantee exact reproduction. For those, aim for “visually indistinguishable”, or move to a host that pins versions and seeds.

Which bucket are you in?

Symptom	Most likely cause	Jump to
Was reproducible last month, broke this week, same settings	Provider rolled the base model forward	Cause 1 / Step 2
Differs every run in the SAME session	“auto” sampler, hidden post-processing, or non-deterministic GPU kernels	Cause 2, 5, 7
Differs only across two machines / two GPUs	CUDA kernel + hardware nondeterminism	Cause 5 / Step 5
Tiny detail shifts, params look identical by eye	A one-tick param (CFG, steps, px) or invisible prompt edit	Cause 3, 6 / Step 1
Midjourney or DALL-E specifically	Seed is best-effort on these; not a true determinism control	FAQ

Common causes

Ordered by how often each one is the actual culprit when “same seed” fails.

1. Model version silently rolled forward

Hosted providers (Midjourney, Adobe Firefly, DALL-E / gpt-image, hosted SDXL and Flux endpoints) update the base model on their own cadence. The seed still produces the same starting noise, but the new weights denoise it differently, so the image shifts even though your request is byte-identical. You did nothing wrong; the back-end changed under you. As of June 2026 this is the single most common reason a previously-reproducible setup breaks.

How to spot it: check the provider’s release notes / version dropdown. If the version string today differs from when you saved the original, the model changed.

2. Sampler or scheduler swapped

DPM++ 2M Karras and Euler a with the same seed produce different images. Some UIs default to “auto”, which picks a sampler based on resolution or model, so the sampler may have flipped without you noticing. (Flux Dev, for example, is usually run with DPM++ 2M + sgm uniform, while Flux Schnell wants CFG 1 with a simple scheduler. Switching between those changes everything.)

How to spot it: open the saved PNG metadata (or the API request log) and compare the sampler name to today’s request. Mismatched samplers = different image.

3. Steps, CFG, or resolution changed by one tick

A CFG of 7.0 vs 7.5, or 28 vs 30 steps, or 1024x1024 vs 1024x1080, all produce visibly different outputs from the same seed. UI sliders often round invisibly; copy-paste from a numeric field is safer.

How to spot it: diff the full parameter dump byte-for-byte, not by eye.

4. LoRA / ControlNet / refiner pass added or weighted differently

A LoRA at strength 0.6 vs 0.7 makes a noticeable shift. A refiner pass at switch fraction 0.8 vs 0.75 changes high-frequency detail. ControlNet preprocessing (canny edge, depth map) is itself seed-sensitive in some pipelines.

How to spot it: list every adapter, LoRA, embedding, and post-process pass with exact weights. If any is missing or differs, that is the cause.

5. CUDA / hardware nondeterminism

Setting torch.backends.cudnn.benchmark = True, or leaving torch.backends.cudnn.deterministic at its default of False, lets cuDNN pick different kernels for the same operation. Different kernels accumulate in a different order, producing tiny floating-point differences that compound across diffusion steps. The same seed on the same GPU can produce slightly different outputs; a different GPU model almost certainly will.

How to spot it: run locally with torch.use_deterministic_algorithms(True) and the CUBLAS_WORKSPACE_CONFIG env var set. If reproducibility improves, this was a contributor.
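
Why would a different kernel change the result at all? A minimal sketch of the underlying effect, in plain Python rather than on a GPU: floating-point addition is not associative, so summing the same numbers in a different order can give a different answer. The helper name sum_in_order is only illustrative.

```python
def sum_in_order(values):
    """Add the values strictly left to right, the way a naive loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two groupings.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)  # False
print(a, b)    # 0.6000000000000001 0.6

# Same three numbers, two orders. The small term is lost when it is
# added to the huge one first.
print(sum_in_order([1e16, 1.0, -1e16]))  # 0.0
print(sum_in_order([1e16, -1e16, 1.0]))  # 1.0
```

A GPU reduction that splits work across threads is effectively choosing one of these orders at run time. One rounding difference is invisible, but a diffusion loop feeds each step's output into the next, which is how a last-bit change can become a different earring.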

6. Prompt was edited without you realizing

A trailing space, a curly quote vs a straight quote, an em-dash vs a hyphen: tokenizers see these as different tokens. Copy-pasting from rich-text apps (Notion, Google Docs) often silently rewrites punctuation.

How to spot it: diff the prompt strings with diff or cmp. Visual inspection is unreliable.
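
If diff only tells you that a line changed, a small scanner can name the offending character. The sketch below uses only the Python standard library; the function names are illustrative, and it simply flags every non-ASCII or control character rather than trying to guess which ones your tokenizer treats differently.

```python
import unicodedata


def find_suspicious_chars(prompt: str) -> list[tuple[int, str, str]]:
    """Return (index, char, unicode name) for non-ASCII or control characters."""
    hits = []
    for i, ch in enumerate(prompt):
        if ord(ch) > 127 or (ord(ch) < 32 and ch != "\n"):
            hits.append((i, ch, unicodedata.name(ch, "UNNAMED")))
    return hits


def has_trailing_whitespace(prompt: str) -> bool:
    """True if any line of the prompt ends in whitespace."""
    return any(line != line.rstrip() for line in prompt.split("\n"))


if __name__ == "__main__":
    # Escapes stand in for what a rich-text editor might paste.
    pasted = "a portrait, \u201csoft light\u201d \u2014 85mm "
    for index, ch, name in find_suspicious_chars(pasted):
        print(index, repr(ch), name)
    print("trailing whitespace:", has_trailing_whitespace(pasted))
```

Run it on both the saved prompt and the new one. Any hit that appears in only one of them is a candidate for the silent edit.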

7. Provider applies hidden post-processing

Some hosted APIs run a face enhancer, upscaler, or safety filter on the way out. These passes are usually not seeded and add randomness, so you get a deterministic core image plus a non-deterministic polish.

How to spot it: request the raw / unprocessed image if the API supports it (enhance: false, upscale: 1). If raw is stable and processed is not, the post-pass is the culprit.

Before you start

Save the original “good” image and its full metadata (PNG parameters chunk, or the API response JSON) before doing any new runs.
Note exactly which provider, model name, and version string produced the original.
Capture the current provider version string today, before you re-run.
Decide your target: bit-identical reproduction is rare on hosted APIs; “visually indistinguishable” is the realistic goal there.

Information to collect

Seed value (integer; confirm it is being passed as an int, not a string "42").
Sampler name and scheduler name, exact string.
Steps, CFG, resolution width and height.
Full prompt string and negative prompt string, byte-for-byte.
Every LoRA, embedding, ControlNet, IP-Adapter with its weight value.
Model file name + hash (for local) or version string (for hosted).
Hardware: GPU model, CUDA version, torch version (for local).

Step-by-step fix

Ordered from least to most invasive.

Step 1: Diff the parameter dumps

Save both metadata sets to text files and diff them:

exiftool -PNG:parameters good.png > good.txt
exiftool -PNG:parameters bad.png > bad.txt
diff good.txt bad.txt

Any line that differs is a suspect. If even one parameter mismatches, fix that first before moving on.
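
When the whole parameter string comes back as one long line, a text diff only says "this line differs". The sketch below reads the text chunk straight from the PNG and compares settings key by key. It rests on two assumptions that may not hold for your tool: the metadata is stored in an uncompressed tEXt chunk (compressed or international text chunks are skipped), and the last line of the parameters text is a comma-separated "Key: value" list as written by AUTOMATIC1111-style UIs. All function names are illustrative.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(path: str) -> dict[str, str]:
    """Collect keyword -> text from tEXt chunks (iTXt and zTXt are ignored)."""
    out = {}
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIGNATURE:
            raise ValueError(f"{path} is not a PNG file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, ctype = struct.unpack(">I4s", header)
            data = f.read(length)
            f.read(4)  # skip the CRC
            if ctype == b"tEXt":
                keyword, _, text = data.partition(b"\x00")
                out[keyword.decode("latin-1")] = text.decode("latin-1")
            elif ctype == b"IEND":
                break
    return out


def parse_settings(parameters: str) -> dict[str, str]:
    """Split the final 'Key: value, Key: value' line into a dict.

    Assumes no value contains ', '. Everything above the last line
    (prompt and negative prompt) is kept as one block for comparison.
    """
    lines = parameters.strip().split("\n")
    settings = {"prompt_block": "\n".join(lines[:-1])}
    for item in lines[-1].split(", "):
        key, sep, value = item.partition(": ")
        if sep:
            settings[key] = value
    return settings


def diff_settings(good: dict, new: dict) -> dict:
    """Return {key: (good_value, new_value)} for every key that differs."""
    keys = sorted(set(good) | set(new))
    return {k: (good.get(k), new.get(k)) for k in keys if good.get(k) != new.get(k)}
```

Typical use would be diff_settings(parse_settings(read_png_text("good.png")["parameters"]), parse_settings(read_png_text("bad.png")["parameters"])). An empty result means the recorded settings match and the cause lies outside the metadata, for example in the model version or the hardware.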

Step 2: Pin the model version explicitly

In hosted APIs, pass the version ID, not just the model name. On Replicate, that is the version hash; on your own deployment, it is the checkpoint file you loaded:

{
  "model": "stable-diffusion-xl-base-1.0",
  "model_version": "39ed52f2-a78c-4ce1-9c1d-aaaaaaaaaa",
  "seed": 42
}

If the provider has no version pin, switch to one that does. Replicate exposes a version hash per model run, and Google Vertex AI Imagen documents seed-based deterministic output; both let you reproduce a specific build. Self-hosting (RunPod, your own GPU) gives you the strongest guarantee.

Step 3: Lock the sampler and scheduler

Explicitly specify both. Do not let “auto” choose:

{
  "sampler": "DPM++ 2M Karras",
  "scheduler": "karras",
  "steps": 30,
  "cfg_scale": 7.0
}

If your UI hides these, switch to the API or a UI that exposes them (ComfyUI, InvokeAI, AUTOMATIC1111 / Forge). In ComfyUI specifically, use a fixed-seed RandomNoise node rather than letting the sampler reseed each run.
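
One way to keep "auto" and missing pins from creeping back in is to validate the request before it is sent. This is a minimal sketch, not any provider's API: the key names are the hypothetical ones used in the JSON examples in this guide, so rename them to match your client.

```python
REQUIRED_KEYS = ("model_version", "seed", "sampler", "scheduler", "steps", "cfg_scale")


def check_pinned(request: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means fully pinned."""
    problems = []
    for key in REQUIRED_KEYS:
        value = request.get(key)
        if value is None:
            problems.append(f"{key} is missing")
        elif isinstance(value, str) and value.strip().lower() == "auto":
            problems.append(f"{key} is set to 'auto'")
    seed = request.get("seed")
    # bool is a subclass of int in Python, so reject it explicitly.
    if seed is not None and (isinstance(seed, bool) or not isinstance(seed, int)):
        problems.append("seed must be an integer, not " + type(seed).__name__)
    return problems


if __name__ == "__main__":
    request = {
        "model_version": "example-version-id",  # placeholder, not a real ID
        "seed": "42",                           # string by mistake
        "sampler": "auto",
        "scheduler": "karras",
        "steps": 30,
        "cfg_scale": 7.0,
    }
    for problem in check_pinned(request):
        print(problem)
```

Failing loudly at request time is cheaper than discovering a week later that the saved metadata never recorded which sampler "auto" picked.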

Step 4: Strip silent randomness from the pipeline

Disable face restorers, upscalers, and refiners for the reproducibility test. Run the base model only. Once base is stable, re-add each post-process one at a time and verify each is also seedable.

{
  "enhance_face": false,
  "upscale": 1,
  "refiner": null
}

Step 5: Enforce torch determinism (local pipelines)

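import os

# Set this before CUDA initializes. Either ":4096:8" or ":16:8" is valid;
# the HF diffusers docs use ":16:8".
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

# warn_only=True logs a warning instead of raising an error when an
# operation has no deterministic implementation.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Pass this generator to the pipeline call so the starting noise is seeded.
generator = torch.Generator(device="cuda").manual_seed(42)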

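This costs roughly 10-20 percent inference speed but should give bit-identical outputs on the same hardware and software stack. With warn_only=True, an operation that has no deterministic implementation only triggers a warning and still runs, so check the log; drop the flag if you would rather get a hard error. Set CUBLAS_WORKSPACE_CONFIG BEFORE CUDA initializes (i.e. before the first torch.cuda call), or it has no effect.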

Step 6: Treat the prompt as bytes, not text

Store the prompt in a text file alongside the parameter JSON. Never retype from a screenshot. Diff with cmp:

cmp good_prompt.txt new_prompt.txt

Differences here are usually invisible whitespace or punctuation substitution. See the related guide on long prompts giving worse results for prompt-fragility patterns.

Step 7: Snapshot model weights for true long-term reproducibility

For projects where year-over-year reproducibility matters, download the model file (.safetensors) and store its sha256. Hosted versions get deprecated; your local snapshot does not.

sha256sum sd_xl_base_1.0.safetensors > sd_xl_base_1.0.sha256
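
The same check can run inside the pipeline at load time, so a swapped or corrupted checkpoint is caught before any image is generated. A minimal standard-library sketch; the function names are illustrative, and the manifest is assumed to be a single sha256sum-style line whose first field is the hex digest.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-gigabyte checkpoints fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(model_path: str, manifest_path: str) -> bool:
    """Compare the file's digest with the first field of the manifest line."""
    with open(manifest_path, "r", encoding="utf-8") as f:
        expected = f.read().split()[0].lower()
    return sha256_file(model_path) == expected
```

Calling verify_against_manifest("sd_xl_base_1.0.safetensors", "sd_xl_base_1.0.sha256") before loading the weights turns "I think this is the same file" into a yes or no.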

How to confirm it’s fixed

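Generate the same prompt + seed twice in a row and hash both files (shasum -a 256 a.png b.png). On a fully pinned local pipeline, the hashes match. Note that shasum hashes the whole file, not just the pixels: if your tool embeds varying metadata such as a timestamp, the file hashes can differ even when the images are identical, so compare the decoded image data in that case.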
Generate the same prompt + seed across two sessions / two days. Hashes should still match unless the provider rolled the model.
Change one parameter intentionally (e.g. cfg 7.0 to 7.5) and confirm the output now differs. This proves your pipeline is responsive, not stuck on a cached image.
On hosted APIs where pixel-hash equality is impossible, the realistic pass condition is: re-runs are visually indistinguishable and the version string is unchanged.
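
A file-level hash such as shasum covers the whole PNG, including any text or timestamp chunks. The sketch below is one way to hash only the image data with the standard library: it hashes the IHDR header plus the decompressed IDAT stream and skips every other chunk. It assumes both files were written by the same encoder with the same settings, because the decompressed stream still contains the per-row filter bytes the encoder chose, and it ignores palette chunks, so it is not a general pixel comparison. The function name is illustrative.

```python
import hashlib
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_image_data_hash(path: str) -> str:
    """SHA-256 over IHDR plus decompressed IDAT data; metadata chunks are skipped."""
    ihdr = b""
    idat_parts = []
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIGNATURE:
            raise ValueError(f"{path} is not a PNG file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, ctype = struct.unpack(">I4s", header)
            data = f.read(length)
            f.read(4)  # skip the CRC
            if ctype == b"IHDR":
                ihdr = data
            elif ctype == b"IDAT":
                idat_parts.append(data)
            elif ctype == b"IEND":
                break
    # IDAT chunks form one zlib stream, so join them before decompressing.
    raw = zlib.decompress(b"".join(idat_parts))
    return hashlib.sha256(ihdr + raw).hexdigest()
```

If png_image_data_hash("a.png") equals png_image_data_hash("b.png") while the file hashes differ, the images match and only the embedded metadata changed.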

Long-term prevention

Treat the parameter set as a tuple (model_version, seed, sampler, scheduler, steps, cfg, width, height, prompt_sha, negprompt_sha, adapters_json). Log every tuple alongside the output.
Store the full metadata JSON next to every saved image. Embed it in the PNG if your tool supports it.
Pin the model version in your provider client; subscribe to the provider’s “model deprecated” notices so you know when a version is sunsetting.
For client work that may need re-runs months later, snapshot the model file locally.
Keep a deterministic mode in your local pipeline by default; turn it off only when you need throughput.
Diff prompts with cmp after any copy-paste; never trust visual inspection of long prompts.
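
The parameter tuple from the first item above could be captured as a small record type and written as one JSON line per image. This is a sketch under stated assumptions: the field names follow the tuple in this section, while RunRecord, make_record, and to_log_line are illustrative names rather than part of any tool.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def sha256_text(text: str) -> str:
    """Hash the exact UTF-8 bytes of a prompt, so invisible edits change the digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class RunRecord:
    model_version: str
    seed: int
    sampler: str
    scheduler: str
    steps: int
    cfg: float
    width: int
    height: int
    prompt_sha: str
    negprompt_sha: str
    adapters_json: str


def make_record(*, model_version, seed, sampler, scheduler, steps, cfg,
                width, height, prompt, negative_prompt, adapters) -> RunRecord:
    """Build a record; adapters is a list of dicts such as {"name": ..., "weight": ...}."""
    return RunRecord(
        model_version=model_version, seed=seed, sampler=sampler,
        scheduler=scheduler, steps=steps, cfg=cfg, width=width, height=height,
        prompt_sha=sha256_text(prompt),
        negprompt_sha=sha256_text(negative_prompt),
        # sort_keys keeps the serialized form stable across runs.
        adapters_json=json.dumps(adapters, sort_keys=True),
    )


def to_log_line(record: RunRecord) -> str:
    """Serialize a record as one JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because the record is frozen and compares by value, two runs are "the same request" exactly when their records are equal, which makes a later mismatch a one-line comparison instead of an investigation.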

Common pitfalls

Assuming “seed locked” is sufficient. Seed is one of about ten parameters.
Copy-pasting prompts through Slack / Notion, which silently rewrites quotes and dashes.
Trusting that the hosted provider has not updated. Check the version string.
Forgetting that an “auto” sampler in the UI is not deterministic across releases.
Comparing two images by eye and declaring them “the same” when a pixel diff says they are not. For true reproducibility you need pixel-hash equality.
Skipping the post-process strip. Face enhancers are the number-one hidden randomizer.

FAQ

Q: I pinned everything and the image still differs slightly. Why?

Most likely cuDNN nondeterminism on the GPU. Enable torch.use_deterministic_algorithms(True) and set torch.backends.cudnn.benchmark = False. If you are running on a different GPU model than the original, you may never get bit-identical results; accept visual equivalence.

Q: Midjourney does not reproduce my image even with --seed. What can I do?

Midjourney has a --seed parameter (find a job’s seed by reacting to it with the envelope emoji, or via the website). As of the V8.1 release it is roughly 99 percent consistent within a single session on the same model version, but it is NOT reliable across sessions, across model versions, or between Fast and Relaxed mode. For reproducibility-critical work, use SDXL or Flux on a version-pinned host instead.

Q: Does DALL-E / gpt-image support a reproducible seed?

Not for exact reproduction. A seed there nudges stylistic and character consistency, but OpenAI does not guarantee deterministic image output, and if your prompt includes a reference image, determinism breaks entirely. Treat it as “similar”, not “identical”.

Q: Should I just save the output instead of trying to reproduce it?

For finished work, yes, save the PNG. Reproducibility matters when you want to iterate (change one knob and see the effect cleanly) or when a client asks for a new variant of a year-old image.

Q: Does increasing steps make output more deterministic?

No. More steps reduce noise but do not make the seed-to-image function more stable. Determinism is a property of the whole pipeline configuration, not the step count.

External references: Hugging Face Diffusers — reproducibility guide and Midjourney Seeds documentation.
