You prompt “a woman in a red hat next to a blue car” and the model produces a woman in a red hat next to a red car. Or “a wooden chair and a metal table” produces two wooden objects. This is “attribute bleeding”, one of the most common failures in Stable Diffusion, SDXL, and Flux. The cause is that the text encoder contextualizes every token against the whole prompt, so the color “red” can bleed onto nearby objects.
The fix is one of three things: anchor each attribute to its subject with attention weighting, separate subjects into independent attention zones with BREAK syntax or regional prompting, or use ControlNet / segmentation to constrain where each attribute applies.
Common causes
Ordered by how often each one turns out to be the culprit, most common first.
1. Subjects are too close together in the prompt
When you write a red hat and a blue car, the tokens red, hat, blue, car all live in the same 77-token CLIP context. The cross-attention map cannot reliably separate which color binds to which noun, especially in SD 1.5 and SDXL base. Whichever attribute appears first often “wins” across the whole image.
How to spot it: count subjects in your prompt. If you have 2+ subjects with different colors / materials within a single sentence, expect bleed.
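The check above can be roughly automated. The sketch below is only a heuristic: it uses distinct color and material words as a stand-in for "subjects with different attributes", and the `ATTRIBUTE_WORDS` vocabulary and both function names are hypothetical, not part of any webUI.

```python
import re

# Hypothetical, deliberately small vocabulary; extend it for your own prompts.
ATTRIBUTE_WORDS = {
    "red", "blue", "green", "yellow", "black", "white", "pink",
    "wooden", "metal", "glass", "leather", "plastic",
}


def distinct_attributes(text: str) -> set:
    """Return the distinct color/material words found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w in ATTRIBUTE_WORDS}


def bleed_risk(prompt: str) -> bool:
    """Flag prompts that place 2+ different attributes in one sentence."""
    # Split only on sentence punctuation followed by whitespace or the end,
    # so the dot inside a weight such as 1.3 does not break a sentence.
    sentences = re.split(r"[.!?]+(?:\s+|$)", prompt)
    return any(len(distinct_attributes(s)) >= 2 for s in sentences)


print(bleed_risk("a woman in a red hat next to a blue car"))  # True
print(bleed_risk("a woman in a red hat"))                     # False
```

A `True` result only means the prompt matches the risky pattern; it says nothing about whether a given seed will actually bleed.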
2. No attention weighting on the binding word
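The model treats “red hat” and “red, hat” very differently: the comma loosens the link between the attribute and its noun. Without a weight bump on the binding pair, no pair stands out and the attributes tend to blend across subjects.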
How to spot it: prompt has no parentheses or weights around the attribute-subject pairs.
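A quick way to run this check is to list the explicitly weighted pairs in a prompt. This is a minimal sketch assuming the Automatic1111-style `(text:weight)` form; it ignores bare parentheses and nested groups, and the function name is made up for illustration.

```python
import re

# Matches the explicit "(text:1.3)" form; bare "(text)" emphasis is not counted.
WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def weighted_pairs(prompt: str) -> list:
    """Return (text, weight) tuples for every explicitly weighted span."""
    return [(text.strip(), float(w)) for text, w in WEIGHTED.findall(prompt)]


print(weighted_pairs("(red hat:1.3), woman, standing next to (blue car:1.3)"))
# [('red hat', 1.3), ('blue car', 1.3)]
print(weighted_pairs("a red hat and a blue car"))
# []
```

An empty list for a multi-subject prompt is the signal described above.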
3. Prompt is too long
CLIP encodes at most 77 tokens at a time (75 usable plus start and end markers). As a prompt approaches that limit, roughly past 60-70 tokens, distant tokens lose their attention precision and bleed more aggressively.
How to spot it: paste your prompt into a token counter. If it is above 60 tokens, length is a likely contributor.
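If no token counter is at hand, a rough estimate is often enough to see which side of the budget a prompt is on. The sketch below only approximates how CLIP splits text before its byte-pair step (letter runs, single digits, punctuation runs), so it is not the real tokenizer and usually undercounts, because rare words break into several tokens. The function names and the default limit of 60 are assumptions taken from the guidance above.

```python
import re

# Approximation only: letter runs, single digits, and punctuation runs.
PIECES = re.compile(r"[a-z]+|\d|[^\sa-z\d]+")


def estimate_tokens(prompt: str) -> int:
    """Rough token estimate; the real CLIP count is usually a bit higher."""
    return len(PIECES.findall(prompt.lower()))


def over_budget(prompt: str, limit: int = 60) -> bool:
    """True when the estimate already exceeds the chosen token budget."""
    return estimate_tokens(prompt) > limit


prompt = "a woman in a red hat next to a blue car"
print(estimate_tokens(prompt))  # 11
print(over_budget(prompt))      # False
```

For an exact figure, use the token counter built into your webUI or a real CLIP tokenizer.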
4. SDXL base without refiner
SDXL base alone is more prone to bleeding than SDXL base + refiner. A refiner pass at low denoise can clean up attribute boundaries and reduce some bleed.
How to spot it: run the same prompt and seed with the refiner enabled and disabled. If bleeding drops with the refiner on, you needed the refiner.
5. No regional control
In a complex scene with 3+ distinct subjects, no amount of prompt engineering will reliably separate attributes. You need a spatial constraint (regional prompting or ControlNet segmentation).
How to spot it: 3 or more subjects, each with their own colors and materials.
Shortest path to fix
Step 1: Weight the attribute-subject pair
Automatic1111, Forge, and ComfyUI all support attention weighting for SD 1.5 and SDXL. Wrap each attribute-subject pair in parentheses with a weight:
(red hat:1.3), woman wearing it, standing next to (blue car:1.3)
The colon-weight syntax scales the influence of those tokens on cross-attention, so the attribute and its noun are emphasized together as a unit. Weights between 1.1 and 1.4 work best; above 1.5 the image tends to distort.
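When prompts are assembled by a script, a small helper keeps the weighting consistent. This is a minimal sketch; `weighted` is a hypothetical helper name, and the 1.1-1.4 check simply mirrors the range recommended above as a soft warning.

```python
def weighted(text: str, weight: float = 1.3) -> str:
    """Wrap an attribute-subject pair in (text:weight) syntax."""
    if not 1.1 <= weight <= 1.4:
        # Soft warning only; the range comes from the guidance above.
        print(f"warning: weight {weight} is outside the 1.1-1.4 range")
    return f"({text}:{weight:g})"


prompt = ", ".join([
    weighted("red hat"),
    "woman wearing it",
    "standing next to " + weighted("blue car"),
])
print(prompt)
# (red hat:1.3), woman wearing it, standing next to (blue car:1.3)
```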
Step 2: Use BREAK to separate attention zones
Automatic1111 and Forge support the BREAK keyword for SD 1.5 and SDXL (ComfyUI needs the right node). BREAK ends the current token chunk, so the text after it is encoded separately:
(red hat:1.2), woman portrait, soft daylight
BREAK
(blue car:1.2), parked behind her, glossy paint
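Each BREAK block is encoded in its own 77-token CLIP window, so “red” is not mixed with “car” inside the text encoder. The image model still attends to every block, so expect less bleed rather than none.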
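For scripted prompts, it can help to keep one subject per segment and join the segments only at the end. The sketch below is plain string handling under that assumption; both function names are hypothetical, and nothing here calls a webUI.

```python
import re


def join_with_break(segments: list) -> str:
    """Join non-empty prompt segments with BREAK on its own line."""
    return "\nBREAK\n".join(s.strip() for s in segments if s.strip())


def split_on_break(prompt: str) -> list:
    """Split a prompt back into its BREAK-separated segments."""
    return [p.strip() for p in re.split(r"\bBREAK\b", prompt) if p.strip()]


segments = [
    "(red hat:1.2), woman portrait, soft daylight",
    "(blue car:1.2), parked behind her, glossy paint",
]
prompt = join_with_break(segments)
print(prompt)
print(split_on_break(prompt) == segments)  # True
```

Splitting an existing prompt this way makes it easy to check that each segment carries only one attribute-subject pair.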
Step 3: Try Attend-and-Excite or regional prompting
For tougher cases:
- Attend-and-Excite (research extension): strengthens cross-attention on the subject tokens you pick during sampling, so none of them is neglected
- Regional Prompter extension (Automatic1111): paint masks for “left half” and “right half” of the image and assign different prompts to each region
- ComfyUI regional: use Conditioning Combine + Conditioning Set Mask to constrain each subprompt to a spatial mask
Example regional prompt in Regional Prompter (split by columns):
ADDCOMM
woman portrait, soft daylight
ADDBASE
(red hat:1.3), woman, urban street
ADDCOL
(blue car:1.3), parked, glossy paint
Step 4: ControlNet segmentation for hard cases
When subjects must occupy specific regions:
- Sketch a quick segmentation map (woman on left = pink, car on right = green) in any paint tool
- Load it into ControlNet with seg preprocessor or ADE20K style
- Use ControlNet weight 0.7-1.0
- Prompt each subject in its segmented region using regional prompting on top
This combines spatial control (where each subject sits) with attribute control (what each subject looks like).
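If you would rather script the map than paint it, a two-zone image is simple to generate. The sketch below builds a left/right split as a binary PPM using only the standard library. The function names and the pink and green values are placeholders, not a real palette: a segmentation ControlNet trained on ADE20K expects that dataset's class colors, so look those up before relying on the result.

```python
def two_zone_map(width: int, height: int, left_rgb, right_rgb) -> bytes:
    """Return a binary PPM (P6) image split into left and right color zones."""
    half = width // 2
    # One row of pixels: left zone first, then right zone.
    row = bytes(left_rgb) * half + bytes(right_rgb) * (width - half)
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + row * height


def save_map(path: str, data: bytes) -> None:
    """Write the map to disk; convert to PNG with any image tool if needed."""
    with open(path, "wb") as f:
        f.write(data)


# Placeholder colors (woman on the left, car on the right).
PINK, GREEN = (255, 105, 180), (0, 200, 0)
data = two_zone_map(512, 512, PINK, GREEN)
print(len(data))  # header bytes + 512 * 512 * 3 pixel bytes
```

Calling `save_map("seg_map.ppm", data)` writes the file. The same two zones can double as the left and right masks for the regional prompt.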
Step 5: Shorten and split if all else fails
If the prompt is over 60 tokens, cut it. The cleanest workaround for stubborn bleeding is to generate the scene in two passes:
- Generate the first subject alone with full attention budget
- Outpaint or inpaint the second subject into the scene
This avoids the bleeding problem entirely because each subject is generated in its own context.
Prevention
- Default to attention weights for any prompt with 2+ subjects that have different colors or materials
- Keep core prompts under 60 tokens; move style words to a separate suffix
- Save a Regional Prompter preset for “left subject + right subject” splits
- For product photography with multiple SKUs, render each SKU separately and composite in Photoshop / Affinity
- Test new prompts at 4 seeds; if bleeding shows in 2+, restructure rather than reroll
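The seed test in the last bullet reduces to a simple threshold. This is a sketch assuming you review each image by hand and record whether it bleeds; the function name and seed values are hypothetical.

```python
def should_restructure(bleed_by_seed: dict, threshold: int = 2) -> bool:
    """True when enough seeds show bleeding that rerolling is not worth it."""
    return sum(bool(v) for v in bleed_by_seed.values()) >= threshold


# Hypothetical results from reviewing four seeds by eye.
results = {101: True, 102: False, 103: True, 104: False}
print(should_restructure(results))  # True
```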