AI Image Glasses Reflections Don't Match the Scene

A generated portrait with glasses shows reflections of a window, sky, or environment that doesn't exist in the rendered scene. Here is why diffusion models hallucinate reflections and how to align them.

You generated a studio portrait. Black backdrop, single softbox key light, subject wearing glasses. The portrait looks great until you look at the lenses: one lens reflects what appears to be a window with daylight; the other reflects an entirely different scene, maybe a bookshelf, maybe trees. Neither matches the actual rendered environment. The two reflections also don’t match each other, which immediately breaks the photo’s believability. The same problem shows up in water surfaces, mirrors, polished metal, and even pupil catchlights.

This is the model failing at a specific physics constraint: a reflection must be a transformation of what’s actually in the scene, applied consistently across paired reflective surfaces. Diffusion models don’t know about light physics — they pattern-match what eyeglass reflections “usually look like” in their training data, which is full of windows and softboxes.

Common causes

1. Model lacks a scene-coherence prior for reflective surfaces

Diffusion models denoise the whole image jointly, but nothing in the process ties distant regions to the same physical scene. The left lens and the right lens come from the same latent with no explicit constraint that they show the same reflection. The training data has glasses with window reflections, so each lens effectively samples “what reflection might appear here” on its own.

How to spot it: Cover the rest of the image and just look at the two lenses. If they look like they came from two different photos, the model is sampling them independently.
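
The visual check above can be roughly automated. The following is a minimal sketch, assuming you have already cropped each lens to a non-empty grayscale grid (a list of rows of 0-255 values, for example exported from an image library). `brightness_histogram` and `lens_mismatch` are hypothetical helper names, and the 8-bin histogram is an arbitrary choice. It compares brightness distributions rather than individual pixels, because the same reflection usually sits at a slightly different position in each lens.

```python
def brightness_histogram(grid, bins=8):
    """Return a normalized histogram of 0-255 brightness values."""
    counts = [0] * bins
    total = 0
    for row in grid:
        for value in row:
            # Map 0-255 onto bin indices 0..bins-1.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def lens_mismatch(left_grid, right_grid, bins=8):
    """Score from 0.0 (same brightness distribution) to 1.0 (disjoint)."""
    left = brightness_histogram(left_grid, bins)
    right = brightness_histogram(right_grid, bins)
    # Half the L1 distance between two normalized histograms lies in [0, 1].
    return sum(abs(a - b) for a, b in zip(left, right)) / 2


# Toy crops: a dark lens with one bright highlight vs. a bright "window" lens.
dark_lens = [[10, 10, 240], [10, 10, 10]]
window_lens = [[200, 220, 240], [210, 230, 250]]
print(lens_mismatch(dark_lens, dark_lens))    # identical crops
print(lens_mismatch(dark_lens, window_lens))  # very different crops
```

A score near 0 suggests the two lenses plausibly reflect the same thing; a score near 1 suggests they came from "two different photos". Where to draw the line is a judgment call.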

2. Background prompt and reflection prompt are contradictory

Your prompt says “studio backdrop, seamless black, dramatic lighting.” But the model has seen far more “person wearing glasses” images with window reflections than with black-studio reflections, so the prior overrides your prompt.

How to spot it: The reflection content is something common in training data (window, sky, trees) regardless of what you asked for in the scene.

3. CFG too low — prompt isn’t enforced strongly on small regions

At low CFG (3-5), the model leans on its prior for small detail regions like lenses. The big shapes obey the prompt; the small reflective surfaces don’t.

How to spot it: Re-run the same seed at CFG 7-9. If the reflections become more controllable (even as big shapes get harsher), low CFG was the cause.
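
For context, classifier-free guidance combines an unconditional and a prompt-conditioned noise prediction at every denoising step. Here is a minimal sketch of that combination, with plain lists standing in for the model's noise tensors. `apply_cfg` is a hypothetical name, not a library function, and the numbers are made up.

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: push the prediction along (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 0.5]   # what the prior predicts without the prompt
cond = [0.1, 0.5]     # what the prompt-conditioned pass predicts

low = apply_cfg(uncond, cond, 4.0)
high = apply_cfg(uncond, cond, 8.0)
print(low, high)
```

Where the two predictions agree (the second element), the scale changes nothing. Where they differ, a higher scale amplifies the prompt's contribution. That is one plausible reading of why small regions, where the prompt's pull is weak, respond to a CFG bump.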

4. No explicit token for reflection content

You said “wearing glasses” but never described what should be reflected. The model fills the void with the statistical average, which is daylight windows.

How to spot it: Your prompt has no mention of “reflection” or “lens” content at all.

5. Frame-style glasses lenses are too small for coherent rendering

Round John-Lennon frames or thin rectangular lenses give the model 20-40 pixels per lens at 1024x1024. There isn’t enough room to render a coherent reflected scene, so it falls back to “vaguely bright.”

How to spot it: Larger aviator or oversized frames in the same setup work fine; small lenses break.
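
The arithmetic behind this: Stable Diffusion 1.5 and SDXL denoise in a latent space that the VAE downsamples 8x per side, so a small lens occupies only a handful of latent cells. Here is a quick sketch, reading the 20-40 pixel figure as lens width. `latent_cells` is a hypothetical helper, and the 8x factor is an assumption for any other model family.

```python
def latent_cells(pixels, vae_factor=8):
    """Number of latent cells covering a span of `pixels` image pixels."""
    return pixels / vae_factor


# Small round lens, thin rectangular lens, oversized aviator lens (widths in px).
for lens_width in (20, 40, 120):
    print(lens_width, "px ->", latent_cells(lens_width), "latent cells")
```

A lens that spans 2.5 to 5 latent cells leaves very little room for scene content, which is consistent with the "vaguely bright" fallback.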

6. Upscaler hallucinated new content in the lens region

Base generation had plausible neutral reflections. Then your upscaler — especially an AI upscaler with denoise > 0.3 — invented entirely new reflection content during the second pass.

How to spot it: Compare base output to upscaled output. If the lens region content changed substantively, the upscaler did it.
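
Here is a rough way to automate this comparison. It assumes the lens crop from each image is available as a grayscale grid (a list of rows of 0-255 values) and that the upscale factor is an integer. `box_downsample` and `lens_drift` are hypothetical helpers, and any threshold you apply to the score is a judgment call.

```python
def box_downsample(grid, factor):
    """Average each factor x factor block to bring a crop back to base size."""
    out = []
    for y in range(0, len(grid) - factor + 1, factor):
        row = []
        for x in range(0, len(grid[0]) - factor + 1, factor):
            block = [grid[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def lens_drift(base_crop, upscaled_crop, factor):
    """Mean absolute brightness change (0-255) between base and upscaled crop."""
    small = box_downsample(upscaled_crop, factor)
    diffs = [abs(b - s)
             for base_row, small_row in zip(base_crop, small)
             for b, s in zip(base_row, small_row)]
    return sum(diffs) / len(diffs)


base = [[20, 20], [20, 200]]
faithful = [[20, 20, 20, 20], [20, 20, 20, 20],
            [20, 20, 200, 200], [20, 20, 200, 200]]
invented = [[250, 250, 20, 20], [250, 250, 20, 20],
            [20, 20, 20, 20], [20, 20, 20, 20]]
print(lens_drift(base, faithful, 2))  # 0.0: content preserved
print(lens_drift(base, invented, 2))  # large: highlight moved and changed
```

Unlike comparing the two lenses with each other, the base and upscaled crops should line up pixel for pixel, so a direct per-pixel difference is a fair test here.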

7. Glasses are an inpainted addition

You generated a portrait without glasses, then inpainted glasses on later. Inpainting has even less context about the surrounding scene and is prone to inventing window reflections out of thin air.

How to spot it: Workflow involved a separate glasses-inpaint step.

Shortest path to fix

Step 1: Add explicit reflection-content tokens

# instead of just "wearing glasses"
wearing glasses, lens reflections show studio softbox,
matching catchlights in eyes and lenses, no window reflection

You’re forcing the prior away from “window” by naming what you do want plus what you don’t.

Step 2: Add a strong negative for unwanted reflection content

negative: window reflection in glasses, sky in lenses,
trees reflected in glasses, mismatched lens reflections,
double reflection

These specific negatives bite harder than generic ones.
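
Steps 1 and 2 can be kept together as a reusable template. Here is a minimal sketch in Python. `build_prompts` is a hypothetical helper, and the returned keys `prompt` and `negative_prompt` are chosen to match the argument names used by Hugging Face diffusers pipelines, which is an assumption about your toolchain.

```python
REFLECTION_NEGATIVES = [
    "window reflection in glasses",
    "sky in lenses",
    "trees reflected in glasses",
    "mismatched lens reflections",
    "double reflection",
]


def build_prompts(subject, light_source="studio softbox"):
    """Return positive and negative prompts that name the reflection content."""
    positive = (
        f"{subject}, wearing glasses, "
        f"lens reflections show {light_source}, "
        "matching catchlights in eyes and lenses, no window reflection"
    )
    return {"prompt": positive, "negative_prompt": ", ".join(REFLECTION_NEGATIVES)}


prompts = build_prompts("studio portrait, seamless black backdrop")
print(prompts["prompt"])
print(prompts["negative_prompt"])
```

Keeping the light source as a parameter means the reflection tokens change whenever the described lighting changes, so the two cannot drift apart.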

Step 3: Bump CFG slightly

Move from 5 → 7 for the generation. Higher CFG enforces your reflection-content tokens on small regions. If the overall image gets too punchy at 7, keep CFG 7 but reduce denoise on hires fix.
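
To see where the reflections start obeying the prompt, it helps to sweep CFG on a fixed seed so that nothing else changes. This sketch only builds the settings. `cfg_sweep` is a hypothetical helper, and how each dict reaches your generator (for example as `guidance_scale` in a diffusers pipeline call) depends on your setup.

```python
def cfg_sweep(seed, values=(5.0, 6.0, 7.0, 8.0)):
    """One settings dict per CFG value, all sharing the same seed."""
    return [{"seed": seed, "guidance_scale": v} for v in values]


for settings in cfg_sweep(seed=1234):
    print(settings)
```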

Step 4: Use a depth ControlNet or reference image for the actual scene

Provide a depth map or simple reference image showing your studio setup (e.g., a softbox to camera-left). The model then has a geometric reference for the scene, which makes a matching reflection more likely but does not guarantee one.

controlnet: depth, weight 0.7, end_step 0.6
reference: studio_setup_diagram.png

Step 5: Inpaint the lenses with a coherent prompt for both

After base generation, mask both lenses together (not separately!), and inpaint with denoise 0.5 and a prompt focused on coherent reflection:

both lenses showing the same soft studio reflection,
symmetric subtle highlight from above, no window,
no separate scene per lens

Masking both at once and prompting “same reflection on both lenses” is the single highest-leverage step.
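
Building the single mask is simple to script. Here is a minimal sketch, assuming you know each lens's bounding box in pixel coordinates. `combined_lens_mask` is a hypothetical helper that returns a plain grid where 255 marks pixels to repaint, and converting that grid into whatever mask format your inpainting tool expects is left to you.

```python
def combined_lens_mask(width, height, lens_boxes):
    """Build one mask (255 = repaint) covering every lens box.

    Each box is (left, top, right, bottom) with right/bottom exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in lens_boxes:
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                mask[y][x] = 255
    return mask


# Hypothetical lens positions in a tiny 12x4 image.
mask = combined_lens_mask(12, 4, [(1, 1, 5, 3), (7, 1, 11, 3)])
for row in mask:
    print("".join("#" if v else "." for v in row))
```

Both lenses end up in the same mask, so a single inpaint pass sees them together and one prompt covers both.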

Step 6: For final polish, manually paint over rogue reflections

For client work, accept that 30 seconds in any image editor (clone-stamping a clean lens over a rogue lens) gives more reliable results than another generation round. Use the model to get 90% there, then paint.

Step 7: If using an upscaler, lower denoise in the lens region

If your workflow allows region-specific denoise on upscale (e.g., Ultimate SD Upscale with mask), set lens region to denoise 0.15 instead of 0.4. Less invention, more preservation.
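
If your upscaler works tile by tile and has no mask input, a coarser variant of the same idea is to choose the denoise per tile. This swaps the mask for a per-tile decision and is only a sketch: `boxes_overlap` and `tile_denoise` are hypothetical helpers, the lens boxes are made-up coordinates, and wiring the result into a real upscaler is not shown.

```python
def boxes_overlap(a, b):
    """True if two (left, top, right, bottom) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def tile_denoise(tile_box, lens_boxes, lens_denoise=0.15, default_denoise=0.4):
    """Pick a lower denoise for any upscale tile that touches a lens."""
    if any(boxes_overlap(tile_box, lens) for lens in lens_boxes):
        return lens_denoise
    return default_denoise


lenses = [(400, 380, 480, 430), (540, 380, 620, 430)]  # hypothetical positions
print(tile_denoise((0, 0, 512, 512), lenses))        # tile contains the left lens
print(tile_denoise((512, 512, 1024, 1024), lenses))  # tile below both lenses
```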

When this is not on you

Stable Diffusion 1.5 and SDXL base have essentially no understanding of geometric reflection. Even with perfect prompting you’ll get plausible but non-physical results. If your project demands physically correct reflections (commercial work, product shots), use 3D rendering or compositing instead.

Also, the human eye is pattern-trained to notice broken reflections in eyeglasses specifically (we look at faces with glasses constantly), and you have been scrutinizing this image far longer than any viewer will. What feels obviously wrong to you may not register with most viewers, so pick your battles.

Easy to misdiagnose as

“Eyes misaligned” — broken catchlights in pupils look similar to broken reflections in lenses, but the fix is different. Pupil catchlights respond to light-direction tokens; lens reflections respond to content tokens. Don’t apply pupil fixes to lens problems or vice versa.

Also similar to “background bleeds onto subject” — both involve unwanted scene content showing where it shouldn’t. The bleeding fix (better masking, lower CFG on edges) doesn’t help here; this is a generation-level prior, not a compositing issue.

Prevention

  • Always include explicit reflection-content tokens when glasses or any reflective surface is in the scene.
  • Default negative-prompt template should include “window reflection in glasses, mismatched lens reflections” for portrait work.
  • Inpaint both lenses together with one mask, never one at a time.
  • For commercial portraits, do a final manual cleanup pass over lenses.
  • Avoid adding glasses via late-stage inpainting; include them in the base prompt.
  • For studio shoots, reference your actual lighting setup via depth controlnet.

FAQ

  • Why do the two lenses sometimes show entirely different scenes? The two lens regions are sampled semi-independently from the same latent, with no explicit cross-region coherence loss. Inpainting both together with one mask gives the model the opportunity to coordinate them.
  • Will this be fixed in future models? Probably yes. Newer models (Imagen 3, GPT-4o image generation, FLUX) handle reflections noticeably better than SD1.5/SDXL. If reflections are critical, evaluate those instead.

Tags: #ai-image #Troubleshooting #Image generation #reflections #glasses #Portrait