You prompt 'storefront with a sign that says "OPEN"' and the model produces a sign that says OPEM, 0PEN, or worse: letters that look vaguely like Cyrillic, Greek, or invented glyphs. The same prompt with Chinese, Japanese, or Arabic text produces even worse results. Most users assume their prompt was wrong, but this is fundamentally a model-capability issue: SD 1.5, SDXL base, and Midjourney v4-v5 simply weren’t trained on aligned image-text pairs for in-image typography. They learned the shape of “letter-like marks,” not actual writing.
The fix is to switch to a text-capable model (Flux, DALL-E 3, Ideogram), shorten the text to 1-2 words in quotes, or use a non-text model for the scene and add text in post via Canva/Figma/Photoshop.
Common causes
Ordered by hit rate, highest first.
1. Pre-Flux model with no text training
SD 1.5, SD 2.1, SDXL base/refiner, Midjourney v4-v5, DALL-E 2, and older models cannot render legible text reliably. Their training labels didn’t ground “OPEN” → the four characters O-P-E-N. They learned “sign-shaped object with letter-like marks.”
How to spot it: you’re on any of those models. Even short, common English words come out garbled.
2. Text request is too long
Even text-capable models break down past 4-6 words. "GRAND OPENING TODAY 50% OFF" will fail almost everywhere. The longer the string, the more chances each character has to drift.
How to spot it: text in prompt is more than 6 words or 30 characters total.
3. Non-Latin scripts on Latin-trained models
This happens when you ask for Chinese characters, Japanese kanji/kana, Arabic, Hebrew, Thai, or Devanagari on a model that was trained mostly on English signs. The output will be invented glyphs, even on Flux.
How to spot it: requested text is non-Latin script. Output looks vaguely like the script but isn’t actual characters.
4. Text not quoted
'a sign that says OPEN' is ambiguous because OPEN is also a regular English word; 'a sign that says "OPEN"' is more explicit. Without quotes, even text-capable models sometimes interpret the word semantically rather than as glyphs.
How to spot it: prompt has the text content but no quotes around it.
5. Style LoRA distorts typography
Heavy painterly / anime / sketch LoRAs warp letterforms by design. They were trained on stylized artwork where text legibility was not a goal.
How to spot it: same prompt without the LoRA produces cleaner (though still imperfect) text.
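The prompt-level checks for causes 2-4 can be automated before you spend a generation. The following is a minimal sketch, not a definitive tool: `lint_prompt` and its helpers are hypothetical names, and the 6-word / 30-character thresholds are simply the rules of thumb given above. Causes 1 and 5 depend on the model and LoRA you have loaded, so they are not covered here.

```python
import re
import unicodedata

# Rules of thumb from the "How to spot it" notes; tune them for your model.
MAX_WORDS = 6
MAX_CHARS = 30


def extract_quoted(prompt: str) -> list[str]:
    """Return every span wrapped in straight or curly double quotes."""
    return re.findall(r'["“]([^"”]+)["”]', prompt)


def has_non_latin_letters(text: str) -> bool:
    """True if any alphabetic character is outside the Latin script."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in text
    )


def lint_prompt(prompt: str) -> list[str]:
    """Hypothetical helper: flag causes 2-4 (length, script, missing quotes)."""
    warnings = []
    quoted = extract_quoted(prompt)
    if not quoted:
        warnings.append("no quoted text: wrap the literal sign text in double quotes")
    for text in quoted:
        words = len(text.split())
        if words > MAX_WORDS or len(text) > MAX_CHARS:
            warnings.append(f"text too long ({words} words, {len(text)} chars): {text!r}")
    if has_non_latin_letters(prompt):
        warnings.append("non-Latin script detected: plan to add this text in post")
    return warnings
```

An empty list means none of the three prompt-level causes were detected; it does not mean the model itself can render text.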
Shortest path to fix
Step 1: Switch to a text-capable model
The currently reliable models for in-image text:
- FLUX.1 [dev] or [pro]: best for English text among open-weight models ([dev] is the open-weight release; [pro] is API-only)
- DALL-E 3 (via ChatGPT or Bing Image Creator): very strong on short English
- Ideogram 2.0: purpose-built for text, handles short paragraphs well
- Midjourney v6: much better than v5, still imperfect on 4+ words
- GPT Image (gpt-image-1): good for short banner-style text
For text-heavy work, Ideogram and DALL-E 3 are the safest bets. For artistic style + short text, Flux dev wins.
Step 2: Keep text short and quoted
Rewrite the prompt:
# Bad
"storefront with a giant sign that says GRAND OPENING TODAY"
# Better
'storefront with a sign that says "OPEN"'
# Best for long copy
'storefront with a sign that says "OPEN" in bold letters, smaller sign below'
(generate the smaller sign as a separate gen or in post)
Always wrap the literal text in double quotes. Limit to 1-3 words. If you need more, split across multiple visual elements.
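If you build prompts programmatically, the quoting and splitting rules can be applied in one place. This is a rough sketch under stated assumptions: `split_copy` and `build_prompts` are hypothetical helpers, the prompt template is just the wording used in the examples above, and the split is a naive word count that ignores phrase boundaries, so review the chunks by hand.

```python
def split_copy(copy: str, max_words: int = 3) -> list[str]:
    """Split sign copy into chunks of at most max_words words each."""
    words = copy.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def build_prompts(scene: str, copy: str, max_words: int = 3) -> list[str]:
    """Hypothetical helper: one prompt per chunk, text always in double quotes."""
    return [
        f'{scene} with a sign that says "{chunk}"'
        for chunk in split_copy(copy, max_words)
    ]


# Each prompt is meant for a separate generation (or for text added in post).
for prompt in build_prompts("storefront", "GRAND OPENING TODAY 50% OFF"):
    print(prompt)
```

Each resulting prompt carries at most three quoted words, which matches the limit suggested above.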
Step 3: Non-Latin scripts — generate scene, add text in post
If you need Chinese, Japanese, Arabic, etc., do not ask the image model to render the script:
- Generate the scene with a blank or placeholder sign ('sign that says "SIGN"' or 'blank rectangular sign')
- Open in Figma, Canva, or Photoshop
- Add the real text on top using a real font in that script
- Match perspective with Photoshop’s Edit > Transform > Perspective or Figma’s text-on-path
- Match lighting with a tonal adjustment layer
Fonts that match in-image perspective convincingly: any clean sans-serif (Noto Sans for any script, PingFang for Chinese, Hiragino for Japanese, Cairo for Arabic).
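For batch work, the font suggestions above can be turned into a lookup keyed on the script of the text. The sketch below is an assumption-laden illustration: `pick_font` is a hypothetical helper, it returns family names only (whether those fonts are installed is up to you), and it cannot tell kanji-only Japanese from Chinese because both use the same CJK ideographs.

```python
import unicodedata


def pick_font(text: str) -> str:
    """Hypothetical helper: suggest a font family from the list above."""
    names = [unicodedata.name(ch, "") for ch in text]
    # Kana only occurs in Japanese, so check it before generic CJK ideographs.
    if any(n.startswith(("HIRAGANA", "KATAKANA")) for n in names):
        return "Hiragino"
    if any(n.startswith("CJK") for n in names):
        return "PingFang"
    if any(n.startswith("ARABIC") for n in names):
        return "Cairo"
    # Fallback for Latin and every script not listed explicitly.
    return "Noto Sans"
```

For kanji-only Japanese text, override the result manually rather than trusting the "PingFang" suggestion.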
Step 4: Use a Midjourney + post combo
For artistic illustration where you want MJ’s aesthetic but legible text:
1. Generate the scene in Midjourney with a deliberately blank sign
2. Upscale and export
3. Add text in Canva using a hand-drawn or sketched font
4. Slightly distort and offset to match the painterly background
This nets you MJ’s artistry with real typography.
Step 5: For Flux, weight the text prompt
Flux responds well to prompts that put extra emphasis on the text through explicit, descriptive cues. Try:
'a vintage diner sign with the text "EAT" in bold red letters,
clear legible letterforms, sharp typography, no gibberish, no fake letters,
1950s neon sign style'
Adding 'clear legible letterforms' and 'no gibberish, no fake letters' as direct anti-cues tends to raise the success rate.
Prevention
- Keep a “text-capable” preset in your model rotation (Flux dev local, Ideogram on web)
- For any deliverable with text, always plan a 2-stage workflow: image first, text in post
- Build a Figma / Canva template per format (storefront, poster, banner) with adjustable text layers
- Never use SD 1.5 or SDXL base when text is critical — switch models even for one shot
- For non-Latin scripts, assume post-production from day one and budget time for it