What this covers
Midjourney is the most opinionated image generator on the market - which is great when you want a “look” and frustrating when you want control. This guide is the cheat sheet that gets a brand-new user from staring at a blank prompt box to producing usable images in under 30 minutes: the prompt formula, the 5 parameters that actually matter, and the iteration loop pros use.
Key tools and concepts:
- Midjourney - A leading AI image generator known for stylized, high-quality output, controlled through prompts plus suffix parameters like --ar, --style, --sref.
Who this is for
New Midjourney users on either Discord or the new web app at midjourney.com. No prior image-prompt experience needed, though a basic mental model of “subject + style + lighting” helps.
When to reach for it
When you’ve signed up, paid for the basic plan, and the first 10 prompts produced exactly the cliche stock-photo look the internet warned you about. This guide is the second 10 prompts.
Before you start
- Subscribe to at least the Basic plan ($10/month) - the free trial credits run out fast and don’t include the fastest model versions.
- Use the web app (midjourney.com/explore) rather than Discord if you’re new - the UI is dramatically easier to iterate in.
- Decide your output use case first: thumbnail, hero banner, character ref, mood board. Each wants different aspect ratios and stylization.
- Open Explore and find 3 images you like the look of. You’ll use these as style references via --sref.
Step by step
- Write the prompt as one sentence: subject + style + lighting + lens. Example: bookstore at golden hour, warm window light, 35mm photographic, shallow depth of field.
- Add aspect ratio with --ar. --ar 16:9 for hero, --ar 9:16 for vertical, --ar 1:1 for thumbnail. Default is square.
- Use --style raw to dial down stylization when you want photographic output. Without it, Midjourney pushes toward illustrative.
- Iterate by changing one variable per re-roll. If results are too saturated, drop “vivid” and add “muted palette” - don’t also change the subject.
- Use --sref [URL] for style transfer from an existing image (yours or from Explore). This is the fastest way to find a consistent look.
- Upscale only when you’ve picked a final candidate. Upscaling is credit-expensive and locks in the composition.
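The first steps above can be sketched as a small string helper. This is only an illustration: build_prompt and its argument names are hypothetical, not part of any Midjourney tool, and the function does nothing except format text you would paste into the prompt box.

```python
def build_prompt(subject, style, lighting, lens, ar="1:1", raw=False, sref=None):
    """Join subject + style + lighting + lens, then append suffix parameters.

    Hypothetical helper: it only builds a string and does not talk to Midjourney.
    """
    # Keep the four descriptive parts in the order of the formula, skipping blanks.
    parts = [p.strip() for p in (subject, style, lighting, lens) if p and p.strip()]
    text = ", ".join(parts)

    # Always emit --ar so the square default is never used by accident.
    params = [f"--ar {ar}"]
    if raw:
        params.append("--style raw")
    if sref:
        params.append(f"--sref {sref}")
    return f"{text} {' '.join(params)}"


prompt = build_prompt(
    subject="bookstore at golden hour",
    style="35mm photographic",
    lighting="warm window light",
    lens="shallow depth of field",
    ar="16:9",
    raw=True,
)
print(prompt)
```

Keeping the descriptive text and the parameters as separate arguments makes it easier to change one of them per re-roll without retyping the rest.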
The 5 parameters that matter
- --ar W:H - aspect ratio. Always set this; defaults are square.
- --style raw - less Midjourney-house-style, more photographic. Use for product shots and realism.
- --stylize N (or --s N) - 0-1000. Higher = more artistic license. Default 100. For brand work, try 50-150; for art, 250-500.
- --sref URL - style reference. Locks aesthetic across a series without rewriting the prompt every time.
- --cref URL - character reference. Keeps a person/character consistent across images. Use with --cw 100 for strong adherence.
Everything else (--chaos, --weird, --tile) is niche; ignore on first read.
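One way to keep these five parameters straight is to hold them in a small structure and render the suffix from it. The sketch below is an assumption-laden illustration: the Params class and its field names are hypothetical, and the only checks are the ones stated in this guide (a W:H aspect ratio and a --stylize value between 0 and 1000).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Params:
    """Hypothetical container for the five parameters described above."""
    ar: str = "1:1"
    raw: bool = False
    stylize: int = 100          # the guide gives 100 as the default
    sref: Optional[str] = None  # style reference URL
    cref: Optional[str] = None  # character reference URL
    cw: Optional[int] = None    # only emitted together with --cref

    def to_suffix(self) -> str:
        if not re.fullmatch(r"\d+:\d+", self.ar):
            raise ValueError(f"aspect ratio must look like W:H, got {self.ar!r}")
        if not 0 <= self.stylize <= 1000:
            raise ValueError("--stylize must be between 0 and 1000")

        parts = [f"--ar {self.ar}"]
        if self.raw:
            parts.append("--style raw")
        if self.stylize != 100:  # skip the flag when it matches the default
            parts.append(f"--stylize {self.stylize}")
        if self.sref:
            parts.append(f"--sref {self.sref}")
        if self.cref:
            parts.append(f"--cref {self.cref}")
            if self.cw is not None:
                parts.append(f"--cw {self.cw}")
        return " ".join(parts)


brand_suffix = Params(ar="16:9", raw=True, stylize=50).to_suffix()
print(brand_suffix)
```

Reusing one Params value across a series is a simple way to keep the look fixed while only the descriptive sentence changes.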
Recommended workflow
Explore for inspiration -> grab 1-2 sref URLs -> write subject + style + lighting + lens -> add --ar + --style raw -> generate 4 variations -> pick best, re-roll with 1 variable changed -> upscale final. Budget ~15 prompts per finished image while you’re learning.
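The "re-roll with 1 variable changed" part of this workflow can be sketched in code. The helper names (reroll, render) and the example values are hypothetical; the point is only that each new prompt differs from the previous one in exactly one part.

```python
def reroll(base, **change):
    """Return a copy of the prompt parts with exactly one part replaced."""
    if len(change) != 1:
        raise ValueError("change exactly one variable per re-roll")
    (key, value), = change.items()
    if key not in base:
        raise KeyError(f"unknown prompt part: {key}")
    # Dict order is preserved, so the replaced part keeps its position.
    return {**base, key: value}


def render(parts, suffix="--ar 16:9 --style raw"):
    """Format prompt parts plus a parameter suffix as one prompt string."""
    return ", ".join(parts.values()) + " " + suffix


base = {
    "subject": "bookstore at golden hour",
    "style": "vivid 35mm photographic",
    "lighting": "warm window light",
    "lens": "shallow depth of field",
}

# Results too saturated: swap only the style wording, leave everything else alone.
second_try = reroll(base, style="muted palette 35mm photographic")
print(render(base))
print(render(second_try))
```

Because only one part moves between attempts, any difference in the results can be attributed to that change.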
FAQ
- Discord vs web app? - Web app, unless you specifically want the community feed energy. Web is faster and history is searchable.
- What’s the difference between --style raw and no style flag? - Without --style raw, Midjourney applies a house aesthetic (warm, slightly painterly, dramatic light). With it, output is closer to a photograph or whatever you literally described.
- Why do my images all look “Midjourney”? - Lower --stylize, add --style raw, and reference a specific photographer or director in the prompt (e.g. “in the style of Wes Anderson framing”).
- How do I get consistent characters? - Use --cref with the URL of your best output. Don’t expect perfection in 2026 - face shape sticks, fine details drift.
- Is there an API? - Official API is limited; for production pipelines, look at the V6/V7 web interface or third-party wrappers.
- Can it do text in images? - Mediocre - 1-3 words usually work, longer text mangles. For posters with copy, generate the visual then add text in Figma.
Common mistakes
- Stacking 10 style words (“cinematic moody atmospheric dramatic ethereal…”) - the model picks two and ignores the rest.
- Skipping --ar - default square ruins composition for any hero/banner use.
- Trying to over-control with comma-separated lists - write a sentence, the parser handles intent better.
- Upscaling early - locks in a composition before you’ve explored.
- Ignoring --sref - it’s the single biggest control lever for brand consistency.
- Treating every output as final - the workflow is “generate 16, pick 1,” not “generate 1, hope.”
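A few of these mistakes can be caught mechanically before a prompt is submitted. The checker below is a rough sketch under stated assumptions: lint_prompt is a hypothetical helper, STYLE_WORDS is an illustrative list rather than anything Midjourney defines, and the limit of two style words simply mirrors the "picks two" remark above.

```python
import re

# Illustrative list only; extend it with the style words you tend to overuse.
STYLE_WORDS = {"cinematic", "moody", "atmospheric", "dramatic", "ethereal"}
MAX_STYLE_WORDS = 2  # assumption taken from the "picks two" remark in this guide


def lint_prompt(prompt):
    """Return a list of warnings for the mistakes listed above."""
    warnings = []
    tokens = prompt.split()

    # Count stacked style words in the descriptive text.
    words = re.findall(r"[a-z]+", prompt.lower())
    stacked = [w for w in words if w in STYLE_WORDS]
    if len(stacked) > MAX_STYLE_WORDS:
        warnings.append(
            f"{len(stacked)} style words stacked; keep at most {MAX_STYLE_WORDS}"
        )

    if "--ar" not in tokens:
        warnings.append("no --ar set; output defaults to square")
    if "--sref" not in tokens:
        warnings.append("no --sref set; consider a style reference for consistency")
    return warnings


for message in lint_prompt("cinematic moody atmospheric dramatic ethereal castle"):
    print(message)
```

A check like this cannot judge composition or taste; it only flags the omissions that are easy to detect from the text.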
Tags: #Tutorial #Midjourney #Image generation #Getting started