You added “do not be generic” to your prompt because the previous output was generic. The new output avoided the literal word “generic” and used “broadly applicable” instead. It is still generic. You added “do not write a wall of text”, and the model produced bullet points containing the same wall of text broken into 8 lines.
Negative constraints alone do not work because they describe what not to do but leave the positive target unspecified. The model dodges the literal phrase and reproduces the underlying behavior in different wording. The fix is not more “do not” rules; it is pairing every “do not” with a concrete “do”.
This page walks through why negative-only constraints fail and how to convert them into actionable positive guidance the model can actually follow.
Common causes
1. The negative names a value, not a behavior
“Do not be generic” is a value judgment. “Generic” is in the eye of the reviewer, not in any specific output trait. The model has nothing to check against.
How to spot it: your negative is an adjective or value word, not a specific token / structure / pattern.
2. No positive target
If you only forbid, the model has no direction to head toward. It picks the next most likely thing, which is often the next-most-generic thing.
How to spot it: your prompt has “do not X” with no paired “do Y”.
3. The model can rephrase to dodge
If you ban a word, the model uses a synonym. If you ban a pattern, it uses a slight variant. Surface bans without behavioral bans get gamed.
How to spot it: the forbidden phrases are gone, but the underlying behavior is unchanged.
4. Banned list is too long
A list of 20 forbidden words dilutes attention, and the model tends to treat it as ambient noise. Short, specific lists are followed; long lists often are not.
How to spot it: your “do not” list has 15+ items.
5. Bans contradict other prompt content
“Do not use jargon” on a technical task with no glossary forces the model to choose between being inaccurate and breaking a rule. It usually breaks the rule silently.
How to spot it: the ban is infeasible given the task.
Before you change anything
- List every “do not” in your prompt.
- For each, identify what behavior you actually want instead (the paired “do”).
- Check whether each ban is specific (testable) or vague (interpretive).
- Decide which bans are worth keeping vs cutting.
- Plan a positive anchor (example, schema, or rule) for each surviving ban.
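The first two checklist items can be partly automated. The sketch below is one possible audit, not a definitive one: it assumes negatives are phrased with “do not”, “don't”, “never”, or “avoid”, and that a paired positive sits on the next non-empty line and starts with “Do:”. The function name audit_negatives is hypothetical.

```python
import re

# Assumption: negatives use one of these phrasings.
NEGATIVE = re.compile(r"\b(?:do not|don['’]t|never|avoid)\b", re.IGNORECASE)
# Assumption: a paired positive is a line that starts with "Do:".
POSITIVE = re.compile(r"^\s*do:", re.IGNORECASE)


def audit_negatives(prompt: str) -> list[dict]:
    """Return each negative line and whether the next non-empty line is a 'Do:'."""
    lines = [ln.strip() for ln in prompt.splitlines() if ln.strip()]
    report = []
    for i, line in enumerate(lines):
        # Skip "Do:" lines themselves and lines without a negative phrase.
        if POSITIVE.match(line) or not NEGATIVE.search(line):
            continue
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        report.append({"ban": line, "paired": bool(POSITIVE.match(nxt))})
    return report


if __name__ == "__main__":
    prompt = (
        "Do not be generic.\n"
        "Do: include at least 2 specific numbers and 1 named tool per paragraph.\n"
        "Do not write a wall of text.\n"
    )
    for row in audit_negatives(prompt):
        print(row)
```

Any row with paired set to False is a ban that still needs a positive target. Whether a ban is specific or vague remains a human judgment.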
Information to collect
- Full list of negative constraints in current prompt.
- Output that dodged the bans.
- The behavior you actually wanted.
- Whether the model rephrased to dodge or genuinely missed the ban.
- Model and temperature.
Shortest path to fix
Step 1: Pair every “do not” with a “do”
Bad: "Do not be generic."
Good: "Do not be generic.
Do: include at least 2 specific numbers and 1 named tool per paragraph."
Bad: "Do not write a wall of text."
Good: "Do not write a wall of text.
Do: max 4 sentences, each under 20 words. Use a numbered list."
The “do” gives the model a target. The “do not” is now redundant — it follows from the “do”.
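If you assemble prompts in code, the pairing rule can be enforced by construction. This is a minimal sketch under that assumption; the Constraint class and render_constraints function are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    dont: str  # the behavior to avoid, e.g. "be generic"
    do: str    # the measurable target that replaces it


def render_constraints(constraints: list[Constraint]) -> str:
    """Render each ban with its positive target; reject bans that have none."""
    blocks = []
    for c in constraints:
        if not c.do.strip():
            raise ValueError(f"No paired 'do' for: {c.dont!r}")
        blocks.append(f"Do not {c.dont}.\nDo: {c.do}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_constraints([
        Constraint("be generic",
                   "include at least 2 specific numbers and 1 named tool per paragraph."),
        Constraint("write a wall of text",
                   "max 4 sentences, each under 20 words. Use a numbered list."),
    ]))
```

Raising on an empty “do” makes an unpaired ban a build-time error instead of something you notice in the output.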
Step 2: Convert vague negatives to specific bans
Bad: "Do not use corporate language."
Good: "Banned words and phrases: leverage, utilize, synergize, going forward,
at the end of the day, holistic, robust, scalable."
Specific token bans are enforceable because you can check the output for each term. Vague vibe bans are not.
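A specific list can be checked mechanically. The sketch below scans output for the banned terms from the example above. The suffix handling (so that “leveraged” counts as “leverage”) is an assumption added for illustration, and a string check like this only catches surface matches, not a synonym that preserves the same behavior.

```python
import re

BANNED = ["leverage", "utilize", "synergize", "going forward",
          "at the end of the day", "holistic", "robust", "scalable"]


def find_banned(text: str, banned: list[str] = BANNED) -> list[str]:
    """Return the banned terms found in text, in list order."""
    hits = []
    for term in banned:
        # Assumption: also match simple inflections such as "leveraged"
        # or "utilizing" by dropping a final "e" and allowing a suffix.
        stem = term[:-1] if term.endswith("e") else term
        pattern = rf"\b{re.escape(stem)}(?:e|es|ed|ing|s)?\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


if __name__ == "__main__":
    draft = "We leveraged a robust payment solution to optimize our payout workflow."
    print(find_banned(draft))
```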
Step 3: Limit the banned list
Cap at 5-10 highly specific items. Anything longer dilutes attention. If you have more, split into separate prompts or use a multi-pass workflow.
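For a multi-pass workflow, a long list can be chunked so that each pass carries only a few terms. A minimal sketch; the function name and the default cap of 10 are illustrative choices based on the range above.

```python
def split_into_passes(banned: list[str], max_items: int = 10) -> list[list[str]]:
    """Chunk a long banned list so each pass carries at most max_items terms."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [banned[i:i + max_items] for i in range(0, len(banned), max_items)]


if __name__ == "__main__":
    terms = [f"term{i}" for i in range(23)]
    for n, chunk in enumerate(split_into_passes(terms), start=1):
        print(f"pass {n}: {len(chunk)} terms")
```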
Step 4: Provide a positive example
The strongest replacement for “do not” is to show the kind of output you do want:
Like this:
"We deployed Stripe Connect to handle marketplace payouts. Daily volume:
$42k, settlement time: T+2 days. Replaced our previous PayPal integration
which had 18% chargeback handling friction."
Not like this:
"We leveraged a robust payment solution to optimize our payout workflow."
The contrast makes both directions clear.
Step 5: Add a self-check at the end
After writing, check:
- Did you use any banned words? List them.
- Did each paragraph include at least 2 specific numbers and 1 named tool?
- If any check fails, rewrite.
This catches many dodges by forcing the model to audit its own output.
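The positive half of the self-check can also be run outside the model, which helps when the model's own audit is unreliable. The sketch below makes two assumptions: a “specific number” is any run of digits, and a “named tool” is any entry from a list you supply. The function name check_paragraphs is hypothetical.

```python
import re

# Assumption: a "specific number" is any digit run, with optional decimals.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def check_paragraphs(text: str, tools: list[str], min_numbers: int = 2) -> list[dict]:
    """Check each paragraph for min_numbers numbers and at least one named tool."""
    results = []
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for p in paragraphs:
        count = len(NUMBER.findall(p))
        named = [t for t in tools if t.lower() in p.lower()]
        results.append({
            "numbers": count,
            "tools": named,
            "ok": count >= min_numbers and bool(named),
        })
    return results


if __name__ == "__main__":
    good = ("We deployed Stripe Connect to handle marketplace payouts. "
            "Daily volume: $42k, settlement time: T+2 days.")
    bad = "We leveraged a robust payment solution to optimize our payout workflow."
    for row in check_paragraphs(good + "\n\n" + bad, ["Stripe Connect", "PayPal"]):
        print(row)
```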
Step 6: Lock recurring bans into project / system prompt
If “no corporate jargon” is a permanent rule for all your work, move the list to a project instruction so it does not eat space in every prompt and so it survives recency drift.
How to confirm the fix
- The new output contains no banned words.
- The new output also contains the specific positives you required.
- Re-prompting with a deliberately bad input does not regress to the old behavior.
- A teammate looking at the output cannot tell which bans you used; they just see good output.
If it still fails
- The “do” pairing may be too soft — make the positive measurable.
- Add 1-2 more pass/fail examples.
- Switch to structured output (JSON, schema) — structure makes some bans unnecessary.
- For complex bans, ask the model to plan its output first, then write it; you can catch issues at the plan stage.
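As an illustration of the structured-output option: if the model must return a JSON object with a named tool and a numeric value, a vague sentence has nowhere to go. The field names below (tool, metric, value) are hypothetical, and the validation uses only the standard library.

```python
import json

# Hypothetical schema for illustration: field name -> accepted type(s).
REQUIRED = {"tool": str, "metric": str, "value": (int, float)}


def validate_claim(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output fits the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]
    problems = []
    for field, expected in REQUIRED.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            problems.append(f"wrong type for {field}")
    return problems


if __name__ == "__main__":
    print(validate_claim('{"tool": "Stripe Connect", "metric": "daily volume (USD)", "value": 42000}'))
    print(validate_claim('{"tool": "a robust solution"}'))
```

A schema only guarantees that the fields exist and have the right type. Whether the tool name is real and the number is meaningful still needs review.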
Prevention
- Default rule: never write a “do not” without a paired “do”.
- Maintain a short, stable banned list per workflow (5-10 items max).
- Use examples as the positive anchor whenever a ban is interpretive.
- Audit prompts quarterly for accumulated “do nots” with no paired positives.
- For team workflows, agree on the banned list as a config file, not as ad-hoc message text.
- Test bans by writing a deliberately bad output yourself; if the model produces something that looks like your bad output, your bans missed.
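A shared banned list can live in a small config file that every prompt builder loads. The sketch below assumes a JSON layout with a single “banned” key, which is a hypothetical format, and enforces the 10-item ceiling suggested above at load time.

```python
import json

MAX_ITEMS = 10  # upper end of the 5-10 range suggested above


def load_banned(config_text: str, max_items: int = MAX_ITEMS) -> list[str]:
    """Parse {"banned": [...]}, normalize the terms, and enforce the size cap."""
    raw = json.loads(config_text)["banned"]
    # Normalize case and whitespace, drop blanks and duplicates, keep file order.
    terms = list(dict.fromkeys(t.strip().lower() for t in raw if t.strip()))
    if len(terms) > max_items:
        raise ValueError(f"{len(terms)} banned terms; cap is {max_items}")
    return terms


if __name__ == "__main__":
    config = '{"banned": ["leverage", "utilize", "synergize", "holistic", "robust"]}'
    print(load_banned(config))
```

Failing at load time keeps the list from growing quietly as teammates add terms.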
Related reading
- Prompt emotional wording
- Output sounds polished but is not actionable
- Ambiguous evaluation criteria
- Missing examples output drift
- No success criteria