Negative Constraints Are Too Vague

"Do not be generic" tells the model what not to do without telling it what to do, so it dodges the word and keeps the behavior. Pair every 'do not' with a concrete 'do'.

Published: May 20, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

You added “do not be generic” because the previous output was generic. The new output avoided the literal word “generic” and used “broadly applicable” instead. It is still generic. You added “do not write a wall of text” and the model produced bullet points containing the same wall of text broken into 8 lines.

Fastest fix: delete the bare “do not” and replace it with a measurable “do”. “Do not be generic” becomes “Each paragraph must contain at least 2 specific numbers and 1 named tool.” The positive instruction gives the model a target it can hit and check; the negation gave it only a word to dodge.

Negative-only constraints fail for a mechanical reason, not a willpower one. To follow “do not write X”, the model first has to represent X, which tends to make X more likely to appear rather than less. Prompt engineers call this the Pink Elephant Problem, after the ironic process effect that makes “don’t think of a pink elephant” backfire on people. Anthropic’s own prompting guidance states it plainly: tell the model what to do instead of what not to do (Claude prompting best practices, as of June 2026). Empirical 2026 work on negative-constraint failures found that the forbidden word reappears in the large majority of violations, precisely because naming it primes it, and that a negative instruction suppresses the target token far more weakly than an equivalent positive instruction steers toward a wanted one.

This page walks through why negative-only constraints fail and how to convert each one into positive guidance the model can actually follow and verify.

Which bucket are you in

Symptom in the output	Likely cause	Go to
Banned word gone, same vibe returns	Negative names a value, not a behavior	Step 1
Output is bland with no clear wrong-thing	No positive target to head toward	Step 1, Step 4
Synonym or slight variant of the banned thing	Surface ban, no behavioral anchor	Step 2, Step 4
Bans seem ignored entirely	Banned list too long, treated as noise	Step 3
Model silently breaks the rule	Ban contradicts the task itself	Common causes, #5

Common causes

1. The negative names a value, not a behavior

“Do not be generic” is a value judgment. “Generic” lives in the reviewer’s head, not in any specific output trait the model can measure. The model has nothing to check itself against, so it strips the literal word and ships the same content.

How to spot it: your negative is an adjective or value word (generic, boring, corporate, unprofessional), not a specific token, structure, or pattern.

2. No positive target

If you only forbid, the model has no direction to head toward. It falls back to the next most probable continuation, which for a writing task is usually the next-most-generic thing.

How to spot it: your prompt has a “do not X” with no paired “do Y”.

3. The model can rephrase to dodge

Ban a word and the model uses a synonym. Ban a pattern and it uses a slight variant. A surface ban with no behavioral anchor gets gamed.

How to spot it: the forbidden phrases are gone, but the underlying behavior is unchanged.

4. Banned list is too long

A 20-item “do not” list dilutes attention; the model treats it as ambient noise and follows almost none of it. Short, specific lists get honored; long lists do not.

How to spot it: your “do not” list has 15+ items.

5. Bans contradict other prompt content

“Do not use jargon” combined with a technical task and no glossary forces the model to choose between being inaccurate and breaking the rule. It usually breaks the rule silently because accuracy wins the tiebreak.

How to spot it: the ban is infeasible given the task. Fix the conflict (allow a defined glossary, or relax the ban) rather than restating it louder.

Before you change anything

List every “do not” in your prompt.
For each, write down what behavior you actually want instead (the paired “do”).
Mark each ban as specific (testable) or vague (interpretive).
Decide which bans are worth keeping vs cutting.
Plan a positive anchor (example, schema, or rule) for each surviving ban.
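The first and third items on this checklist can be roughed out with a small script. This is a minimal sketch, not a real classifier: the negation cues and the `audit_negatives` helper are assumptions made for illustration, and the value-word list reuses the examples from cause 1. A ban without a value word is only labeled “likely specific” and still needs a human look.

```python
import re

# Cues that usually introduce a ban. This list is an assumption; extend it for your prompts.
NEGATION = re.compile(r"\b(?:do not|don['’]t|never|avoid)\b", re.IGNORECASE)

# Value words reused from cause 1; a ban built on one of these is interpretive.
VALUE_WORDS = {"generic", "boring", "corporate", "unprofessional"}


def audit_negatives(prompt: str) -> list[tuple[str, str]]:
    """Return (sentence, label) for every sentence that contains a negation cue."""
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    results = []
    for sentence in sentences:
        if not NEGATION.search(sentence):
            continue
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        # Heuristic only: a value word marks the ban as vague.
        label = "vague" if words & VALUE_WORDS else "likely specific"
        results.append((sentence, label))
    return results


prompt = (
    "Write a launch summary. Do not be generic. "
    "Never use the word leverage. Keep it under 120 words."
)
audit = audit_negatives(prompt)
for sentence, label in audit:
    print(f"[{label}] {sentence}")
```

Every sentence labeled “vague” is a candidate for Step 1 or Step 2 below.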

Information to collect

Full list of negative constraints in the current prompt.
The output that dodged the bans.
The behavior you actually wanted.
Whether the model rephrased to dodge or genuinely missed the ban.
Model and temperature (a high temperature widens the dodge space).

Shortest path to fix

Step 1: Pair every “do not” with a “do”

Bad:  Do not be generic.
Good: Each paragraph includes at least 2 specific numbers and 1 named tool.

Bad:  Do not write a wall of text.
Good: Max 4 sentences, each under 20 words. Use a numbered list.

State the “do” and you can usually drop the “do not” entirely, since it follows from the positive rule. If you keep both, lead with the “do”.
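One way to tell whether a “do” is measurable is to ask whether a short script could check it. The sketch below checks the “Max 4 sentences, each under 20 words” rule; the function name and the naive sentence splitter are assumptions for illustration, and the sample drafts are made up.

```python
import re


def check_length_rule(text: str, max_sentences: int = 4, max_words: int = 20) -> list[str]:
    """Return the violations of 'max N sentences, each under M words' (empty list = pass)."""
    # Naive split on ., ! or ? followed by whitespace; abbreviations will confuse it.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    problems = []
    if len(sentences) > max_sentences:
        problems.append(f"{len(sentences)} sentences, limit is {max_sentences}")
    for i, sentence in enumerate(sentences, start=1):
        count = len(sentence.split())
        if count >= max_words:
            problems.append(f"sentence {i} has {count} words, must be under {max_words}")
    return problems


ok_draft = "We cut build time from 14 minutes to 6. The fix was caching Docker layers in CI."
long_draft = " ".join(["This sentence is short."] * 5)

print(check_length_rule(ok_draft))
print(check_length_rule(long_draft))
```

“Do not write a wall of text” has no equivalent check, which is a sign it was never a usable instruction.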

Step 2: Convert vague negatives to specific bans

Bad:  Do not use corporate language.
Good: Banned words: leverage, utilize, synergize, going forward,
      at the end of the day, holistic, robust, scalable.

Specific token bans are enforceable because the model can scan for them. Vague vibe bans are not.
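The same scan is easy to run on your side of the conversation. A minimal sketch, assuming a plain whole-word match is good enough; `find_banned` is an illustrative helper, and the list is the one from the example above.

```python
import re

BANNED = [
    "leverage", "utilize", "synergize", "going forward",
    "at the end of the day", "holistic", "robust", "scalable",
]


def find_banned(text: str, banned: list[str] = BANNED) -> list[str]:
    """Return the banned entries that appear in text as whole words or phrases."""
    hits = []
    for entry in banned:
        # \b keeps "robust" from matching inside "robustness".
        pattern = r"\b" + re.escape(entry) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(entry)
    return hits


draft = "We adopted a robust payment solution to improve our payout workflow."
print(find_banned(draft))
```

Whole-word matching does not catch inflected forms such as “leveraged” or “utilizing”. Add those variants to the list if they matter, since a variant is exactly the dodge described in cause 3.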

Step 3: Limit the banned list

Cap it at 5-10 highly specific items. Anything longer dilutes attention. If you have more, split into separate prompts or run a multi-pass workflow (draft, then a dedicated edit pass against the full list).
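For the multi-pass route, the full list can be split into batches that each stay within the cap, with one edit pass per batch. The sketch below only builds the prompts; the helper names and the prompt wording are assumptions, and the call to your model is left out.

```python
def batch_bans(bans: list[str], batch_size: int = 10) -> list[list[str]]:
    """Split a long banned list into batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [bans[i:i + batch_size] for i in range(0, len(bans), batch_size)]


def edit_pass_prompt(draft: str, batch: list[str]) -> str:
    """Build the prompt for one edit pass; the wording is only an illustration."""
    return (
        "Banned words: " + ", ".join(batch) + ".\n"
        "Rewrite the draft so none of them appear. "
        "Keep every number and named tool unchanged.\n\n"
        "Draft:\n" + draft
    )


full_list = [f"word{i}" for i in range(1, 24)]  # 23 placeholder entries
batches = batch_bans(full_list)
print([len(b) for b in batches])
print(edit_pass_prompt("We leverage robust tooling.", batches[0]))
```

In a real workflow each pass would take the previous pass's output as its draft, so the passes run in sequence rather than in parallel.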

Step 4: Provide a positive example

The strongest replacement for “do not” is to show the kind of output you do want, contrasted with what you do not:

Like this:
We deployed Stripe Connect to handle marketplace payouts. Daily volume:
$42k, settlement time T+2 days. Replaced our previous PayPal integration,
which had 18% chargeback-handling friction.

Not like this:
We adopted a robust payment solution to improve our payout workflow.

The contrast makes both directions concrete. A single before/after pair routinely outperforms a paragraph of rules.
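If you reuse the same pair across prompts, a tiny template keeps the layout consistent. A sketch under the assumption that the “Like this / Not like this” layout above is the one you want; `with_contrast_pair` and the task text are hypothetical.

```python
def with_contrast_pair(task: str, good: str, bad: str) -> str:
    """Append a wanted example and an unwanted example to a task description."""
    return (
        f"{task}\n\n"
        f"Like this:\n{good}\n\n"
        f"Not like this:\n{bad}\n"
    )


good = (
    "We deployed Stripe Connect to handle marketplace payouts. Daily volume: "
    "$42k, settlement time T+2 days. Replaced our previous PayPal integration, "
    "which had 18% chargeback-handling friction."
)
bad = "We adopted a robust payment solution to improve our payout workflow."

prompt = with_contrast_pair("Summarize the payments migration in one paragraph.", good, bad)
print(prompt)
```

The wanted example goes first so the model meets the target before it meets the thing to avoid.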

Step 5: Add a self-check at the end

After writing, check:
- Did you use any banned word? List them.
- Did each paragraph include at least 2 specific numbers and 1 named tool?
- If any check fails, rewrite that part before returning the answer.

This catches dodges by forcing the model to audit its own draft against the positive criteria, not just the negation.
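Generating the self-check from the same lists that drive the rest of the prompt keeps the two from drifting apart. A minimal sketch; `self_check_block` and its wording are assumptions modeled on the check shown above.

```python
def self_check_block(banned: list[str], positive_rules: list[str]) -> str:
    """Render an end-of-prompt self-check from the banned list and the positive rules."""
    lines = ["After writing, check:"]
    lines.append("- Did you use any of these banned words: " + ", ".join(banned) + "? List them.")
    for rule in positive_rules:
        lines.append(f"- Does the draft satisfy this rule: {rule}")
    lines.append("- If any check fails, rewrite that part before returning the answer.")
    return "\n".join(lines)


block = self_check_block(
    banned=["leverage", "utilize"],
    positive_rules=["Each paragraph includes at least 2 specific numbers and 1 named tool."],
)
print(block)
```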

Step 6: Lock recurring bans into the project / system prompt

If “no corporate jargon” is a permanent rule for all your work, move the list into a project instruction or system prompt (Claude Projects, a ChatGPT custom GPT, or a Cursor project rule). It then stops eating space in every message and is less exposed to recency drift, where instructions given early lose force as a conversation grows.

How to confirm the fix

The new output contains none of the banned tokens (Ctrl+F each one).
The new output also contains the specific positives you required (the numbers, the named tools, the length cap).
Re-running with a deliberately weak input does not regress to the old behavior.
A teammate reading the output cannot tell which bans you used; they just see good output.
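The first two checks can be automated so you do not have to Ctrl+F by hand. This is a rough sketch: `confirm_fix` is a hypothetical helper, the tool list is something you would maintain yourself, and counting digit runs is only a proxy for “specific numbers”.

```python
import re


def confirm_fix(output: str, banned: list[str], tools: list[str], min_numbers: int = 2) -> dict:
    """Report banned tokens found and paragraphs missing the required positives."""
    banned_found = [
        b for b in banned
        if re.search(r"\b" + re.escape(b) + r"\b", output, flags=re.IGNORECASE)
    ]
    weak_paragraphs = []
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", output.strip()) if p.strip()]
    for index, paragraph in enumerate(paragraphs, start=1):
        numbers = re.findall(r"\d+(?:[.,]\d+)*", paragraph)
        has_tool = any(t.lower() in paragraph.lower() for t in tools)
        if len(numbers) < min_numbers or not has_tool:
            weak_paragraphs.append(index)
    return {"banned_found": banned_found, "weak_paragraphs": weak_paragraphs}


good = (
    "We deployed Stripe Connect to handle marketplace payouts. Daily volume: "
    "$42k, settlement time T+2 days. Replaced our previous PayPal integration, "
    "which had 18% chargeback-handling friction."
)
bad = "We adopted a robust payment solution to improve our payout workflow."

report = confirm_fix(
    good + "\n\n" + bad,
    banned=["leverage", "robust"],
    tools=["Stripe Connect", "PayPal"],
)
print(report)
```

An empty `banned_found` together with an empty `weak_paragraphs` covers the first two checks. The weak-input rerun and the teammate read still need a person.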

If it still fails

The “do” pairing is too soft. Make the positive measurable (a count, a length, a named element), not another adjective.
Add 1-2 more pass/fail examples covering the exact dodge you saw.
Switch to structured output (JSON or a schema). Structure makes some bans unnecessary, since a field that demands a number cannot hold filler.
For complex bans, ask the model to plan its output first, then write it. You can catch the dodge at the plan stage before any prose exists.
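To illustrate the structured-output option, the sketch below asks for JSON and rejects any reply whose numeric fields are not numbers. The `PayoutSummary` schema and its field names are invented for this example; swap in whatever fields your task needs.

```python
import json
from dataclasses import dataclass


@dataclass
class PayoutSummary:
    tool: str
    daily_volume_usd: float
    settlement_days: int


def parse_summary(raw: str) -> PayoutSummary:
    """Parse a JSON reply and reject filler where a number is required."""
    data = json.loads(raw)
    tool = data["tool"]
    volume = data["daily_volume_usd"]
    days = data["settlement_days"]
    if not isinstance(tool, str) or not tool.strip():
        raise ValueError("tool must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(volume, bool) or not isinstance(volume, (int, float)):
        raise ValueError("daily_volume_usd must be a number")
    if isinstance(days, bool) or not isinstance(days, int):
        raise ValueError("settlement_days must be an integer")
    return PayoutSummary(tool=tool, daily_volume_usd=float(volume), settlement_days=days)


reply = '{"tool": "Stripe Connect", "daily_volume_usd": 42000, "settlement_days": 2}'
summary = parse_summary(reply)
print(summary)
```

A reply such as `"daily_volume_usd": "significant"` raises `ValueError`, and a missing field raises `KeyError`, so a vague answer fails loudly instead of slipping through as prose.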

FAQ

Why does the model ignore “do not” but follow “do”? To obey “do not X”, the model must first represent X, which raises that token’s probability instead of lowering it (the Pink Elephant Problem). A positive instruction points at the target directly, so there is nothing to suppress.

Should I ever use negative constraints at all? Yes, for surgical, literal bans where the unwanted thing is one specific token or pattern, for example “Banned words: leverage, utilize”. Keep them short and pair them with a positive rule. The failure mode is vague negatives (“do not be generic”), not all negatives.

Can I just stack more “do nots” to be safe? No. A long ban list dilutes attention and the model treats it as noise. Cap it near 5-10 items and lean on positive examples for everything interpretive.

The banned word is gone but the writing is still bland. Why? Removing a word is not the same as adding substance. You banned a symptom without specifying the cure. Add a positive requirement (specific numbers, named tools, a concrete example) so the model has somewhere good to go.

Does this differ between GPT-5.5, Claude, and Gemini? Not in a way that changes the fix. This is a property of how language models process negation, not a quirk of one vendor, so the convert-to-positive fix applies across GPT-5.5, Claude Opus 4.7 / Sonnet 4.6, and Gemini 3.1 Pro as of June 2026.

Prevention

Default rule: never write a “do not” without a paired “do”.
Maintain a short, stable banned list per workflow (5-10 items max).
Use a before/after example as the positive anchor whenever a ban is interpretive.
Audit prompts quarterly for accumulated “do nots” with no paired positive.
For team workflows, agree on the banned list as a config file, not as ad-hoc message text.
Test bans by writing a deliberately bad output yourself. If the model’s output still resembles your bad example, your bans missed.
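A shared config can also enforce the 5-10 cap at load time. The layout below (one entry per workflow with a `banned_words` key) is a hypothetical example, not a standard format, and the JSON is inlined only to keep the sketch self-contained; in practice it would live in a file under version control.

```python
import json

MAX_BANS = 10  # upper end of the 5-10 range suggested above

# Hypothetical config contents.
CONFIG_TEXT = """
{
  "marketing_copy": {
    "banned_words": ["leverage", "utilize", "synergize", "holistic", "robust"]
  }
}
"""


def load_banned_list(config_text: str, workflow: str) -> list[str]:
    """Return the banned list for one workflow, refusing lists over the cap."""
    config = json.loads(config_text)
    bans = config[workflow]["banned_words"]
    if len(bans) > MAX_BANS:
        raise ValueError(f"{workflow} has {len(bans)} bans, cap is {MAX_BANS}")
    return bans


bans = load_banned_list(CONFIG_TEXT, "marketing_copy")
print(bans)
```

Failing the load on an oversized list turns the quarterly audit into something that happens on every change.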

Tags: #Troubleshooting #Prompt #Prompt quality #Prompt engineering