You wrote 12 careful sentences setting up rules, then ended with “oh, and keep it casual.” The output ignored 10 of the 12 rules and went full casual. The careful rules said “use formal English, cite sources, return JSON.” The casual aside was a throwaway. The model picked the throwaway because it was last. Recency bias in language models is real and load-bearing: text at the end of the prompt tends to get disproportionate weight, especially when you have not flagged the earlier rules as hard constraints. Anything you write last reads as “and the final word is…”
This page walks through why the last sentence wins and how to keep the rules at the top from being overwritten by an off-hand closing line.
Common causes
1. The aside is the last thing before the deliverable
Whatever is closest to “now write the answer” gets weighted most. If you ended with “btw, keep it short”, short wins over every earlier rule.
How to spot it: remove the last sentence and the output reverts to obeying earlier rules.
2. Earlier rules were never flagged as hard
If your rules were phrased as suggestions (“It would be good if you used JSON”) instead of imperatives (“MUST return valid JSON”), the model treats them as preferences. A late “just give me a quick summary” overrides preferences easily.
How to spot it: your rules use soft phrasing such as “should”, “would be nice”, or “try to”.
3. Long prompt diffuses early attention
In a 2000-word prompt, the first 200 words are far from the point where generation starts, while the last 200 sit right next to it. Nearby text tends to get more weight than distant text.
How to spot it: same rules work in a short prompt, fail in a long one.
4. Conversation history adds recency
In a chat thread, your latest message is the text closest to the response. Earlier turns (even with strict rules) get outweighed by the latest casual message.
How to spot it: rules from turn 1 are followed in turn 2, ignored by turn 8.
5. No re-anchor at the end of the prompt
Even careful prompts often end with “Now write the answer.” There is no last reminder of the rules, so recency bias works against the rules instead of for them.
How to spot it: your final paragraph is the task, not the rules.
Before you change anything
- Read your prompt from the bottom up. The last 3 sentences are the ones the model tends to weight most.
- Identify which sentences are hard rules vs preferences.
- Try deleting the last sentence and re-running. If output improves, you found the override.
- Try reordering the prompt with the same content. Order alone often fixes it.
- For chat threads, check whether your latest message buried the rules from earlier turns.
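The delete-and-reorder checks above can be prepared mechanically. What follows is a minimal sketch, not a definitive tool: the helper names `split_sentences` and `ablation_variants` are hypothetical, the sentence splitter is deliberately naive, and sending each variant to a model is left to whatever client you already use.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def ablation_variants(prompt: str) -> dict[str, str]:
    """Build the prompt variants used to test for a last-sentence override."""
    sentences = split_sentences(prompt)
    return {
        "original": " ".join(sentences),
        # If output improves with this variant, the last sentence was the override.
        "no_last_sentence": " ".join(sentences[:-1]),
        # Same content, different order: the former last sentence now comes first.
        "last_sentence_first": " ".join(sentences[-1:] + sentences[:-1]),
    }


prompt = "Return valid JSON. Cite a source for each claim. Oh, and keep it casual."
variants = ablation_variants(prompt)
for name, text in variants.items():
    print(f"{name}: {text}")
```

Run each variant through the same model and compare the outputs side by side; a splitter this simple will mis-split on abbreviations such as "e.g.", so check the variants by eye before relying on them.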
Information to collect
- Full prompt text in order.
- The output that ignored the rules.
- Output you get if you delete the last sentence.
- Output you get if you move the rules to the end.
- For chat: full conversation history.
Shortest path to fix
Step 1: End the prompt with the hard rules, not the deliverable
Sandwich pattern:
[Top]
NON-NEGOTIABLE RULES:
- Return valid JSON
- Field "summary" must be under 50 words
- Cite source for each claim
[Middle: context, examples, etc.]
[Bottom — restate before the deliverable]
Reminder of hard rules: valid JSON, summary < 50 words, sources cited.
Now produce the output.
The model attends most to the end, so put the rules there too.
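If you assemble prompts in code, the sandwich pattern can be built by a small helper so the bottom reminder is never forgotten. This is a sketch under assumptions: `build_sandwich_prompt` is a hypothetical name, and the context string in the usage is a placeholder.

```python
def build_sandwich_prompt(rules: list[str], context: str, task: str) -> str:
    """Place the hard rules at the top and restate them just before the task."""
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    reminder = "; ".join(rules)
    return (
        "NON-NEGOTIABLE RULES:\n"
        f"{rule_lines}\n\n"
        f"{context.strip()}\n\n"
        f"Reminder of hard rules: {reminder}.\n"
        f"{task.strip()}"
    )


sandwich = build_sandwich_prompt(
    rules=["valid JSON", "summary under 50 words", "sources cited"],
    context="Context: <your background material and examples>",  # placeholder
    task="Now produce the output.",
)
print(sandwich)
```

The design choice is that the rules are passed once and rendered twice, so the top and bottom copies cannot drift apart.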
Step 2: Convert soft modal verbs to MUST / DO NOT
Bad: "It would be good to keep this short."
Good: "MUST be under 100 words. DO NOT exceed."
Bad: "Try to use JSON."
Good: "Return only valid JSON. Any prose outside the JSON block is a violation."
The model tends to treat MUST / DO NOT as binding more reliably than soft phrasing such as “should” or “try to”.
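A quick way to find rules that still read as preferences is to scan the prompt for soft phrasing. The sketch below assumes a short phrase list taken from the examples on this page; `find_soft_rules` is a hypothetical helper, and the list is illustrative rather than exhaustive.

```python
import re

# Soft phrases drawn from the examples on this page; extend the list as needed.
SOFT_PATTERN = re.compile(r"\b(should|would be (?:good|nice)|try to)\b", re.IGNORECASE)


def find_soft_rules(prompt: str) -> list[tuple[int, str, str]]:
    """Return (line number, matched phrase, line text) for each soft phrase found."""
    findings = []
    for number, line in enumerate(prompt.splitlines(), start=1):
        for match in SOFT_PATTERN.finditer(line):
            findings.append((number, match.group(0).lower(), line.strip()))
    return findings


draft = (
    "It would be good to keep this short.\n"
    "MUST return valid JSON.\n"
    "Try to cite sources."
)
for number, phrase, text in find_soft_rules(draft):
    print(f"line {number}: '{phrase}' in: {text}")
```

Each hit is a candidate for rewriting as MUST / DO NOT, or for leaving alone if it really is a preference.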
Step 3: Drop late asides into structured slots
If you have a “btw” thought, do not append it. Edit it into the right structural slot:
Bad: [12 sentences of rules] ... oh and keep it casual.
Good: [Top]
VOICE: casual (contractions allowed, second-person preferred)
[12 sentences of rules]
[Bottom: restate hard rules + voice]
Step 4: For chat threads, re-anchor every few turns
In a long chat, repaste hard rules in your latest message:
(Continuing the task from turn 1. Rules: <restate the 3 hardest>.)
Now do: <new request>.
Or move the rules to a system prompt / project instruction where they survive recency drift entirely.
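For scripted chat sessions, the re-anchor can be applied automatically. This is a minimal sketch: `reanchor` is a hypothetical helper, and re-anchoring on every third turn is an assumed default, not a recommendation from measurement.

```python
def reanchor(request: str, hard_rules: list[str], turn: int, every: int = 3) -> str:
    """Prefix the request with the hardest rules on every `every`-th turn."""
    if turn % every != 0:
        return request
    rules = "; ".join(hard_rules[:3])  # restate only the 3 hardest rules
    return f"(Continuing the task from turn 1. Rules: {rules}.)\nNow do: {request}"


hard_rules = ["return valid JSON", "summary under 50 words", "cite a source for each claim"]
print(reanchor("Summarize section 2.", hard_rules, turn=2))
print(reanchor("Summarize section 3.", hard_rules, turn=3))
```

Order `hard_rules` from hardest to softest so that the slice keeps the ones that matter most.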
Step 5: Audit the last 3 sentences
Before sending any prompt, read the last 3 sentences. If they would mislead a stranger about what you want, rewrite them. The last 3 sentences do a disproportionate share of the steering.
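The audit is easy to script as a pre-send check. The sketch below pulls out the last 3 sentences and flags ones that open like an aside; `tail_report` is a hypothetical helper, the sentence splitting is naive, and the aside markers are limited to the two named on this page.

```python
import re

# Aside markers named on this page: "btw" and "oh, and".
ASIDE_PATTERN = re.compile(r"\b(btw|oh,? and)\b", re.IGNORECASE)


def tail_report(prompt: str, count: int = 3) -> list[tuple[str, bool]]:
    """Return the last `count` sentences, each with a flag for aside markers."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    return [(s, bool(ASIDE_PATTERN.search(s))) for s in sentences[-count:]]


draft_prompt = (
    "MUST return valid JSON. Cite a source for each claim. "
    "Keep the summary under 50 words. Oh, and keep it casual."
)
for sentence, is_aside in tail_report(draft_prompt):
    marker = "ASIDE?" if is_aside else "ok"
    print(f"[{marker}] {sentence}")
```

A flagged sentence is a prompt to move that thought into a structural slot rather than leaving it as the closing line.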
Step 6: For tools that allow it, end with a schema
A formal schema at the end is one of the strongest recency anchors available:
Output schema (return only this):
{
"summary": "<string, max 50 words>",
"sources": ["<url>", ...]
}
The schema tends to dominate because it is concrete and last.
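A schema in the prompt is easier to trust if the reply is also checked against it. The following is a sketch using only the standard library: `check_output` is a hypothetical helper, and it reads the schema's "max 50 words" as at most 50 whitespace-separated words.

```python
import json


def check_output(raw: str, max_words: int = 50) -> list[str]:
    """Return a list of problems; an empty list means the reply fits the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]

    problems = []
    summary = data.get("summary")
    if not isinstance(summary, str):
        problems.append("summary missing or not a string")
    elif len(summary.split()) > max_words:
        problems.append(f"summary exceeds {max_words} words")

    sources = data.get("sources")
    if not isinstance(sources, list) or not all(isinstance(s, str) for s in sources):
        problems.append("sources missing or not a list of strings")

    extra = sorted(set(data) - {"summary", "sources"})
    if extra:
        problems.append(f"unexpected fields: {extra}")
    return problems


good_reply = '{"summary": "Recency bias favors the last line.", "sources": ["https://example.com"]}'
bad_reply = "Sure! Here's a quick casual summary."
print(check_output(good_reply))
print(check_output(bad_reply))
```

A non-empty result is a signal that a late instruction overrode the format, which is the moment to rerun the checks from earlier on this page.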
How to confirm the fix
- Removing the last sentence does not change output (rules at top survived).
- Adding a casual aside at the end does not flip the output to casual (hard rules at bottom held).
- In a chat thread, turn 1 rules survive into turn 10.
- A stranger reading the last 3 sentences predicts the same output you want.
If it still fails
- Move rules into a system prompt or project instruction.
- Use structured output (JSON schema, tool use) — recency bias matters less when the format is fixed.
- Try a stronger model — some are more robust to recency.
- Shorten the prompt — long prompts amplify recency bias.
Prevention
- Always end prompts with the hard rules restated, not the deliverable alone.
- Default to MUST / DO NOT / MUST NOT for hard rules. Reserve “should” for genuine preferences.
- Use a sandwich template: rules top, rules bottom, deliverable last but referencing rules.
- For chat work, put hard rules in the system prompt / project instructions, not in the user message.
- Before sending, scroll to the bottom of your prompt and ask “does this read as the final word?”
- Watch for “btw” and “oh, and” — these are signals to refactor, not append.
Related reading
- Conflicting instructions weaken output
- Long prompt degrades output
- AI ignores important constraint
- Prompt lacks context hierarchy
- Prompt misused system vs user