Model Follows the Last Sentence and Ignores the Earlier Rules

A casual aside at the end of your prompt can overwrite the careful rules at the top. Anchor the hard rules at both ends so the last line stops winning.

Published: May 20, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

You wrote 12 careful sentences setting up rules, then ended with “oh, and keep it casual.” The output ignored 10 of the 12 rules and went full casual. Your rules said “use formal English, cite sources, return JSON.” The casual aside was a throwaway. The model picked the throwaway because it was last.

Fastest fix: restate your 2-3 hardest rules in the last line before the deliverable, and convert soft verbs (“should”, “try to”) into MUST / DO NOT. Then delete any “btw” / “oh, and” line at the very end. In a chat thread, move the rules into ChatGPT Custom Instructions or a Claude Project instead of repeating them in a message. That usually flips obedience back to the rules on the next run.

This works because position matters more than people expect. Language models tend to use information at the start and the end of a prompt most reliably and information in the middle least: a measured “lost-in-the-middle” effect in which accuracy on information buried mid-context can drop by more than 20% compared to the same information at either edge (Liu et al., Lost in the Middle, 2023). One commonly cited contributor is architectural: rotary position embeddings (RoPE), used in many open-weight transformer models, tend to weaken attention as the distance between tokens grows, so nearby tokens at the end carry extra weight while the first tokens act as anchors. The internals of GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro are not public, but the same edge-weighted behavior shows up in practice. Whatever you write last reads as “and the final word is…”

This page walks through why the last sentence wins and how to keep the rules at the top from being overwritten by an off-hand closing line.

Which bucket are you in

Symptom	Likely cause	Go to
Deleting the last sentence fixes it	The aside is the last token before the deliverable	Cause 1, Step 1
Rules use “should” / “try to” / “would be nice”	Rules were never flagged as hard	Cause 2, Step 2
Works in a short prompt, fails in a long one	Long prompt diffuses early attention	Cause 3, Step 6
Turn 1 rules held, turn 8 dropped them	Conversation history adds recency	Cause 4, Step 4
Final paragraph is the task, not the rules	No re-anchor at the end	Cause 5, Step 1

Common causes

1. The aside is the last token before the deliverable

Whatever is closest to “now write the answer” gets weighted most. If you ended with “btw, keep it short,” short wins over every earlier rule because it sits in the high-attention tail of the prompt.

How to spot it: remove the last sentence and the output reverts to obeying earlier rules.

2. Earlier rules were never flagged as hard

If your rules were prose (“It would be good if you used JSON”) instead of imperatives (“MUST return valid JSON”), the model treats them as preferences. A late “just give me a quick summary” overrides preferences easily.

How to spot it: your rules use “should”, “would be nice”, “try to” — soft modal verbs.

3. Long prompt diffuses early attention

In a 2000-word prompt, the first 200 words feel distant by the time the model is generating; the last 200 are local. Rules stranded in the middle take the worst of both — that is the “lost-in-the-middle” zone where retrieval accuracy falls off most.

How to spot it: the same rules work in a short prompt and fail in a long one.
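
One rough way to check whether rules have drifted into the middle of a long prompt is to measure where each rule line sits as a fraction of total length. The sketch below is illustrative only: the function name, the rule markers, and the 20% edge cutoff are assumptions chosen for the example, not measured thresholds.

```python
def middle_zone_rules(prompt: str, rule_markers=("MUST", "DO NOT"), edge_fraction: float = 0.2):
    """Return (relative_position, line) for rule lines that sit outside both edges.

    edge_fraction is an arbitrary illustrative cutoff: lines whose midpoint falls
    in the first or last 20% of the prompt are treated as being "near an edge".
    """
    flagged = []
    total = len(prompt)
    if total == 0:
        return flagged
    offset = 0
    for line in prompt.splitlines(keepends=True):
        # Midpoint of this line as a fraction of the whole prompt (0.0 = start, 1.0 = end).
        midpoint = (offset + len(line) / 2) / total
        is_rule = any(marker in line for marker in rule_markers)
        if is_rule and edge_fraction < midpoint < 1 - edge_fraction:
            flagged.append((round(midpoint, 2), line.strip()))
        offset += len(line)
    return flagged
```

Any line this returns is a candidate to move to the top or restate at the bottom.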

4. Conversation history adds recency

In a chat thread, your latest message is the text closest to the response. Earlier turns (even with strict rules) get outweighed by the latest casual message. Long sessions make it worse: as the context window fills, the chat application may truncate or summarize older turns, so the literal rule text can drift or drop out entirely.

How to spot it: rules from turn 1 are followed in turn 2, ignored by turn 8.

5. No re-anchor at the end of the prompt

Even careful prompts often end with “Now write the answer.” There is no last reminder of the rules, so recency works against the rules instead of for them.

How to spot it: your final paragraph is the task, not the rules.

Before you change anything

Read your prompt from the bottom up. The last 3 sentences are what the model attends to most.
Identify which sentences are hard rules vs. preferences.
Delete the last sentence and re-run. If output improves, you found the override.
Reorder the prompt with the same content. Order alone often fixes it.
For chat threads, check whether your latest message buried the rules from earlier turns.

Information to collect

Full prompt text in order.
The output that ignored the rules.
The output you get if you delete the last sentence.
The output you get if you move the rules to the end.
For chat: the full conversation history.
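
The three prompt variants listed above can be generated mechanically so each run differs in only one respect. This is a minimal sketch that assumes the rules and the rest of the prompt are held as two separate strings; the function names and the naive sentence splitter are illustrative.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def diagnostic_variants(rules: str, body: str) -> dict[str, str]:
    """Build the prompts to compare: original, last sentence removed, rules moved last."""
    body_sentences = split_sentences(body)
    without_last = " ".join(body_sentences[:-1])
    return {
        "original": f"{rules}\n\n{body}",
        "last_sentence_removed": f"{rules}\n\n{without_last}",
        "rules_moved_to_end": f"{body}\n\n{rules}",
    }
```

Run each variant against the same model and settings, then save the outputs side by side.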

Shortest path to fix

Step 1: Anchor the hard rules at BOTH ends, not just the top

Position bias is U-shaped: the model attends most to the start and the end. A single copy at the top covers only one of the two strong positions and leaves the end open to whatever you wrote last. Put the rules at the top, then restate the hardest ones in the last line before the deliverable. This is the sandwich pattern:

[Top]
NON-NEGOTIABLE RULES:
- Return valid JSON
- Field "summary" must be under 50 words
- Cite a source for each claim

[Middle: context, examples, etc.]

[Bottom — restate before the deliverable]
Reminder of hard rules: valid JSON, summary under 50 words, sources cited.

Now produce the output.

The model attends most to the end, so put the rules there too. Do not leave the bare task (“Now write the answer”) as the final line.
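
If you assemble prompts in code, the sandwich can be built by a small helper so the bottom reminder is never forgotten. This is a sketch under the assumption that hard rules are kept as a list of short strings; the function name and the exact wording of the labels are illustrative.

```python
def build_sandwich_prompt(hard_rules: list[str], context: str, task: str) -> str:
    """Place the hard rules at the top and restate them just before the task line."""
    top = "NON-NEGOTIABLE RULES:\n" + "\n".join(f"- {rule}" for rule in hard_rules)
    reminder = "Reminder of hard rules: " + "; ".join(hard_rules) + "."
    # Order matters: rules, then context, then the reminder, then the task.
    return "\n\n".join([top, context.strip(), reminder, task.strip()])
```

Because the reminder is generated from the same list as the top block, the two copies cannot drift apart when you edit a rule.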

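Step 2: Convert soft modal verbs to MUST / DO NOT

Rewrite each hard rule as an imperative so it cannot be read as a preference:

Bad:  "It would be good to keep this short."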
Good: "MUST be under 100 words. DO NOT exceed."

Bad:  "Try to use JSON."
Good: "Return only valid JSON. Any prose outside the JSON block is a violation."

The model treats MUST / DO NOT as binding more reliably than soft phrasing such as “should” or “try to”. Capitalizing the keyword adds a small extra signal.
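
A quick lint pass can catch soft phrasing before a prompt is sent. The sketch below is illustrative: the phrase list is taken from the examples in this article and is not exhaustive, and the function name is an assumption.

```python
import re

# Soft phrases drawn from the examples in this article; extend as needed.
SOFT_PHRASES = ("should", "try to", "would be nice", "would be good")


def find_soft_rules(rules: list[str]) -> list[tuple[str, str]]:
    """Return (rule, matched_phrase) for each rule that contains soft phrasing."""
    hits = []
    for rule in rules:
        for phrase in SOFT_PHRASES:
            # Whole-word, case-insensitive match so "Try to" is caught as well.
            if re.search(rf"\b{re.escape(phrase)}\b", rule, flags=re.IGNORECASE):
                hits.append((rule, phrase))
                break
    return hits
```

Each hit is a rule to rewrite as MUST / DO NOT, or to demote openly to a preference.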

Step 3: Drop late asides into structured slots

If you have a “btw” thought, do not append it. Edit it into the right structural slot:

Bad:  [12 sentences of rules] ... oh and keep it casual.
Good: [Top]
      VOICE: casual (contractions allowed, second-person preferred)
      [12 sentences of rules]
      [Bottom: restate hard rules + voice]

Step 4: For chat threads, move rules out of the message stream

Repasting rules in your latest message works, but it is fragile. The durable fix is to put hard rules where they survive recency drift entirely:

ChatGPT: Settings → Personalization → Custom Instructions (account-wide), or a Project with its own instructions that apply to every chat inside it. Note the Custom Instructions field still has a character cap as of June 2026, so keep it to your 5-8 non-negotiables.
Claude: create a Project and put the rules in the project instructions. Project instructions do not sync across other Projects, so paste them into each Project that needs them.
API: put the rules in the system prompt, not the user turn.

If you must keep rules in the chat, re-anchor every few turns:

(Continuing the task from turn 1. Rules: <restate the 3 hardest>.)
Now do: <new request>.
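
When the conversation is driven from code, the re-anchor can be added automatically on a fixed cadence. This is a minimal sketch: the function name and the default of every third turn are assumptions, so tune the interval to how quickly your threads drift.

```python
def user_message(turn_number: int, request: str, hardest_rules: list[str], every: int = 3) -> str:
    """Return the user message text, prefixed with a rule reminder on every `every`-th turn."""
    if turn_number > 1 and turn_number % every == 0:
        reminder = "(Continuing the task from turn 1. Rules: " + "; ".join(hardest_rules) + ".)"
        return f"{reminder}\nNow do: {request}"
    # Other turns are sent unchanged to keep the thread short.
    return request
```

Keep the reminder to the few hardest rules. Restating the full rule set every turn lengthens the thread and pushes earlier context further from the end.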

Step 5: Audit the last 3 sentences before sending

Before sending any prompt, read the last 3 sentences. If they would mislead a stranger about what you want, rewrite them. The last 3 sentences do a disproportionate share of the steering.
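
The audit can be partly automated by pulling out the tail of the prompt and flagging aside markers. The sketch below uses a naive sentence splitter and a short marker list, both assumptions for illustration. It surfaces candidates for review and does not judge intent.

```python
import re

# Markers that usually signal an appended afterthought; illustrative, not exhaustive.
ASIDE_MARKERS = ("btw", "by the way", "oh, and", "oh and")


def audit_tail(prompt: str, n: int = 3) -> dict[str, list[str]]:
    """Return the last n sentences and any of them that look like an aside."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    tail = sentences[-n:]
    asides = [s for s in tail if any(marker in s.lower() for marker in ASIDE_MARKERS)]
    return {"tail": tail, "asides": asides}
```

An empty "asides" list is not proof the tail is safe, so still read the returned "tail" yourself.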

Step 6: For tools that allow it, end with an output schema

A formal schema at the end is the strongest recency anchor you can write into the prompt itself:

Output schema (return only this):
{
  "summary": "<string, max 50 words>",
  "sources": ["<url>", ...]
}

The schema dominates because it is concrete and last. Better still, make the format enforced rather than requested:

OpenAI (GPT-5.5) supports native Structured Outputs by passing a JSON Schema in response_format, and Gemini 3.1 Pro accepts a response schema in its generation config. Both constrain the decoder, so the format cannot be overridden by any aside.
Claude: the long-standing options are tool use (define a tool whose input schema is your output shape) or priming the response by prefilling the assistant turn with {. Both pin the format far more firmly than a prose request. Anthropic's native structured output support has changed over time, so check the current API documentation for your model.
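
As a sketch of what enforcement can look like, the block below expresses the schema once and reuses it in two request fragments, then adds a local check for the one rule a schema cannot easily express (the 50-word limit). The fragment shapes follow what the OpenAI Chat Completions and Anthropic tool-use request bodies are understood to look like; treat them as assumptions and confirm against current documentation. The names summary_with_sources, emit_summary, and check_output are hypothetical.

```python
import json

# The output shape from the schema above, written as JSON Schema.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sources": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "sources"],
    "additionalProperties": False,
}

# Assumed shape of the response_format argument for OpenAI Structured Outputs.
openai_response_format = {
    "type": "json_schema",
    "json_schema": {"name": "summary_with_sources", "strict": True, "schema": OUTPUT_SCHEMA},
}

# Assumed shape of a Claude tool definition whose input schema is the output shape,
# plus a tool_choice that forces the model to call it.
claude_tool = {
    "name": "emit_summary",
    "description": "Return the summary and its sources.",
    "input_schema": OUTPUT_SCHEMA,
}
claude_tool_choice = {"type": "tool", "name": "emit_summary"}


def check_output(raw: str, max_words: int = 50) -> list[str]:
    """Return a list of rule violations; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top level is not an object"]
    problems = []
    summary = data.get("summary")
    if not isinstance(summary, str):
        problems.append("summary missing or not a string")
    elif len(summary.split()) > max_words:
        problems.append("summary over word limit")
    sources = data.get("sources")
    if not isinstance(sources, list) or not sources:
        problems.append("sources missing or empty")
    return problems
```

The word limit is checked after the call because it is a semantic rule rather than a structural one. The schema guarantees the shape, and check_output covers what the schema leaves open.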

How to confirm the fix

Removing the last sentence does not change the output (the top rules survived).
Adding a casual aside at the end does not flip the output to casual (the bottom rules held).
In a chat thread, turn 1 rules survive into turn 10.
A stranger reading only the last 3 sentences predicts the same output you want.
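
The second check above can be turned into a repeatable test. This sketch assumes you supply two callables of your own: generate, which sends a prompt to your model and returns its text, and passes, which decides whether an output obeys the hard rules. Both names, and the default aside, are hypothetical.

```python
from typing import Callable


def survives_aside(
    generate: Callable[[str], str],
    prompt: str,
    passes: Callable[[str], bool],
    aside: str = "Oh, and keep it casual.",
) -> dict[str, bool]:
    """Run the prompt with and without a trailing aside and report whether the rules held."""
    return {
        "baseline_passes": passes(generate(prompt)),
        "with_aside_passes": passes(generate(f"{prompt}\n{aside}")),
    }
```

The fix has held when both values are True. Model output varies between runs, so repeat the check a few times before trusting a single pass.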

If it still fails

Move the rules into a system prompt, ChatGPT Custom Instructions, or a Claude Project — anywhere outside the user message.
Use enforced structured output (JSON Schema via response_format, or tool use) — position bias barely matters when the format is fixed at decode time.
Shorten the prompt — long prompts widen the lost-in-the-middle zone and amplify the effect.
Split one giant prompt into two calls: one that produces content, one that reformats it under the hard rules.
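
The two-call split from the last item can be sketched as a small pipeline. It assumes a generate callable of your own that sends a prompt and returns the model's text; that name, and the prompt wording, are illustrative.

```python
from typing import Callable


def two_stage(generate: Callable[[str], str], content_request: str, hard_rules: list[str]) -> str:
    """First call produces content; second call reformats it under the hard rules."""
    draft = generate(content_request)
    rules = "\n".join(f"- {rule}" for rule in hard_rules)
    # The second prompt is short and sandwiches the rules around the draft.
    reformat_prompt = (
        f"NON-NEGOTIABLE RULES:\n{rules}\n\n"
        f"Draft to reformat:\n{draft}\n\n"
        f"Reformat the draft. Reminder of hard rules: {'; '.join(hard_rules)}."
    )
    return generate(reformat_prompt)
```

The second prompt contains only the rules and the draft, so there is little middle for the rules to get lost in.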

Prevention

Always end prompts with the hard rules restated, not the deliverable alone.
Default to MUST / DO NOT / MUST NOT for hard rules. Reserve “should” for genuine preferences.
Use the sandwich template: rules at top, rules at bottom, deliverable last but referencing the rules.
For chat work, put hard rules in Custom Instructions / Project instructions, not in the user message.
Before sending, scroll to the bottom and ask: “does this read as the final word?”
Watch for “btw” and “oh, and” — these are signals to refactor, not append.

FAQ

Is this the same as recency bias, or something different? It is one half of a bigger pattern. Models attend most to both the start and the end of a prompt and least to the middle — a U-shaped curve sometimes called “lost in the middle.” Recency (the end winning) is real, but so is primacy (the start winning), which is why the reliable fix is to anchor rules at both ends rather than just moving them to the bottom.

Does putting the rule at the very top fix it? Often not on its own, especially in long prompts. The top is a strong position, but a casual line at the end still sits in the other strong position. Restate the hard rules at the bottom too.

Why do my rules survive in short prompts but break in long ones? Length is the multiplier. In a short prompt every token is near an edge. In a 2000-word prompt the rules can end up in the dead middle, where measured retrieval accuracy can drop by more than 20% in some settings (Liu et al., 2023). Shorten the prompt or duplicate the rules at the end.

Does this happen with every model? Yes, to varying degrees. GPT-5.5, Claude Opus 4.7 / Sonnet 4.6, and Gemini 3.1 Pro all show an edge-weighted pattern in practice. Their architectures are not published, but distance-sensitive position encodings such as rotary position embeddings (RoPE) are a commonly cited contributor. Newer and larger models are somewhat more robust but not immune.

The model still ignores the rule even at the end. Now what? Stop relying on prose. Switch to enforced structured output: native JSON Schema on GPT-5.5 / Gemini 3.1 Pro, or tool use / response prefill on Claude. When the format is constrained at decode time, a trailing aside cannot override it.

Tags: #Troubleshooting #Prompt #Prompt quality #Prompt engineering

Related Articles

Few-Shot Examples Have Uneven Quality and Drag Output Down

Model Returns Invalid JSON Because Schema Was Described, Not Enforced

Model Invented Fake Citations and URLs

Model Replies in the Wrong Language (How to Lock It)

LLM Response Cut Off Mid-Sentence: max_tokens Too Low (2026 Fix)

Prompt Asks for 10 Items, Model Returns 3 and Stops