You pasted 1,200 words: a meeting transcript, three Slack messages, two paragraphs of background, four bullets of requirements, and a question at the end. Everything sits at the same visual weight. The model treated all of it as equally important and, in effect, weighted it by length. The longest item was the meeting transcript, so the model summarized the transcript and ignored the requirements.
The model has no map. Without explicit hierarchy (sections, labels, priority markers), models tend to fall back on “the longest section is the topic,” which is almost never what you want.
This page walks through how to add structure to flat prompts and how to mark hard requirements so they survive against background noise.
Common causes
1. No section labels
You wrote a wall of prose. The model has to infer where context ends and task begins. Inference picks the dominant theme, usually background.
How to spot it: no ##, Task:, Background:, or other markers.
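The check above can be roughed out in code. This is a minimal sketch, not a definitive detector: the function name `find_section_markers` and the list of label words are assumptions chosen for illustration, and you would extend the pattern to match your own conventions.

```python
import re

# Hypothetical marker pattern: markdown headings, a few common "Label:" lines,
# and XML-style opening tags. Extend the label list for your own prompts.
SECTION_MARKER = re.compile(
    r"""
    \#{1,6}\s+\S                                                   # markdown heading such as: ## Task
    | (?:task|background|context|requirements|output\ format)\s*:  # label such as: Task:
    | <[a-z][\w-]*(?:\s[^>]*)?>                                    # XML-style opening tag
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_section_markers(prompt: str) -> list[str]:
    """Return every line of the prompt that starts with a section marker."""
    return [
        line.strip()
        for line in prompt.splitlines()
        if SECTION_MARKER.match(line.strip())
    ]


flat = "We met on Monday and talked about billing. Can you help?"
print(find_section_markers(flat))  # an empty list means no visible structure
```

An empty result does not prove the prompt is bad, only that the model has no explicit markers to work from.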
2. Multiple input types pasted together
Code, a transcript, requirements, and screenshots-as-text all sit in one block. The model blends them and produces a response that serves none of them well.
How to spot it: multiple input types, no separators between them.
3. Hard requirements buried in soft prose
“It would be nice to have X” and “We absolutely need Y” appear at the same visual weight. The model can read “absolutely”, but the surrounding soft, optional-sounding phrasing dilutes it.
How to spot it: hard requirements written as paragraphs, not as a labeled list.
4. Reference material has no provenance
You pasted text from 3 sources without saying which is which. The model treats them as equally authoritative even when one is a draft and one is final.
How to spot it: pasted content with no attribution headers.
5. Sections out of priority order
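Long background first, task somewhere after it. Models tend to attend most to the first and last positions of a prompt. A task in the middle gets the least attention, and a one-line task at the very end is easily outweighed by the background that opens the prompt.
How to spot it: the structure is inverted, with background dominating the top of the prompt.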
Before you change anything
- Identify each input type in your current prompt.
- Identify which lines are hard requirements vs background context.
- Decide a priority order: what should the model attend to most?
- Plan section headers and labels.
- Check whether reference material has clear provenance.
Information to collect
- Current prompt with sections you can identify.
- Output that emphasized the wrong content.
- Your actual hard requirements list.
- Each input source and its level of authority.
- Model and any system prompt.
Shortest path to fix
Step 1: Label every section
## Task
<one imperative sentence>
## Hard requirements (non-negotiable)
- Requirement 1
- Requirement 2
## Soft preferences (drop if conflict)
- Preference 1
## Background context
<reference material>
## Output format
<schema>
Visible structure beats inferred structure.
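If you assemble prompts in code, the skeleton above can be generated rather than typed. The following is a minimal sketch; `build_prompt` and its parameter names are hypothetical, and it simply mirrors the section titles shown in this step.

```python
def build_prompt(
    task: str,
    hard: list[str],
    soft: list[str],
    background: str,
    output_format: str,
) -> str:
    """Assemble a prompt with one labeled section per input; skip empty sections."""

    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items)

    sections = [
        ("Task", task),
        ("Hard requirements (non-negotiable)", bullets(hard)),
        ("Soft preferences (drop if conflict)", bullets(soft)),
        ("Background context", background),
        ("Output format", output_format),
    ]
    return "\n\n".join(
        f"## {title}\n{body.strip()}" for title, body in sections if body.strip()
    )


print(
    build_prompt(
        task="Draft a reply to the customer.",
        hard=["Include the customer's order number"],
        soft=["Keep it under 150 words"],
        background="The customer reports a duplicate charge.",
        output_format="JSON with keys: order_number, reply",
    )
)
```

Empty sections are dropped so the model never sees a header with nothing under it.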
Step 2: Tag mixed inputs
<transcript source="standup-2026-05-21">
... transcript text ...
</transcript>
<requirements source="product-spec-v3">
... requirements ...
</requirements>
<slack source="incident-channel">
... slack messages ...
</slack>
XML-style tags work well. Markdown fences also work. The point is mechanical separation.
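A small helper keeps the tagging consistent. This is a sketch under stated assumptions: `tag_input` is a hypothetical name, and the pasted text is deliberately left unescaped because the model reads it as text, not as parsed XML. Only the `source` attribute is quoted, using the standard library's `xml.sax.saxutils.quoteattr`.

```python
import re
from xml.sax.saxutils import quoteattr


def tag_input(kind: str, source: str, text: str) -> str:
    """Wrap one pasted input in an XML-style tag that records its source."""
    # Reject tag names that would produce a malformed wrapper.
    if not re.fullmatch(r"[A-Za-z][\w-]*", kind):
        raise ValueError(f"not a usable tag name: {kind!r}")
    return f"<{kind} source={quoteattr(source)}>\n{text.strip()}\n</{kind}>"


print(tag_input("transcript", "standup-2026-05-21", "Alice: billing is double-charging."))
print(tag_input("slack", "incident-channel", "Bob: confirmed on two accounts."))
```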
Step 3: Mark hard rules with emphasis
## Hard requirements
**MUST**: All outputs must include the customer's order number.
**MUST NOT**: Reveal internal employee names.
**MUST**: Return as JSON.
MUST / MUST NOT tend to carry more weight than “should”, plausibly because the RFC 2119 requirement keywords are common in the technical writing models are trained on.
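One way to keep soft wording out of the hard-requirements list is to lint it before sending. The sketch below is illustrative only: `soft_wording` and the phrase list are assumptions, and the list will need tuning for your own writing habits.

```python
import re

# Hypothetical list of phrases that signal a preference rather than a rule.
SOFT_PHRASES = re.compile(
    r"\b(should|ideally|would be nice|if possible|try to|preferably)\b",
    re.IGNORECASE,
)


def soft_wording(hard_requirements: list[str]) -> list[tuple[str, str]]:
    """Return (rule, phrase) pairs for hard rules that contain soft wording."""
    findings = []
    for rule in hard_requirements:
        match = SOFT_PHRASES.search(rule)
        if match:
            findings.append((rule, match.group(0)))
    return findings


rules = [
    "MUST: Return as JSON.",
    "The reply should ideally mention the order number.",
    "MUST NOT: Reveal internal employee names.",
]
for rule, phrase in soft_wording(rules):
    print(f"soft phrase {phrase!r} in hard rule: {rule}")
```

Any rule it flags either belongs under soft preferences or needs rewriting as MUST / MUST NOT.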
Step 4: Order by priority
The top and bottom of a prompt tend to be the highest-attention positions. Use them for:
- Top: task + hard requirements
- Middle: background
- Bottom: output format + hard rule restatement
[TOP] Task + Non-negotiables
[MID] Background, transcripts, references
[BOT] Output format + reminder of non-negotiables
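The top / middle / bottom layout can be enforced mechanically. This is a minimal sketch, assuming a hypothetical `assemble` function; the section titles are illustrative, and the only behavior it demonstrates is placing the non-negotiables at both ends.

```python
def assemble(
    task: str,
    non_negotiables: list[str],
    middle_sections: list[str],
    output_format: str,
) -> str:
    """Place task and rules at the top, references in the middle, format and a rule reminder at the bottom."""
    rules = "\n".join(f"- {rule}" for rule in non_negotiables)
    top = f"## Task\n{task}\n\n## Hard requirements\n{rules}"
    middle = "\n\n".join(middle_sections)
    bottom = (
        f"## Output format\n{output_format}\n\n"
        f"## Reminder: non-negotiables\n{rules}"
    )
    # Skip the middle entirely when there is no background to include.
    return "\n\n".join(part for part in (top, middle, bottom) if part)


print(
    assemble(
        task="Draft a reply to the customer.",
        non_negotiables=["MUST: include the customer's order number"],
        middle_sections=["## Background context\nThe customer reports a duplicate charge."],
        output_format="JSON",
    )
)
```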
Step 5: Summarize long references
If a section is more than 200 words, prepend a 2-line summary:
## Reference: Customer Email
<summary>The customer is angry about billing; key claim is they were
charged twice for the same period.</summary>
<full>
... 400 words of email ...
</full>
The model can use the summary as the “map” and the full text as evidence.
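The 200-word rule is easy to apply in code. In this sketch, `wrap_reference` is a hypothetical helper and the summary is written by you and passed in; the function does not generate it. The word count is a plain whitespace split, which is an approximation.

```python
def wrap_reference(title: str, text: str, summary: str, threshold: int = 200) -> str:
    """Prepend a caller-supplied summary when a reference exceeds the word threshold."""
    body = text.strip()
    if len(body.split()) <= threshold:
        # Short references go in as-is; a summary would only add noise.
        return f"## Reference: {title}\n{body}"
    if not summary.strip():
        raise ValueError("long reference needs a summary")
    return (
        f"## Reference: {title}\n"
        f"<summary>{summary.strip()}</summary>\n"
        f"<full>\n{body}\n</full>"
    )


email_text = "I was charged twice. " * 100  # stand-in for a 400-word email
print(wrap_reference("Customer Email", email_text, "Customer says they were charged twice.")[:120])
```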
Step 6: Reuse a template
For recurring task types, save the structured template. Filling slots is faster and prevents structural drift.
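A saved template can be as simple as a string with named slots. The sketch below uses Python's standard `string.Template`; the template name `SUPPORT_REPLY`, the helper `fill`, and the slot names are assumptions for illustration.

```python
from string import Template

# Hypothetical template for one recurring task type.
SUPPORT_REPLY = Template(
    "## Task\n$task\n\n"
    "## Hard requirements (non-negotiable)\n$hard_requirements\n\n"
    "## Background context\n$background\n\n"
    "## Output format\n$output_format"
)


def fill(template: Template, **slots: str) -> str:
    """Fill every slot; substitute() raises KeyError if one is missing."""
    return template.substitute(**slots)


print(
    fill(
        SUPPORT_REPLY,
        task="Draft a reply to the customer.",
        hard_requirements="- MUST: include the customer's order number",
        background="The customer reports a duplicate charge.",
        output_format="JSON",
    )
)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a forgotten slot fails loudly instead of shipping a prompt with a literal `$background` in it.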
How to confirm the fix
- A stranger reading the prompt can list task / requirements / background in 30 seconds.
- Output addresses the actual task, not the longest section.
- Hard requirements all appear in the output.
- Provenance is preserved: when the model cites a fact, you can trace it to a source tag.
- Re-running the prompt produces outputs of consistent structure.
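Some of these checks can be automated. The sketch below tests a model output against the three example rules from Step 3 (JSON output, order number present, no internal names). `check_output`, the order number, and the employee name are all hypothetical, and the substring matching is deliberately crude.

```python
import json


def check_output(raw_output: str, order_number: str, internal_names: list[str]) -> list[str]:
    """Return a list of hard-requirement failures; an empty list means all checks passed."""
    failures = []
    try:
        json.loads(raw_output)
    except json.JSONDecodeError:
        failures.append("output is not valid JSON")
    if order_number not in raw_output:
        failures.append("order number missing")
    lowered = raw_output.lower()
    for name in internal_names:
        # Case-insensitive substring match; good enough for a smoke test.
        if name.lower() in lowered:
            failures.append(f"internal name leaked: {name}")
    return failures


good = '{"order_number": "A-1042", "reply": "We refunded the duplicate charge."}'
print(check_output(good, "A-1042", ["Dana Smith"]))
```

Running the same check over several re-runs of the prompt also gives a quick read on whether the output structure is consistent.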
If it still fails
- The prompt may still be too long — cut background that does not change the answer.
- Hard requirements may be too many — rank them and drop the bottom ones.
- Move hard requirements to the system prompt or project instructions so they persist across turns.
- Try a model with better long-context attention if context cannot be shorter.
Prevention
- Default: every prompt over 200 words uses labeled sections.
- Maintain templates per task type with the section skeleton.
- Tag inputs with provenance whenever you paste from multiple sources.
- Use MUST / MUST NOT for hard rules; reserve “should” for preferences.
- Audit production prompts: any unstructured prompt over 200 words is a risk.
- For team workflows, agree on a section taxonomy so everyone’s prompts read consistently.
Related reading
- Long background hides task
- Prompt lacks source hierarchy
- Long prompt degrades output
- Latest sentence overrides
- AI ignores important constraint