You wanted output that is “warm, conversational, and personable” and also “strict JSON with these exact keys”. JSON has a fixed structure; warmth lives in prose. Asking for both across the whole output is asking the model to satisfy two contradictory specifications. What you get back depends on which constraint wins that day: either flat, lifeless JSON that satisfies the schema, or warm prose wrapped in JSON-like fragments that breaks your parser. The model is not failing to find the middle ground; between strict structure and unconstrained prose there is none.
This page walks through why style and format conflict at a fundamental level and how to either pick one or split the work into two passes.
Common causes
1. Warmth requested inside a fixed-key schema
JSON keys are not warm. Enum values are not warm. The only warmth slot is inside a string-valued prose field, and even there warmth is constrained by length and surrounding structure.
How to spot it: you asked for JSON with a warm tone.
2. Creative voice + machine-readable goal
Marketing copywriters want creative voice. ETL pipelines want machine-readable structure. Trying to serve both in one output weakens both: the copy loses its voice and the structure loses its reliability.
How to spot it: the output will be consumed by both humans and a script.
3. No declared winner if they conflict
If you do not say “format wins” or “style wins”, the model tends to compromise, partly honoring each constraint and fully honoring neither. Both consumers end up disappointed.
How to spot it: no priority ranking between style and format.
4. Strict JSON for tasks where value is nuance
Asking for “JSON with sentiment as positive/negative/neutral” loses the nuance of “frustrated but understanding”. The schema is the bottleneck, not the model.
How to spot it: the JSON shape cannot express what you actually need.
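A small sketch of the bottleneck: a three-value label has nowhere to put "frustrated but understanding", while a slightly wider shape does. The `emotions` and `note` fields are hypothetical additions for illustration, not part of any schema on this page.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass
class NarrowResult:
    # The only slot available: one of three labels.
    sentiment: Sentiment


@dataclass
class WiderResult:
    # Keep the coarse label for dashboards...
    sentiment: Sentiment
    # ...and add hypothetical slots that can carry the nuance.
    emotions: list[str] = field(default_factory=list)
    note: str = ""


narrow = NarrowResult(Sentiment.NEGATIVE)
wider = WiderResult(
    Sentiment.NEGATIVE,
    emotions=["frustrated", "understanding"],
    note="Frustrated but understanding.",
)

print(json.dumps(asdict(narrow)))
print(json.dumps(asdict(wider)))
```

If the wider shape still cannot hold what you need, that is a sign the task wants prose rather than JSON.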
5. Conflicting cues in the prompt
The same prompt contains both “be conversational” (warmth) and “return only valid JSON” (structure). Nothing in the prompt tells the model how to reconcile them.
How to spot it: the prompt contains both directives and no rule for resolving them.
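Spotting this can be partly automated. The following is a rough heuristic sketch, not a reliable detector: the cue lists are hypothetical and would need tuning for your own prompts.

```python
import re

# Hypothetical cue lists; extend them to match the wording your prompts use.
STYLE_CUES = [r"\bwarm\b", r"\bconversational\b", r"\bpersonable\b", r"\bfriendly\b"]
FORMAT_CUES = [r"\bonly valid json\b", r"\bstrict json\b", r"\bexact keys\b"]
PRIORITY_CUES = [r"\bformat wins\b", r"\bstyle wins\b"]


def find_cues(prompt: str, patterns: list[str]) -> list[str]:
    """Return the patterns that match the prompt, case-insensitively."""
    return [p for p in patterns if re.search(p, prompt, flags=re.IGNORECASE)]


def has_unresolved_conflict(prompt: str) -> bool:
    """True when style and format cues coexist and no priority is declared."""
    style = find_cues(prompt, STYLE_CUES)
    fmt = find_cues(prompt, FORMAT_CUES)
    priority = find_cues(prompt, PRIORITY_CUES)
    return bool(style) and bool(fmt) and not priority
```

A keyword match only flags candidates for review; a prompt can conflict without using any of these phrases.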
Before you change anything
- Identify your real consumer: a parser, a human, or both?
- If both, decide which is primary.
- Identify which value (warmth or structure) you cannot lose.
- Decide whether one pass can satisfy both or whether you need a pipeline.
- Plan to drop the weaker constraint entirely if necessary.
Information to collect
- Current prompt with both style and format requests.
- Output that fails one constraint or both.
- Downstream consumer (specific parser, specific reader).
- Which constraint is non-negotiable.
- Model and any system prompt.
Shortest path to fix
Step 1: Decide the primary consumer
Consumer = parser (JSON downstream):
Format wins. Drop "warm" from the prompt entirely.
Consumer = human reader:
Style wins. Use markdown (table + commentary) instead of strict JSON.
Consumer = both (display in UI + pipe to analytics):
Two passes. See Step 5.
Step 2: For schema with prose fields, scope warmth to those
{
"category": "billing", // strict enum, no warmth
"priority": "high", // strict enum, no warmth
"summary": "<prose, warm tone, max 50 words>", // warm allowed here
"escalation_needed": true // boolean, no warmth
}
Mark explicitly which fields can be prose and which are mechanical. Warmth lives only in prose fields.
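The same scoping can be checked after the fact. This is a minimal sketch, assuming the four fields from the schema above; "billing" and "high" come from that schema, while the other enum values are hypothetical placeholders.

```python
import json

# "billing" and "high" come from the schema above; the other values are hypothetical.
ALLOWED_CATEGORIES = {"billing", "technical", "account"}
ALLOWED_PRIORITIES = {"low", "medium", "high"}
MAX_SUMMARY_WORDS = 50


def check_fields(raw: str) -> list[str]:
    """Return a list of problems; an empty list means every field is in scope."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []

    # Mechanical fields: exact enum or boolean, nothing else.
    category = data.get("category")
    if not (isinstance(category, str) and category in ALLOWED_CATEGORIES):
        problems.append("category is not an allowed value")
    priority = data.get("priority")
    if not (isinstance(priority, str) and priority in ALLOWED_PRIORITIES):
        problems.append("priority is not an allowed value")
    if not isinstance(data.get("escalation_needed"), bool):
        problems.append("escalation_needed is not a boolean")

    # Prose field: tone is free, length is not.
    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        problems.append("summary is missing or empty")
    elif len(summary.split()) > MAX_SUMMARY_WORDS:
        problems.append("summary exceeds the word limit")
    return problems
```

Warmth itself is not checked here; the code only confirms that any warmth stayed inside the one field that allows it.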
Step 3: For human-only output, choose hybrid format
Markdown table for the structured part:
| Category | Priority | Status |
|---|---|---|
| Billing | High | Needs escalation |
Commentary below the table (where warmth lives):
"The customer is frustrated but understanding — they have been a
loyal user for 3 years. Worth a personal callback rather than a
canned response."
The model can do both halves naturally.
Step 4: Forbid schema bloat
Constraints on the JSON:
- Do not add fields not listed in the schema.
- Do not include explanatory comments inside JSON.
- Do not wrap JSON in prose preamble ("Here is the JSON:").
The model loves to “improve” your schema with explanatory fields. Forbid this.
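On the consuming side, a strict parser can enforce the same three rules. A minimal sketch, assuming the four keys used in Step 2; `parse_strict` is a hypothetical helper name.

```python
import json

EXPECTED_KEYS = {"category", "priority", "summary", "escalation_needed"}


def parse_strict(raw: str) -> dict:
    """Parse model output, rejecting preambles, comments, and extra fields."""
    text = raw.strip()
    if not text.startswith("{"):
        raise ValueError("output has a preamble or is not a JSON object")
    # json.loads raises on // comments and on prose after the closing brace.
    data = json.loads(text)
    extra = set(data) - EXPECTED_KEYS
    missing = EXPECTED_KEYS - set(data)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Failing loudly here is deliberate: silently dropping extra fields hides the fact that the prompt is not holding.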
Step 5: Two-pass workflow for both-consumer cases
Pass 1 (content): Generate warm, nuanced analysis as prose.
Pass 2 (structure): Given the prose from Pass 1, extract into this JSON schema.
Pass 1 captures nuance; Pass 2 enforces structure. Each pass has one job, which makes each far more likely to succeed.
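The two passes can be sketched as a small function. This assumes a hypothetical `call_model` callable that takes a prompt and returns text; wire it to whatever client you actually use. The prompt wording is illustrative only.

```python
import json
from typing import Callable

# Hypothetical signature: takes a prompt string, returns the model's text.
ModelFn = Callable[[str], str]

PASS1 = "Write a warm, nuanced analysis of this customer message:\n\n{message}"
PASS2 = (
    "Extract the analysis below into JSON with exactly these keys: "
    "category, priority, summary, escalation_needed. "
    "Return only the JSON object.\n\n{analysis}"
)


def two_pass(message: str, call_model: ModelFn) -> tuple[str, dict]:
    """Run the content pass, then the structure pass on its output."""
    analysis = call_model(PASS1.format(message=message))
    raw = call_model(PASS2.format(analysis=analysis))
    return analysis, json.loads(raw)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("Extract"):
        return (
            '{"category": "billing", "priority": "high", '
            '"summary": "Charged twice; patient so far.", "escalation_needed": true}'
        )
    return "The customer is frustrated but understanding."


analysis, data = two_pass("I was charged twice this month.", fake_model)
print(analysis)
print(data["category"], data["priority"])
```

The prose goes to the human-facing surface and the dictionary goes to the pipeline, so neither consumer sees the other's format.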
Step 6: Use structured output mode at the API level
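For strict format, use JSON mode, tool use with schemas, or constrained decoding. The guarantees differ: plain JSON mode typically ensures only syntactically valid JSON, while schema-enforced or constrained decoding also holds the output to your keys and types. With enforcement on, invalid structure is ruled out except when the response is truncated, for example by a token limit. Warmth (if any) is then confined to prose fields with little risk of breaking the parser.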
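Most structured-output features accept a JSON Schema. A sketch of one for the Step 2 fields follows; how it is passed to the API depends on the provider and is not shown. The enum values other than "billing" and "high" are hypothetical.

```python
import json

TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        # Mechanical fields: closed sets, no room for style.
        "category": {"type": "string", "enum": ["billing", "technical", "account"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "escalation_needed": {"type": "boolean"},
        # Prose field: the only place tone guidance belongs.
        "summary": {"type": "string", "description": "Warm tone, at most 50 words."},
    },
    "required": ["category", "priority", "summary", "escalation_needed"],
    # Blocks schema bloat at the decoding level where the API enforces it.
    "additionalProperties": False,
}

print(json.dumps(TICKET_SCHEMA, indent=2))
```

JSON Schema has no word-count keyword, so the 50-word limit lives in the description and still needs a check after parsing.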
How to confirm the fix
- Output parses cleanly downstream.
- Prose fields (if any) have the warmth you wanted.
- No schema bloat or extra fields.
- Running the same prompt 3 times produces 3 outputs of identical shape (same keys and value types).
- The human reader (if applicable) finds the output acceptable.
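The identical-shape check is easy to script. A minimal sketch, where "shape" means keys and value types with the values themselves ignored; `shape` and `same_shape` are hypothetical helper names.

```python
import json


def shape(value):
    """Reduce a parsed JSON value to its structure: keys and type names only."""
    if isinstance(value, dict):
        return {key: shape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [shape(item) for item in value]
    return type(value).__name__


def same_shape(raw_outputs: list[str]) -> bool:
    """True when every output parses and all share one structure."""
    shapes = [shape(json.loads(raw)) for raw in raw_outputs]
    return all(s == shapes[0] for s in shapes)
```

Lists of different lengths count as different shapes here; relax that if your schema allows variable-length arrays.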
If it still fails
- The two constraints may be fundamentally incompatible — drop one.
- Use the two-pass pipeline (Step 5) for any case where both must hold.
- Switch to API-level structured output if you have not already.
- Lower the temperature; format stability usually improves at lower temperatures, though this alone does not guarantee valid output.
Prevention
- Treat style and format as separate concerns; rank them before writing the prompt.
- Default: machine consumers get clean schema, no style ask. Human consumers get markdown.
- For mixed-consumer pipelines, split into two passes.
- Reserve “warm JSON” for explicit prose fields; never for keys or enums.
- Audit production prompts for style + format conflict; most can be resolved by removing one of the two requests.
- When in doubt about consumer, ask: “what receives this output first?”