Your prompt says “respond in JSON matching this schema: {name: string, age: number, tags: string[]}.” 95% of the time, the model returns valid JSON. 3% of the time, it returns “Sure, here’s the JSON you requested:” followed by the JSON inside a Markdown code fence. 1% of the time, it omits the tags field. 1% of the time, it returns age: "thirty" because nothing enforced the types. In production this means parsing or validation fails on 5% of calls, your downstream code crashes, and you scramble to add try-catch everywhere. The model didn’t disobey: it followed an English description of the schema, not an enforced contract.
Modern model APIs offer real schema enforcement (OpenAI structured outputs, Anthropic tool use, Gemini response schema). If you’re still putting the schema in a prompt and praying, you’re leaving free reliability on the table.
Common causes
1. Schema described in English instead of declared
Return JSON: {name, age, tags}.
The model reads this as soft guidance. Sometimes adds wrapper text. Sometimes omits fields. Without API-level enforcement, this is unreliable.
How to spot it: Look for schema as natural-language prose in the prompt with no response_format= or tools= parameter on the API call.
2. Markdown code-block wrapping
Model returns:
Here is the JSON:
```json
{"name": "Alice"}
```
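Your parser reads the whole string and JSON.parse() fails. JSON mode (response_format={"type":"json_object"}) is designed to prevent this, but on providers where it is not strictly enforced, the model may still emit prose when other instructions in the prompt contradict it.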
How to spot it: Output contains backticks or “Here is” / “Sure” / “Of course” prefix.
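A minimal sketch of that check, using only the standard library. The function name `looks_wrapped` and the prefix list are illustrative assumptions; extend the list with whatever your own logs show.

```python
# Chatty openers mentioned in this article; treat the list as a starting point.
CHATTY_PREFIXES = ("here is", "here's", "sure", "of course")

# Build the Markdown fence marker without typing it literally.
FENCE = "`" * 3

def looks_wrapped(text: str) -> bool:
    """Return True if the output shows signs of prose or Markdown wrapping."""
    stripped = text.strip()
    if FENCE in stripped:
        return True
    # str.startswith accepts a tuple and matches any of its entries.
    return stripped.lower().startswith(CHATTY_PREFIXES)
```

Running this over a sample of logged responses gives a rough wrapping rate before you decide how much defensive parsing is needed.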
3. Schema specifies fields but not types
The prompt says {age: number}, but the model returns "age": "30". A prose description leaves room for ambiguity: to the model, the string "30" looks close enough to a number.
How to spot it: Validate output with a strict JSON Schema validator. Type mismatches mean the schema was described, not enforced.
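To illustrate what a strict check catches, here is a small standard-library sketch for the example schema. It is not a JSON Schema validator (in practice a dedicated validator or Pydantic would do this job); `EXPECTED` and `type_errors` are hypothetical names.

```python
import json

# Expected Python types for the example schema {name, age, tags}.
EXPECTED = {"name": str, "age": int, "tags": list}

def type_errors(raw: str) -> list[str]:
    """Return human-readable type problems; an empty list means the types match."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top level: expected object"]
    errors = []
    for field, expected in EXPECTED.items():
        if field not in data:
            errors.append(f"{field}: missing")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors
```

The point of the strictness is that "30" is reported as a string rather than silently coerced, which is exactly the mismatch described here.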
4. Optional fields modeled as required
Schema says {name, email, phone}. User input only had name. Model returns {"name": "Alice", "email": null, "phone": null} or omits the fields. Downstream code expecting strings gets nulls.
How to spot it: Crashes on null field access; or KeyErrors on field omission.
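One way to make both failure shapes harmless is to normalize the parsed object before downstream code touches it. The sketch below assumes name is the only truly required field; `REQUIRED`, `OPTIONAL`, and `normalize_contact` are illustrative names.

```python
REQUIRED = ("name",)
OPTIONAL = ("email", "phone")

def normalize_contact(data: dict) -> dict:
    """Make missing and null optional fields look the same (None)."""
    missing = [f for f in REQUIRED if data.get(f) is None]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    result = {f: data[f] for f in REQUIRED}
    for f in OPTIONAL:
        # dict.get returns None whether the key is absent or explicitly null.
        result[f] = data.get(f)
    return result
```

After this step every key is present, so downstream code only has to handle None, not both None and KeyError.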
5. Nested objects flattened or expanded
Schema: {user: {name, age}}. Model sometimes returns {name, age} directly, or expands to {user_name, user_age}. Nesting got lost in translation.
How to spot it: Top-level keys don’t match what was specified.
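A quick set comparison is enough to detect both variants. This is a sketch; `key_diff` is a hypothetical helper name.

```python
def key_diff(data: dict, expected_keys: set[str]) -> tuple[set[str], set[str]]:
    """Return (missing, unexpected) top-level keys."""
    actual = set(data)
    return expected_keys - actual, actual - expected_keys
```

For the schema {user: {name, age}}, a flattened response reports user as missing and name, age as unexpected, which tells you immediately that nesting was lost rather than data.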
6. Arrays collapse to a comma-separated string
Schema: tags: string[]. Model returns "tags": "blue, red, fast". String, not array. Common when input contains comma-separated values.
How to spot it: Type-checking each field against schema reveals string instead of array.
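If you cannot enforce the schema at the API level, a narrow repair step can recover this particular failure. The sketch below assumes tags never legitimately contain commas; `coerce_tags` is an illustrative name.

```python
def coerce_tags(value):
    """Turn "blue, red, fast" into ["blue", "red", "fast"]; pass lists through."""
    if isinstance(value, list):
        return value
    if isinstance(value, str):
        # Split on commas and drop empty fragments such as a trailing comma.
        return [part.strip() for part in value.split(",") if part.strip()]
    raise TypeError(f"cannot coerce {type(value).__name__} to a list of strings")
```

A repair like this hides the symptom, so it is worth logging every time it fires.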
7. Enum fields not honored
Schema: sentiment: "positive" | "neutral" | "negative". Model returns "sentiment": "very positive" or "sentiment": "neg". Enum was a hint, not a constraint.
How to spot it: Sentiment value not in allowed set.
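The membership check itself is a one-liner; the sketch below wraps it so the error names the offending value. `ALLOWED_SENTIMENTS` and `check_sentiment` are hypothetical names.

```python
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}

def check_sentiment(data: dict) -> str:
    """Return the sentiment if it is one of the allowed values, else raise."""
    value = data.get("sentiment")
    # The isinstance guard also protects against unhashable values like lists.
    if not isinstance(value, str) or value not in ALLOWED_SENTIMENTS:
        raise ValueError(
            f"sentiment {value!r} not in {sorted(ALLOWED_SENTIMENTS)}"
        )
    return value
```

Rejecting "very positive" outright, rather than mapping it to "positive", keeps the violation visible in your metrics.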
Shortest path to fix
Step 1: Use real structured output, not prompt-described schema
OpenAI (Python):
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class User(BaseModel):
    name: str
    age: int
    tags: list[str]

resp = client.beta.chat.completions.parse(
    model="gpt-5.5",
    messages=[...],
    response_format=User,
)
user = resp.choices[0].message.parsed  # Already a User instance (None if the model refused)
The API enforces the schema at the token-sampling layer. Invalid tokens are forbidden by construction.
Step 2: Anthropic — use tool definition as schema
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [{
    "name": "extract_user",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "tags": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["name", "age", "tags"]
    }
}]
msg = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,  # required by the Messages API
    tools=tools,
    tool_choice={"type": "tool", "name": "extract_user"},
    messages=[...],
)
data = msg.content[0].input  # dict from the forced tool call; still validate it
Forcing tool use makes the model emit a tool call whose input is shaped by the schema. Validate it anyway, as you would any model output.
Step 3: Gemini — pass schema via response_schema
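import google.generativeai as genai

# Assumes `model` is a genai.GenerativeModel instance and `user_schema`
# describes the same name/age/tags shape as the examples above.
resp = model.generate_content(
    prompt,
    generation_config={
        "response_mime_type": "application/json",
        "response_schema": user_schema,
    },
)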
Step 4: When stuck on a model without structured output, validate and retry
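import json
from pydantic import ValidationError

def get_json(prompt, max_retries=3):
    # call_llm is your own model wrapper; User is the Pydantic model from Step 1
    for _ in range(max_retries):
        out = call_llm(prompt)
        try:
            data = json.loads(out)
            User.model_validate(data)
            return data
        except (json.JSONDecodeError, ValidationError) as e:
            # Feed the error back so the next attempt can correct it
            prompt += f"\n\nPrevious response failed validation: {e}. Return valid JSON only."
    raise RuntimeError("Failed after retries")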
Pass the validation error back to the model — it usually self-corrects on attempt 2.
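To see the feedback loop run end to end without an API key, here is a self-contained sketch with a stubbed model. Everything in it is an assumption for illustration: `validate_user` stands in for a real schema validator, and `fake_llm` simulates a model that corrects itself once it sees the error message.

```python
import json
from typing import Callable

def validate_user(data) -> None:
    """Raise ValueError if data does not match {name: str, age: int, tags: list}."""
    if not isinstance(data, dict):
        raise ValueError("top level must be an object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    age = data.get("age")
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("age must be an integer")
    if not isinstance(data.get("tags"), list):
        raise ValueError("tags must be an array")

def get_json_with_feedback(
    prompt: str, call_llm: Callable[[str], str], max_retries: int = 3
) -> dict:
    for _ in range(max_retries):
        out = call_llm(prompt)
        try:
            data = json.loads(out)
            validate_user(data)
            return data
        except ValueError as e:  # json.JSONDecodeError is a ValueError subclass
            prompt += f"\n\nPrevious response failed validation: {e}. Return valid JSON only."
    raise RuntimeError("Failed after retries")

# Stub model: wrong on the first call, correct once the prompt carries feedback.
def fake_llm(prompt: str) -> str:
    if "failed validation" in prompt:
        return '{"name": "Alice", "age": 30, "tags": []}'
    return '{"name": "Alice", "age": "thirty", "tags": []}'

result = get_json_with_feedback("Extract the user.", fake_llm)
```

Passing the model call in as a parameter keeps the retry logic testable; in production you would swap `fake_llm` for your real client wrapper.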
Step 5: Use json_object mode as a fallback, not a guarantee
response_format={"type": "json_object"}
This prevents prose wrapping but doesn’t enforce schema. Still validate the parsed object.
Step 6: Pre-extract JSON if model insists on wrapping
import re

def extract_json(text):
    # Greedy match: from the first { or [ through the last matching } or ]
    match = re.search(r'(\{.*\}|\[.*\])', text, re.DOTALL)
    if match:
        return match.group(1)
    raise ValueError("No JSON found")
Cheap defense for models that won’t stop adding “Here is your JSON:”.
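A greedy regex can over-capture when the model adds trailing prose that happens to contain a brace. As an alternative sketch (not the approach the regex uses), the standard library's `json.JSONDecoder.raw_decode` can parse the first valid JSON value starting at a given position and ignore whatever follows. The function name `extract_first_json` is illustrative.

```python
import json

def extract_first_json(text: str):
    """Return the first JSON object or array embedded in text, already parsed."""
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":
            try:
                # raw_decode parses one JSON value starting at index i
                # and returns (value, end_index); trailing text is ignored.
                obj, _end = decoder.raw_decode(text, i)
                return obj
            except json.JSONDecodeError:
                continue  # not valid JSON from here; keep scanning
    raise ValueError("No JSON found")
```

Unlike the regex version, this returns the parsed value rather than a string, so there is no second json.loads call.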
Step 7: Log schema violations and tune
metrics.increment("schema_violation", tags={"field": field_name, "type": "missing"})
If a particular field is missed 5% of the time, that field’s description in the schema needs work — clarify or add example values.
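If you do not have a metrics client wired up yet, an in-process tally is enough to find the problem fields. This sketch uses `collections.Counter`; the helper names and the 5% default threshold are assumptions taken from the example figure in the text.

```python
from collections import Counter

# Keyed by (field_name, kind), e.g. ("tags", "missing").
violations = Counter()

def record_violation(field_name: str, kind: str) -> None:
    violations[(field_name, kind)] += 1

def violation_rates(total_calls: int) -> dict:
    """Map each (field, kind) pair to its share of all calls."""
    return {key: count / total_calls for key, count in violations.items()}

def fields_needing_work(total_calls: int, threshold: float = 0.05) -> list[str]:
    """List fields whose violation rate for any kind reaches the threshold."""
    return sorted(
        {
            field
            for (field, _kind), count in violations.items()
            if count / total_calls >= threshold
        }
    )
```

The same (field, kind) keys map directly onto the tags passed to a real metrics backend later.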
When this is not on you
Some smaller models flatly cannot follow JSON schemas under any prompting. If you must use a small model, generate JSON-like output and validate / repair downstream — accept some loss rate.
Easy to misdiagnose as
“Bad prompting.” More verbose schema descriptions in the prompt help marginally. The real fix is API-level enforcement. Stop tuning prompts when the answer is “switch to structured outputs.”
Prevention
- Default to structured-output APIs (OpenAI parse, Anthropic tools, Gemini response_schema).
- Define schemas as code (Pydantic, Zod) — one source of truth for client and validator.
- Always run schema validation on parsed JSON, even with structured outputs, as a defense-in-depth check.
- Log validation failures and retry with error feedback.
- For models without structured-output support, add an extract_json regex as a defensive layer.
FAQ
- Does structured output cost more? Marginal latency overhead, and the schema or tool definition may add a small number of input tokens. Almost free reliability.
- What about nested schemas — 5 levels deep? Structured outputs handle nesting up to provider’s depth limit (usually 5-10 levels). Deeper than that — flatten.
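To check a schema against a depth limit before sending it, you can measure its nesting with a short recursive walk. This is a sketch under one counting convention (each object or array adds a level); providers may count differently, so treat the number as an estimate. `schema_depth` is a hypothetical helper.

```python
def schema_depth(schema: dict) -> int:
    """Count levels of object/array nesting in a JSON Schema-style dict."""
    if schema.get("type") not in ("object", "array"):
        return 0  # scalars add no nesting
    children = list(schema.get("properties", {}).values())
    if isinstance(schema.get("items"), dict):
        children.append(schema["items"])
    # One level for this container plus the deepest child below it.
    return 1 + max((schema_depth(child) for child in children), default=0)

user_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}
depth = schema_depth(user_schema)
```

Here the object is one level and the tags array inside it is a second, so the example schema sits well within typical limits.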