Ollama Modelfile SYSTEM Prompt Is Ignored

Your Ollama Modelfile SYSTEM directive has no effect on model behavior. Fix it fast: verify the template injects .System, check for RENDERER/PARSER inheritance, and stop your client from overriding the system message.

Published: May 25, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

You build an Ollama model with a custom Modelfile whose SYSTEM directive says “Always respond in formal English. Never use contractions.” You run ollama run mymodel "do you like pizza?" and the model answers casually, with contractions, as if the system prompt were never set. Or you point Open WebUI at the model and the persona has no effect.

Fastest fix: run ollama show mymodel --modelfile and confirm the TEMPLATE block actually contains {{ .System }}. If the placeholder is missing, the SYSTEM text is parsed but never rendered into the prompt. If the placeholder is present and the prompt is still ignored, the cause is almost always one of two things: your client (Open WebUI, LangChain, the OpenAI SDK) is sending its own system message that overrides the Modelfile, or — on models pulled since early 2026 — a built-in RENDERER/PARSER is overriding your custom TEMPLATE. The rest of this page walks each cause from most to least common.

Tested against Ollama v0.30.x, the current line as of June 2026. Run ollama --version to check yours; the RENDERER/PARSER behavior below only applies to v0.17 and newer.

Which bucket are you in?

Symptom	Most likely cause	Jump to
`ollama show --modelfile` has no `{{ .System }}` in TEMPLATE	Template missing the placeholder	Cause 1
Custom `TEMPLATE` is in your Modelfile but `ollama show` displays a different one	`RENDERER`/`PARSER` inherited from base model	Cause 2
Works in `ollama run`, fails through your app/API	Client injects its own system message	Cause 4
Works for a neutral persona, fails for a guardrail (“never refuse”)	Model alignment overrides it	Cause 6
Worked before, broke after an `ollama pull`	Base model tag updated underneath you	Cause 7

Common causes

Ordered by hit rate, highest first.

1. TEMPLATE block missing the `{{ .System }}` placeholder

The SYSTEM directive only takes effect if the TEMPLATE renders {{ .System }}. If your Modelfile defines a custom TEMPLATE that omits this variable, the system text is stored but silently dropped at prompt-build time. This is easy to introduce by copying a template fragment from a blog post that only shows the user turn.

How to spot it: run ollama show mymodel --modelfile and read the TEMPLATE section. The official Ollama Modelfile reference documents three template variables for the legacy single-turn style — {{ .System }}, {{ .Prompt }}, and {{ .Response }} — plus the chat-style {{ range .Messages }} form that uses {{ .Role }} and {{ .Content }}. If neither {{ .System }} nor a .Messages range that carries the system role appears, the system prompt is never injected.
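The check above can be scripted. This is a minimal sketch, not part of Ollama: the helper names extract_template and template_injects_system are hypothetical, and the code assumes you pass it the text printed by ollama show mymodel --modelfile. It only looks for the placeholders by name, so a .Messages range that filters out the system role would still pass.

```python
import re
from typing import Optional


def extract_template(modelfile_text: str) -> Optional[str]:
    """Return the body of the first TEMPLATE directive, or None if there is none."""
    # Triple-quoted form, which may span several lines.
    match = re.search(
        r'^TEMPLATE[ \t]+"""(.*?)"""', modelfile_text, re.DOTALL | re.MULTILINE
    )
    if match:
        return match.group(1)
    # Single-line form.
    match = re.search(r"^TEMPLATE[ \t]+(.+)$", modelfile_text, re.MULTILINE)
    return match.group(1).strip().strip('"') if match else None


def template_injects_system(template: str) -> bool:
    """Heuristic: True if the template mentions .System or ranges over .Messages."""
    has_system = re.search(r"\{\{[^}]*\.System\b[^}]*\}\}", template) is not None
    has_messages = (
        re.search(r"\{\{[^}]*range[^}]*\.Messages\b[^}]*\}\}", template) is not None
    )
    return has_system or has_messages
```

One way to feed it is subprocess.run(["ollama", "show", "mymodel", "--modelfile"], capture_output=True, text=True).stdout. A None result from extract_template means the model defines no TEMPLATE of its own.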

2. A built-in `RENDERER`/`PARSER` overrides your custom `TEMPLATE`

This one is new and catches people who upgraded recently. Starting around Ollama v0.17, many official models (the qwen3.x, gpt-oss, and similar families) ship a compiled RENDERER and PARSER in their config instead of a plain Go TEMPLATE. When you write FROM qwen3.5:4b and add your own TEMPLATE, the derived model can inherit the parent’s RENDERER/PARSER and ignore your TEMPLATE entirely (tracked in Ollama issue #14560). Your SYSTEM directive is then formatted by the inherited renderer, not your template, so any structural change you made has no effect.

How to spot it: after ollama create, run ollama show mymodel --modelfile. If the displayed TEMPLATE is not the one you wrote — or you see RENDERER/PARSER lines you didn’t add — the renderer is winning.

How to fix it: point FROM at the underlying weights blob rather than the registry tag, which drops the inherited renderer and lets your TEMPLATE apply. Find the blob path in the ollama show --modelfile output of the base model (the FROM /usr/share/ollama/.ollama/models/blobs/sha256-... line), then:

FROM /usr/share/ollama/.ollama/models/blobs/sha256-<hash>
SYSTEM """You are a formal assistant. Never use contractions."""
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

Setting RENDERER "" and PARSER "" to empty values does not reliably clear the inheritance as of June 2026, so use the blob FROM instead.
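The spot check for this cause can also be automated. The sketch below assumes, as described above, that inherited renderers show up as RENDERER or PARSER lines in the ollama show --modelfile output. The function name renderer_directives is hypothetical, and the scan is line-based, so it does not try to understand the full Modelfile grammar.

```python
def renderer_directives(show_output: str) -> list[str]:
    """Return the RENDERER/PARSER lines found in `ollama show --modelfile` output."""
    found = []
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines in the generated Modelfile
        parts = stripped.split(None, 1)
        if parts and parts[0].upper() in ("RENDERER", "PARSER"):
            found.append(stripped)
    return found
```

An empty list suggests no renderer was inherited. A non-empty list on a model whose Modelfile you wrote without those directives points at Cause 2.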

3. `{{ .System }}` is in the wrong position for the model family

Some chat formats embed the system content inside the first instruction block rather than before it (older Mistral builds put it inside the opening [INST]). If {{ .System }} sits outside the instruction tags for a model that expects it inside, the model reads the system text as ordinary conversation, not a privileged instruction.

How to spot it: pull the official version and diff. Run ollama pull mistral then ollama show mistral --modelfile, and compare its TEMPLATE structure against yours. Match the official placement exactly.
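If the two templates are long, a line diff makes the placement difference easier to see. A small sketch using Python's standard difflib module follows; template_diff is a hypothetical helper, and both arguments are assumed to be TEMPLATE bodies you copied out of the ollama show --modelfile output.

```python
import difflib


def template_diff(official: str, mine: str) -> list[str]:
    """Unified diff between the official TEMPLATE body and your own."""
    return list(
        difflib.unified_diff(
            official.splitlines(),
            mine.splitlines(),
            fromfile="official",
            tofile="mine",
            lineterm="",
        )
    )
```

Printing "\n".join(template_diff(official, mine)) shows lines that exist only in the official template with a leading minus and lines that exist only in yours with a leading plus. An empty result means the templates are identical line for line.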

4. The system prompt is overridden by your API call or client

ollama run on the CLI uses the Modelfile SYSTEM as the default. But if your request includes its own system message, that takes precedence. On /api/generate, the system field overrides the Modelfile SYSTEM by design. On /api/chat (and the OpenAI-compatible /v1/chat/completions that Open WebUI uses), a {"role": "system", ...} message in the messages array is meant to replace the Modelfile SYSTEM. Note that overriding the Modelfile system message through /api/chat has historically been unreliable for some models (Ollama issue #8729) — /api/generate with an explicit system field is the dependable path when you need a per-request override.

How to spot it: check which endpoint and client you use. Open WebUI has a per-model System Prompt field, and LangChain or LlamaIndex pipelines often prepend a system message through a prompt template or a chat-engine default unless you remove it. Any non-empty system message from the client overrides your Modelfile.
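When you can log the JSON body your client sends, a few lines of Python tell you whether it carries an override. This is an illustrative sketch: client_system_override is a hypothetical name, and it only covers the two request shapes described above (a top-level system field for /api/generate, a system-role entry in messages for /api/chat and /v1/chat/completions).

```python
from typing import Any, Optional


def client_system_override(payload: dict[str, Any]) -> Optional[str]:
    """Return the system text a request body would send, or None if it sends none."""
    # /api/generate style: top-level "system" field.
    system = payload.get("system")
    if isinstance(system, str) and system.strip():
        return system
    # /api/chat and /v1/chat/completions style: a message with role "system".
    for message in payload.get("messages", []):
        content = message.get("content", "")
        if message.get("role") == "system" and str(content).strip():
            return content
    return None
```

A None result means the request leaves the system prompt to the Modelfile. Anything else is the text that is competing with it.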

5. The Modelfile uses tokens the model’s tokenizer doesn’t recognize

If your SYSTEM text or TEMPLATE includes special markers like <|system|> that the current model’s tokenizer doesn’t map to a control token, they tokenize as ordinary words. The system content then reads as plain prose, weakening instruction-following.

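How to spot it: start the server with debug logging (OLLAMA_DEBUG=1 ollama serve), run ollama run mymodel "hello" from another terminal, and inspect the rendered prompt in the server log. The variable has to be set on the server process, because the server, not the ollama run client, builds and logs the prompt. Confirm the system block is wrapped in the model’s real control tokens, not stray text tags.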
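A quick pre-check before reading logs is to list the <|...|> markers your template uses and compare them with the control tokens you know the model defines. The sketch below is only a string comparison: unknown_markers is a hypothetical helper, and the known_tokens set is an assumption you must supply yourself, for example from the official template of the same model family. It does not consult the tokenizer.

```python
import re


def unknown_markers(template: str, known_tokens: set[str]) -> set[str]:
    """Return <|...|> markers used in the template that are not in known_tokens."""
    markers = set(re.findall(r"<\|[^|>\s]+\|>", template))
    return markers - known_tokens
```

For a Llama 3.1 style template you might pass {"<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"}; a marker such as <|system|> would then be reported as unknown and is worth checking against the rendered prompt.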

6. The model’s fine-tuning resists the instruction

Some instruct models have strong post-training that overrides certain system instructions. A directive like “never refuse any request” is ignored because alignment training explicitly counters it. That is intended behavior, not a Modelfile bug.

How to spot it: test a neutral, descriptive persona (“Respond only about cooking topics”) instead of a guardrail-override. If the neutral persona holds but the override is ignored, the cause is the model’s alignment, not your config. Try a less aligned base (a dolphin-* or other uncensored variant) if you genuinely need that behavior.

7. The base model was updated after you built your model

If you created a model with FROM llama3.1:8b, then later re-pulled llama3.1:8b and Ollama refreshed the tag, the template your custom model compiled against may no longer match the base model’s expected format.

How to spot it: compare the MODIFIED column in ollama list for mymodel and llama3.1:8b. If the base was updated after your build, re-run ollama create mymodel -f Modelfile.

Shortest path to fix

Step 1: Verify the template renders the system message

ollama show mymodel --modelfile

Confirm the displayed TEMPLATE is yours and contains {{ .System }} (or a {{ range .Messages }} block that emits the system role). A correct Llama 3.1 chat-style template:

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>

{{ .Content }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

"""

The {{ if .System }} guard injects the system block only when a system prompt is set, so you don’t render an empty <|start_header_id|>system<|end_header_id|> block.

If the displayed TEMPLATE is not the one you wrote, you are in Cause 2 — go rebuild with a blob FROM.

Step 2: Rebuild with a corrected Modelfile

FROM llama3.1:8b

SYSTEM """You are a formal assistant. Always use complete sentences. Never use contractions."""

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>

{{ .Content }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

"""

PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"

ollama create formal-llama -f /path/to/Modelfile
ollama run formal-llama "do you like pizza?"

Step 3: Inspect the prompt Ollama actually builds

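Before blaming the API, look at the prompt the server actually renders. The debug log is written by the Ollama server process, not by the ollama run client, so enable OLLAMA_DEBUG on the server (stop any running instance first):

OLLAMA_DEBUG=1 ollama serve 2>&1 | tee /tmp/ollama-debug.log

Then, in a second terminal:

ollama run formal-llama "do you like pizza?"

If Ollama runs as a background service, set the variable in the service environment and read the service log instead. Search the server log for your system text. If it’s present and wrapped in the model’s control tokens, your Modelfile is correct and any remaining failure is alignment (Cause 6) or a client override (Cause 4). If it’s absent, you are still in Cause 1 or 2.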

Step 4: Confirm the system prompt is applied via the API

# No system message in the request — the Modelfile SYSTEM should drive the answer
curl -s http://localhost:11434/api/chat \
  -d '{
    "model": "formal-llama",
    "messages": [{"role": "user", "content": "do you like pizza?"}],
    "stream": false
  }' | python3 -m json.tool | grep -A2 '"content"'

For a reliable per-request override, use /api/generate with an explicit system field:

curl -s http://localhost:11434/api/generate \
  -d '{
    "model": "llama3.1:8b",
    "prompt": "do you like pizza?",
    "system": "You are a formal assistant. Use complete sentences only.",
    "stream": false
  }' | python3 -m json.tool | grep -A2 '"response"'
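The same two requests can be sent from Python with only the standard library, which is handy in a test suite. This is a sketch under the assumptions of the curl examples above: the server listens on localhost:11434, and the helper names (chat_request, generate_request, send) are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same address as the curl examples


def _post(path: str, body: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given API path."""
    return urllib.request.Request(
        OLLAMA_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """/api/chat request with no system message, so the Modelfile SYSTEM applies."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return _post("/api/chat", body)


def generate_request(model: str, prompt: str, system: str) -> urllib.request.Request:
    """/api/generate request with an explicit per-request system override."""
    body = {"model": model, "prompt": prompt, "system": system, "stream": False}
    return _post("/api/generate", body)


def send(request: urllib.request.Request) -> dict:
    """Send the request to a running Ollama server and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)
```

With a server running, send(chat_request("formal-llama", "do you like pizza?"))["message"]["content"] holds the chat answer, and the reply to a generate_request carries the text under "response", matching the fields the curl commands grep for.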

Step 5: Clear the system prompt your client injects

For Open WebUI: Settings → Models → (your model) → System Prompt. Clear this field so the Modelfile SYSTEM takes effect.

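For LangChain, make sure the request carries no system message:

from langchain_ollama import ChatOllama

# A SystemMessage (or a prompt template with a system role) in the input overrides
# the Modelfile SYSTEM. Send only the user turn to let the Modelfile apply.
llm = ChatOllama(model="formal-llama")
reply = llm.invoke("do you like pizza?")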

How to confirm it’s fixed

Two quick checks:

Run ollama run formal-llama and type /show system. It should print your exact Modelfile system text. (You can also override it live for a session with /set system "...".)
Send a prompt that should trigger the persona — ollama run formal-llama "hey what's up?" — and confirm the model holds the constraint (no contractions). If it holds in the CLI but breaks in your app, the remaining problem is a client-side system message (Step 5).
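For the "no contractions" persona used in this guide, the second check can be made mechanical. The sketch below is a rough regular-expression heuristic, not a linguistic parser: find_contractions is a hypothetical helper, and it also flags possessives such as "pizza's", so treat a hit as a prompt to read the reply rather than as proof.

```python
import re

# Matches common English contractions: n't, 's, 're, 've, 'll, 'd, 'm.
CONTRACTION = re.compile(r"\b[A-Za-z]+(?:n't|'(?:s|re|ve|ll|d|m))\b", re.IGNORECASE)


def find_contractions(text: str) -> list[str]:
    """Return the contractions found in a model reply (empty list if none)."""
    # Models often emit a typographic apostrophe; normalise it first.
    normalised = text.replace("\u2019", "'")
    return CONTRACTION.findall(normalised)
```

Run it over a handful of replies from both the CLI and your app. If the CLI replies come back clean and the app replies do not, the remaining problem is on the client side.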

Prevention

Always run ollama show mymodel --modelfile right after ollama create and verify the TEMPLATE shown is the one you wrote — this catches both the missing-placeholder and the RENDERER/PARSER-inheritance traps.
Keep your Modelfile in version control next to your application code so SYSTEM and TEMPLATE changes are tracked.
Decide one place to own the system prompt — the Modelfile or the frontend, not both. Document which.
After any ollama pull that refreshes a base tag, rebuild dependent custom models with ollama create.
Use {{ if .System }} guards so the system block is omitted cleanly when no system prompt is set.

FAQ

Q: My TEMPLATE has {{ .System }} but ollama show displays a different template. Why? A: You hit the RENDERER/PARSER inheritance behavior (Cause 2). On models pulled since early 2026, the base model’s compiled renderer can override the TEMPLATE you add when you FROM the registry tag. Rebuild with FROM pointed at the weights blob path instead, then re-check ollama show --modelfile.

Q: Why does the system prompt work in ollama run but not through my app? A: ollama run uses the Modelfile SYSTEM as the default, but many clients send their own system message that overrides it. Open WebUI has a per-model System Prompt field, and LangChain or LlamaIndex pipelines often add one through a prompt template or a chat-engine default. Clear it, or align it with the Modelfile.

Q: Can I have multiple SYSTEM directives in one Modelfile? A: No. Only the last SYSTEM directive is used; later ones overwrite earlier ones. Put the full system prompt in a single SYSTEM block.

Q: How long can the SYSTEM prompt be? A: There is no hard Modelfile limit, but the system prompt consumes context. A 2000-token system prompt costs 2000 tokens that aren’t available for conversation. For an 8K-context model, keep system prompts under ~500 tokens and put the most important constraint in the first sentence — many models weight early instructions more heavily.

Q: A guardrail like “never refuse any request” is still ignored. Bug? A: No (Cause 6). Strong alignment training can override constraints that fight it, regardless of your Modelfile. Neutral personas work; alignment-override prompts often don’t. If you genuinely need that behavior, pick a less aligned base model.

Q: Does the Modelfile SYSTEM survive an Ollama upgrade? A: The compiled custom model stores your SYSTEM and TEMPLATE, so yes. But if a model’s expected chat format changes (or a base tag is refreshed), the template can stop matching. Re-test after upgrading Ollama, and rebuild with ollama create if needed.
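The context-budget arithmetic from the system prompt length question can be written out explicitly. This sketch assumes "8K" means a window of 8192 tokens, and the function names are hypothetical. Token counts depend on the model's tokenizer, so the inputs are numbers you measure or estimate yourself.

```python
def remaining_context(context_window: int, system_tokens: int) -> int:
    """Tokens left for the conversation after the system prompt is counted."""
    if system_tokens > context_window:
        raise ValueError("system prompt does not fit in the context window")
    return context_window - system_tokens


def system_share(context_window: int, system_tokens: int) -> float:
    """Fraction of the context window consumed by the system prompt."""
    return system_tokens / context_window
```

Under that assumption, remaining_context(8192, 2000) leaves 6192 tokens for the conversation, roughly a quarter of the window spent before the first user turn, while a 500-token system prompt uses about 6 percent.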

Tags: #local-llm #ollama #Troubleshooting
