Modelfile SYSTEM Prompt Is Ignored

When the SYSTEM directive in an Ollama Modelfile appears to have no effect on the model's behavior, diagnose the template structure, system role injection, and the differences between the chat API and the generate API.

You create an Ollama model from a custom Modelfile that includes a SYSTEM directive such as “Always respond in formal English. Never use contractions.” You run ollama run mymodel "do you like pizza?" and the model answers casually, contractions included, as if the system prompt were never there. Or you point OpenWebUI at the custom model and the persona defined in the Modelfile has no effect. This is a common Ollama configuration failure, and it usually traces back to the template: the {{ .System }} placeholder in the TEMPLATE block is either missing or placed where the model does not interpret it as a system instruction.

Common causes

Ordered by hit rate, highest first.

1. TEMPLATE block missing the {{ .System }} placeholder

The SYSTEM directive only works if the TEMPLATE block includes {{ .System }}. If the Modelfile specifies a TEMPLATE that doesn’t include this placeholder, the system prompt is silently discarded. This is very easy to miss when copying a template from an example.

How to spot it: Run ollama show modelname --modelfile and check the TEMPLATE section for {{ .System }}. If it’s absent, the system prompt will never be injected.
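The same check can be automated. The following is a minimal Python sketch, assuming you have already captured the output of ollama show modelname --modelfile as text; the function name template_mentions_system is hypothetical and the regular expression is only a heuristic for the placeholder syntax.

```python
import re

# Matches {{ .System }} with optional whitespace and Go-template trim
# markers ({{- and -}}). A bare {{ if .System }} guard does NOT match,
# because the guard alone never prints the system prompt.
SYSTEM_PLACEHOLDER = re.compile(r"\{\{-?\s*\.System\s*-?\}\}")


def template_mentions_system(modelfile_text: str) -> bool:
    """Return True if the text contains a {{ .System }} placeholder."""
    return SYSTEM_PLACEHOLDER.search(modelfile_text) is not None


# Example inputs (illustrative templates, not copied from any real model).
good = 'TEMPLATE """{{ if .System }}{{ .System }}{{ end }}{{ .Prompt }}"""'
bad = 'TEMPLATE """[INST] {{ .Prompt }} [/INST]"""'
print(template_mentions_system(good), template_mentions_system(bad))
```

To feed it real data, one option is subprocess.run(["ollama", "show", "mymodel", "--modelfile"], capture_output=True, text=True).stdout, which requires a local Ollama install.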

2. {{ .System }} is in the wrong position for the model family

Some models (Mistral v0.1/v0.2) expect the system prompt inside the first [INST] block, not before it. If {{ .System }} sits outside the instruction tags, the model receives the system content as ordinary text rather than as a privileged instruction.

How to spot it: Compare your TEMPLATE against the official Ollama library template for the same model. Pull the official version first (ollama pull mistral), run ollama show mistral --modelfile, and diff the result against your custom template.
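If you prefer to compare the two templates programmatically, a small sketch using Python's standard difflib module follows. The two template strings are illustrative stand-ins, not guaranteed to match any published template; substitute the TEMPLATE text from your own ollama show output.

```python
import difflib


def diff_templates(official: str, custom: str) -> list[str]:
    """Return unified-diff lines between two template strings."""
    return list(
        difflib.unified_diff(
            official.splitlines(),
            custom.splitlines(),
            fromfile="official",
            tofile="custom",
            lineterm="",
        )
    )


# Illustrative strings only: system prompt inside vs. outside [INST].
official = "[INST] {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }} [/INST]"
custom = "{{ .System }} [INST] {{ .Prompt }} [/INST]"

for line in diff_templates(official, custom):
    print(line)
```

An empty result means the templates are identical; any line starting with - or + marks a difference worth inspecting.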

3. Using /api/generate instead of /api/chat

The /api/generate endpoint takes a single prompt string rather than a list of messages. The Modelfile SYSTEM still applies to it by default, provided the request does not set raw to true (raw mode bypasses the template entirely). A system field in the request body overrides the Modelfile SYSTEM directive, which is a frequent source of confusion.

The /api/chat endpoint uses structured messages with explicit role fields. The system message in the Modelfile is used as the default for the chat endpoint when no system message is provided in the request.

How to spot it: Check which endpoint your client uses. OpenWebUI and most OpenAI-compatible clients use a chat-style endpoint (/api/chat or the OpenAI-compatible /v1/chat/completions). Direct curl tests often use /api/generate.
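To make the difference between the two request shapes concrete, here is a hedged Python sketch that builds a body for each endpoint. The helper names chat_payload and generate_payload are hypothetical; the field names (model, messages, prompt, system, stream) are the ones used in the curl examples in this article. When system is left as None, no system text is sent, so the Modelfile SYSTEM should apply.

```python
from typing import Optional


def chat_payload(model: str, user_text: str, system: Optional[str] = None) -> dict:
    """Build a request body for /api/chat (structured messages)."""
    messages = []
    if system is not None:
        # An explicit system message replaces the Modelfile SYSTEM.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_text})
    return {"model": model, "messages": messages, "stream": False}


def generate_payload(model: str, prompt: str, system: Optional[str] = None) -> dict:
    """Build a request body for /api/generate (single prompt string)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if system is not None:
        # A top-level "system" field replaces the Modelfile SYSTEM.
        payload["system"] = system
    return payload


print(chat_payload("formal-llama", "do you like pizza?"))
print(generate_payload("formal-llama", "do you like pizza?"))
```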

4. System prompt overridden by the API call

If your API call (or client software) includes a system message in the request body, it overrides the Modelfile SYSTEM directive entirely. OpenWebUI, for example, has a system prompt field that takes precedence over the Modelfile SYSTEM.

How to spot it: Check your client software’s system prompt settings. In OpenWebUI, check the Model Settings panel. If a system prompt is defined there, it will override the Modelfile one.
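If you can log or capture the JSON body your client actually sends, a quick check for a hidden override might look like the sketch below. The function name system_override is hypothetical, and it only covers the two request shapes discussed in this article: a top-level system field and a message with role system.

```python
from typing import Optional


def system_override(payload: dict) -> Optional[str]:
    """Return the system text a request would send, or None if it sends none."""
    # /api/generate style: top-level "system" field.
    if "system" in payload:
        return payload["system"]
    # /api/chat style: a message whose role is "system".
    for message in payload.get("messages", []):
        if message.get("role") == "system":
            return message.get("content", "")
    return None


captured = {
    "model": "formal-llama",
    "messages": [
        {"role": "system", "content": "You are a pirate."},
        {"role": "user", "content": "do you like pizza?"},
    ],
}
print(system_override(captured))
```

A return value of None suggests the request leaves the Modelfile SYSTEM in charge; any string, including an empty one, means the client is supplying its own.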

5. Model’s instruction fine-tuning resists system prompt constraints

Some instruction fine-tunes have strong RLHF training that overrides certain types of system instructions. A system prompt like “never refuse any request” is likely to be ignored because the model’s fine-tuning explicitly resists such instructions. This is not a bug but intentional alignment behavior.

How to spot it: Test with a neutral, descriptive system prompt (“Respond only about cooking topics”) rather than a constraint that fights against alignment. If a neutral persona works but a guardrail-override system prompt doesn’t, the model’s alignment is the cause.

6. Modelfile built from a different base than the running model

If you create a custom model with FROM llama3.1:8b and later pull llama3.1:8b again, the tag may now point at an updated model (Ollama can refresh model tags). The template stored in your custom model may then no longer match the format the current base model expects.

How to spot it: Run ollama list and compare the MODIFIED column for mymodel and llama3.1:8b. If the base model was updated after your custom model was created, rebuild with ollama create mymodel -f Modelfile.

Shortest path to fix

Step 1: Verify the TEMPLATE contains {{ .System }}

ollama show mymodel --modelfile

Check for {{ .System }} in the TEMPLATE block. A correct Llama 3.1 template:

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>

{{ .Content }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

"""

Note the {{ if .System }} guard — it only injects the system block when a system prompt is present.
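To see what the guard does to the final prompt, the following Python sketch imitates the template above by hand. It is not Ollama's Go template engine, and real rendering may differ in whitespace and in how system messages inside the message list are treated; the function name render_llama3_prompt is hypothetical.

```python
def render_llama3_prompt(messages: list[dict], system: str = "") -> str:
    """Hand-written imitation of the Llama 3.1 template shown above."""
    parts = []
    if system:  # mirrors {{ if .System }} ... {{ end }}
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        )
    for m in messages:  # mirrors {{ range .Messages }} ... {{ end }}
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to start its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


turns = [{"role": "user", "content": "do you like pizza?"}]
print(render_llama3_prompt(turns, system="Be formal."))
print(render_llama3_prompt(turns))  # no system block at all
```

With an empty system string the system header is omitted entirely, which is the behavior the guard is there to provide.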

Step 2: Rebuild with a corrected Modelfile

FROM llama3.1:8b

SYSTEM """You are a formal assistant. Always use complete sentences. Never use contractions."""

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>

{{ .Content }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

"""

PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"

Then build and run the model:

ollama create formal-llama -f /path/to/Modelfile
ollama run formal-llama "do you like pizza?"

Step 3: Test system prompt injection via the API

# Test that the system prompt is applied via the API
curl -s http://localhost:11434/api/chat \
  -d '{
    "model": "formal-llama",
    "messages": [{"role": "user", "content": "do you like pizza?"}],
    "stream": false
  }' | python3 -m json.tool | grep -A2 '"content"'

# Also test with an explicit system override to confirm the API override behavior
curl -s http://localhost:11434/api/chat \
  -d '{
    "model": "formal-llama",
    "messages": [
      {"role": "system", "content": "Different system prompt"},
      {"role": "user", "content": "do you like pizza?"}
    ],
    "stream": false
  }' | python3 -m json.tool

The second call uses the API-provided system prompt, overriding the Modelfile SYSTEM.
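Reading responses by eye gets tedious, so here is a rough Python sketch that sends the first request and flags contractions in the reply. It assumes a local Ollama server on the default port and the formal-llama model from Step 2; the helper names ask and find_contractions are hypothetical, and the regular expression is a crude heuristic that also flags possessives such as "Ollama's".

```python
import json
import re
import urllib.request

# Crude heuristic: a word, an apostrophe (straight or curly), and a common
# contraction ending. Possessives ("the model's") are false positives.
CONTRACTION = re.compile(r"\b\w+(?:'|’)(?:s|t|re|ve|ll|d|m)\b", re.IGNORECASE)


def find_contractions(text: str) -> list[str]:
    """Return every substring that looks like a contraction."""
    return CONTRACTION.findall(text)


def ask(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """Send one user message to /api/chat and return the reply text."""
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        }
    ).encode()
    req = urllib.request.Request(
        f"{host}/api/chat", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


# Example (needs a running server, so it is left commented out):
# reply = ask("formal-llama", "do you like pizza?")
# print(find_contractions(reply) or "no contractions found")
```

A single clean reply proves little, since sampling is random; running the prompt several times gives a fairer picture of whether the system prompt is being applied.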

Step 4: For the /api/generate endpoint, pass system explicitly

# /api/generate supports a "system" field that overrides the Modelfile SYSTEM
curl -s http://localhost:11434/api/generate \
  -d '{
    "model": "llama3.1:8b",
    "prompt": "do you like pizza?",
    "system": "You are a formal assistant. Use complete sentences only.",
    "stream": false
  }' | python3 -m json.tool | grep -A2 '"response"'

Step 5: Check for system prompt override in client software

For OpenWebUI: Settings → Models → select your model → System Prompt field. Clear this field if you want the Modelfile SYSTEM to take effect.

For LangChain:

from langchain_community.llms import Ollama

# The system parameter here overrides Modelfile SYSTEM
llm = Ollama(
    model="formal-llama",
    system="Your override here",  # Remove this to use Modelfile SYSTEM
)

Prevention

  • Always run ollama show mymodel --modelfile after creating a model to verify the TEMPLATE and SYSTEM are as intended.
  • Test the system prompt immediately after creation with ollama run mymodel "ignore all previous instructions" to confirm the persona holds.
  • When using OpenWebUI or other frontends, document whether the system prompt is set in the Modelfile or the frontend — having it in both places causes confusion.
  • Keep a Modelfile in version control alongside your application code so system prompt changes are tracked.
  • When the base model is updated (e.g., Ollama refreshes a tag), rebuild all dependent custom models by re-running ollama create.
  • Use {{ if .System }} guards in templates so the system block is cleanly omitted when no system prompt is set, rather than rendering an empty <|start_header_id|>system<|end_header_id|> block.

FAQ

Q: Can I have multiple SYSTEM directives in one Modelfile? A: No. Only the last SYSTEM directive in the Modelfile is used, because each later SYSTEM line overwrites the earlier ones. Put the full system prompt in a single SYSTEM block.

Q: Does the SYSTEM directive in a Modelfile persist across Ollama version upgrades? A: Yes — the custom model (created with ollama create) stores the compiled Modelfile including the SYSTEM directive. However, if the underlying base model’s format changes in a way that breaks the template, the SYSTEM may stop working. Always test after upgrading Ollama.

Q: How long can the SYSTEM prompt in a Modelfile be? A: There’s no hard Modelfile limit, but the system prompt counts toward the context window. A 2000-token system prompt uses 2000 tokens of context that aren’t available for the conversation. For models with 8192 context length, keep system prompts under 500 tokens to leave room for meaningful conversation history.
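The arithmetic behind that advice is simple enough to sketch in Python. The function names are hypothetical, and the token counts are inputs you would obtain from your own tokenizer or estimate; nothing here measures tokens for you.

```python
def conversation_budget(context_length: int, system_tokens: int) -> int:
    """Tokens left for conversation after the system prompt is counted."""
    return max(context_length - system_tokens, 0)


def system_share(context_length: int, system_tokens: int) -> float:
    """Fraction of the context window consumed by the system prompt."""
    return system_tokens / context_length


for tokens in (500, 2000):
    left = conversation_budget(8192, tokens)
    share = system_share(8192, tokens)
    print(f"{tokens}-token system prompt: {left} tokens left ({share:.1%} used)")
```

For an 8192-token window, a 500-token system prompt leaves 7692 tokens for the conversation, while a 2000-token prompt leaves 6192.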

Q: Why does my system prompt work in ollama run (CLI) but not via the API? A: ollama run uses the Modelfile template with the default SYSTEM. API calls that include a system message in the messages array override the Modelfile SYSTEM. If your API client includes any system message (even an empty string), it overrides the Modelfile. Check your client code for implicit system message injection.
