You kick off a LangGraph pipeline: Agent A researches a codebase and writes a detailed analysis, then hands off to Agent B to implement the changes. Agent B starts fresh, asks clarifying questions Agent A already answered, and produces output that contradicts decisions made three steps back. Or in an AutoGen GroupChat, the coding assistant ignores the planner’s chosen architecture because the message thread was summarized and key constraints got stripped. The handoff boundary is a lossy compression point — if you haven’t explicitly serialized state, downstream agents are flying blind.
Common causes
1. Context passed as a trimmed summary instead of structured state
The most common culprit. The orchestrator condenses Agent A’s output to fit the next model’s context window, and the summarization loses specifics: file paths, chosen libraries, rejected alternatives, error messages. Agent B receives “analyze authentication issues” instead of “line 47 of src/auth/jwt.ts uses HS256 with a hardcoded salt — switch to RS256 with env-loaded keys.”
How to spot it: Compare the raw output of Agent A against what Agent B actually received in its first message. If the handoff message is a prose paragraph where Agent A’s output was structured JSON or a code block, compression happened.
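One way to make that comparison mechanical is to pull the concrete specifics out of Agent A's raw output and check which of them survive in the handoff message. The sketch below is a rough heuristic, not part of any framework: the `SPECIFIC` pattern and the `lost_specifics` helper are hypothetical names, and the regex only catches file-like paths and identifiers such as `HS256`.

```python
import re

# Hypothetical heuristic: file-like paths (src/auth/jwt.ts) and
# upper-case identifiers followed by digits (HS256, RS256).
SPECIFIC = re.compile(r"[\w./-]+\.\w+|\b[A-Z]{2,}\d+\b")


def lost_specifics(raw_output: str, handoff_message: str) -> list[str]:
    """Return specifics found in Agent A's output but absent from the handoff."""
    specifics = sorted(set(SPECIFIC.findall(raw_output)))
    return [s for s in specifics if s not in handoff_message]


raw = ("line 47 of src/auth/jwt.ts uses HS256 with a hardcoded salt; "
       "switch to RS256 with env-loaded keys.")
summary = "analyze authentication issues"
print(lost_specifics(raw, summary))
```

A non-empty result means the summary dropped details that Agent B will have to rediscover or guess.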
2. Stateless tool design — no shared memory store
In frameworks like CrewAI or AutoGen, agents default to passing data through chat messages. Long tool outputs (file reads, test logs, API responses) exceed what fits cleanly in a message, get truncated, and critical lines fall off the end. There is no external store being written to.
How to spot it: Search your agent framework’s message list for truncation markers like ... [truncated], [output clipped], or sudden silent cutoffs. Count the characters in each handoff message and compare against the model’s context limit.
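A minimal sketch of that scan, assuming you can dump the message list as plain strings. The marker strings come from the examples above; `max_chars` is an assumed cutoff you would set from your own framework's message size limit, and `suspicious_messages` is a hypothetical helper name.

```python
TRUNCATION_MARKERS = ("... [truncated]", "[output clipped]")


def suspicious_messages(messages: list[str], max_chars: int) -> list[tuple[int, str]]:
    """Flag messages that contain a truncation marker or sit at the size limit."""
    findings = []
    for index, text in enumerate(messages):
        if any(marker in text for marker in TRUNCATION_MARKERS):
            findings.append((index, "marker"))
        elif len(text) >= max_chars:
            # A message exactly at the limit was probably cut silently.
            findings.append((index, "at size limit"))
    return findings


messages = ["ok", "test log ... [truncated]", "x" * 100]
print(suspicious_messages(messages, max_chars=100))
```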
3. Prompt template doesn’t include a “prior decisions” slot
The next agent’s system prompt has no placeholder for accumulated context. The orchestrator calls it with system_prompt.format(task=task) and forgets to inject prior_decisions, constraints, or artifacts. The agent starts from scratch by design.
How to spot it: Open the prompt template for every agent in the pipeline. If none of them reference a context, prior_decisions, or memory placeholder variable, context injection is missing entirely.
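If your templates use Python `str.format` placeholders, the check can be automated with the standard library's `string.Formatter`. This is a sketch under that assumption; the slot names in `CONTEXT_SLOTS` are illustrative and should match whatever your pipeline actually injects.

```python
from string import Formatter

# Illustrative slot names; adjust to your own templates.
CONTEXT_SLOTS = {"context", "prior_decisions", "memory", "handoff_context"}


def placeholders(template: str) -> set[str]:
    """Return the set of {name} placeholders used in a format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def has_context_slot(template: str) -> bool:
    """True if the template references at least one context placeholder."""
    return bool(placeholders(template) & CONTEXT_SLOTS)


print(has_context_slot("Your task:\n{task}"))
print(has_context_slot("Prior context:\n{handoff_context}\nYour task:\n{task}"))
```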
4. Agent framework resets conversation history on each invocation
Some orchestration frameworks — especially when using stateless Lambda or Cloud Run execution — create a fresh agent instance per invocation. Each agent call has zero conversation history. Any context must be explicitly passed in the input payload; nothing is implicit.
How to spot it: Add a print(len(agent.memory.chat_memory.messages)) (for LangChain’s classic memory classes) or the equivalent for your framework before each agent call. If it always prints 0 or 1, history isn’t persisting.
5. Race condition in async pipelines — out-of-order context arrival
In Temporal workflows or Inngest async steps, Agent B may start executing before Agent A’s final artifact write has completed. It reads a partial or empty context store and proceeds with stale/empty context.
How to spot it: Check workflow step dependencies. If Agent B’s step lists Agent A’s step as optional or doesn’t await it explicitly, the dependency isn’t enforced.
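The failure mode is easy to reproduce outside any workflow engine. The toy `asyncio` sketch below is not Temporal or Inngest code; `agent_a`, `agent_b`, and the dict used as a context store are stand-ins. It shows Agent B reading an empty store when A is merely scheduled, and reading the real value when A is awaited first.

```python
import asyncio


async def agent_a(store: dict) -> None:
    await asyncio.sleep(0.05)          # simulate slow analysis
    store["analysis"] = "use RS256"    # final artifact write


async def agent_b(store: dict) -> str:
    return store.get("analysis", "<empty>")


async def racy() -> str:
    store: dict = {}
    task_a = asyncio.create_task(agent_a(store))  # scheduled, not awaited
    seen = await agent_b(store)                   # B reads before A has written
    await task_a
    return seen


async def ordered() -> str:
    store: dict = {}
    await agent_a(store)                          # A's write completes first
    return await agent_b(store)


print(asyncio.run(racy()))
print(asyncio.run(ordered()))
```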
6. Serialization format mismatch between agents
Agent A writes context as a Python dataclass or Pydantic model. The orchestrator serializes it to JSON, and every value that isn’t natively JSON-serializable degrades on the way: datetimes become strings, enums collapse to their raw values or string names, and nested models flatten to plain dicts. Agent B deserializes a degraded version.
How to spot it: Diff the object Agent A writes against the object Agent B reads. Any field that changed type or disappeared is a serialization casualty.
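A small round-trip check can surface these casualties before Agent B ever runs. The sketch below uses a made-up `Context` dataclass and `Priority` enum purely for illustration, and `type_changes` is a hypothetical helper that compares top-level fields only.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime
from enum import Enum


class Priority(Enum):
    HIGH = 1


@dataclass
class Context:
    task_id: str
    created: datetime
    priority: Priority


def type_changes(written: dict, read: dict) -> dict[str, str]:
    """Report top-level fields that disappeared or changed type in transit."""
    changes = {}
    for key, value in written.items():
        if key not in read:
            changes[key] = "missing"
        elif type(read[key]) is not type(value):
            changes[key] = f"{type(value).__name__} -> {type(read[key]).__name__}"
    return changes


written = asdict(Context("t1", datetime(2024, 1, 1), Priority.HIGH))
read = json.loads(json.dumps(written, default=str))  # what Agent B would see
print(type_changes(written, read))
```

Here both `created` and `priority` arrive as plain strings, so any logic in Agent B that expects a `datetime` or an enum member needs an explicit parsing step.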
Shortest path to fix
Step 1: Add a structured context object to every handoff
Replace freeform message passing with a typed handoff envelope:
import json
from dataclasses import dataclass, asdict

@dataclass
class HandoffContext:
    task_id: str
    goal: str
    decisions: list[dict]      # [{"decision": "...", "rationale": "..."}]
    artifacts: dict[str, str]  # {"filename": "content or path"}
    constraints: list[str]
    prior_errors: list[str]

# Serialize to JSON for the next agent; ctx is a populated HandoffContext.
# default=str stringifies values json cannot encode (e.g. datetimes).
payload = json.dumps(asdict(ctx), default=str)
Pass this as the first user message or inject it into the system prompt via a {handoff_context} slot.
Step 2: Write large artifacts to a shared store, pass references
Never inline file contents into a message. Write them to a shared store and pass the key:
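import uuid
import redis

r = redis.Redis()

def store_artifact(content: str) -> str:
    """Write content to Redis and return the key to pass downstream."""
    key = f"artifact:{uuid.uuid4()}"
    r.set(key, content, ex=3600)  # 1-hour TTL
    return key

# Agent A writes (ctx is the HandoffContext from Step 1):
key = store_artifact(analysis_text)
ctx.artifacts["analysis"] = key

# Agent B reads; get() returns None if the key expired or was never written:
raw = r.get(ctx.artifacts["analysis"])
if raw is None:
    raise KeyError("analysis artifact missing or expired")
analysis = raw.decode()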
Redis, S3, or even a temp file on a shared volume all work. The key principle: messages carry references, not payloads.
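For local runs without Redis, the same put/get contract can be sketched on top of a directory. `FileArtifactStore` is a hypothetical class, not part of any library, and this version has no TTL or locking, so treat it as a development stand-in only.

```python
import tempfile
import uuid
from pathlib import Path


class FileArtifactStore:
    """Directory-backed artifact store: put() returns a key, get() reads it back."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, content: str) -> str:
        key = f"artifact-{uuid.uuid4()}"
        (self.root / key).write_text(content, encoding="utf-8")
        return key

    def get(self, key: str) -> str:
        path = self.root / key
        if not path.exists():
            raise KeyError(key)  # fail loudly instead of returning empty context
        return path.read_text(encoding="utf-8")


store = FileArtifactStore(Path(tempfile.mkdtemp()))
key = store.put("full analysis text")
print(store.get(key))
```

Because agents only ever see the key, swapping this for Redis or S3 later does not change the handoff message format.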
Step 3: Audit every prompt template for a context injection slot
# List prompt files that contain none of the context placeholders
grep -rLE '\{(context|prior_decisions|handoff_context)\}' ./prompts/ ./agents/
For each file found, add a slot:
You are continuing work started by a prior agent.
Prior context:
{handoff_context}
Your task:
{task}
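Filling that template could look like the sketch below. It repeats the `HandoffContext` fields from Step 1 so that it runs on its own; the default values and the `render_prompt` helper are assumptions added for illustration.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class HandoffContext:
    task_id: str
    goal: str
    decisions: list[dict] = field(default_factory=list)
    artifacts: dict[str, str] = field(default_factory=dict)
    constraints: list[str] = field(default_factory=list)
    prior_errors: list[str] = field(default_factory=list)


TEMPLATE = (
    "You are continuing work started by a prior agent.\n"
    "Prior context:\n{handoff_context}\n"
    "Your task:\n{task}"
)


def render_prompt(ctx: HandoffContext, task: str) -> str:
    # Braces inside the substituted JSON are not re-parsed by str.format,
    # so the payload can be passed in as an ordinary value.
    payload = json.dumps(asdict(ctx), default=str, indent=2)
    return TEMPLATE.format(handoff_context=payload, task=task)


ctx = HandoffContext(task_id="t-1", goal="migrate JWT signing",
                     constraints=["RS256 only"])
prompt = render_prompt(ctx, "Implement the change in src/auth/jwt.ts")
print(prompt)
```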
Step 4: Enforce handoff ordering in your orchestration layer
In LangGraph:
# Explicit edge — B cannot start until A completes
graph.add_edge("agent_a", "agent_b")
# NOT: graph.add_conditional_edges with a default fallthrough
In Temporal:
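# Inside a workflow's run method; the 10-minute timeout is an illustrative value.
timeout = timedelta(minutes=10)  # from datetime import timedelta
analysis = await workflow.execute_activity(agent_a_activity, task, start_to_close_timeout=timeout)
# Await explicitly; don't fire agent_b_activity concurrently
result = await workflow.execute_activity(agent_b_activity, analysis, start_to_close_timeout=timeout)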
Step 5: Log the handoff payload for every run
import json
import logging
from dataclasses import asdict

logger = logging.getLogger("handoff")

def handoff(ctx: HandoffContext, next_agent: str):
    logger.info("HANDOFF to %s: %s", next_agent, json.dumps(asdict(ctx), default=str))
    # proceed
This creates a searchable audit trail. When context loss happens, you can diff what was sent vs. what was received.
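To make that diff cheap, one option is to log a short digest of the canonical JSON alongside the payload and compare digests on both sides of the boundary. The `digest` and `diff_payloads` helpers below are hypothetical names, and the 12-character truncation is an arbitrary choice for readable logs.

```python
import hashlib
import json


def digest(payload: dict) -> str:
    """Short, order-independent fingerprint of a handoff payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def diff_payloads(sent: dict, received: dict) -> dict[str, tuple]:
    """Map each differing top-level key to its (sent, received) values."""
    keys = sent.keys() | received.keys()
    return {k: (sent.get(k), received.get(k))
            for k in sorted(keys) if sent.get(k) != received.get(k)}


sent = {"goal": "migrate JWT signing", "constraints": ["RS256 only"]}
received = {"goal": "migrate JWT signing", "constraints": []}

if digest(sent) != digest(received):
    print(diff_payloads(sent, received))
```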
Prevention
- Define a typed HandoffContext schema before writing any agent; treat it like an API contract between agents.
- Store large artifacts externally (Redis, S3, disk); pass only keys or URIs in agent messages.
- Add a {handoff_context} slot to every agent’s system prompt, even the first agent, so the slot is always present when you add agents upstream later.
- Set explicit step dependencies in your orchestration layer; never rely on timing or ordering-by-convention.
- Write integration tests that verify Agent B receives all fields Agent A emits — not just unit-test each agent in isolation.
- Log full handoff payloads in development; add a compact digest in production for diffing.
- Keep a “decisions log” that every agent appends to rather than replaces — downstream agents see all prior reasoning.
- Use Pydantic or a schema validator to catch serialization loss at the boundary rather than inside Agent B.
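The append-only decisions log from the list above can be sketched as an immutable value that each agent extends rather than overwrites. `DecisionsLog` and its field names are hypothetical; a real pipeline might persist the entries in its shared store instead of passing them by value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionsLog:
    """Immutable log: append() returns a new log and never drops prior entries."""
    entries: tuple[dict, ...] = ()

    def append(self, agent: str, decision: str, rationale: str) -> "DecisionsLog":
        entry = {"agent": agent, "decision": decision, "rationale": rationale}
        return DecisionsLog(self.entries + (entry,))


log_a = DecisionsLog().append("planner", "use RS256", "hardcoded HS256 salt is unsafe")
log_b = log_a.append("coder", "load keys from env", "keeps secrets out of the repo")

for entry in log_b.entries:
    print(entry["agent"], "->", entry["decision"])
```

Because the dataclass is frozen, a downstream agent cannot silently replace earlier reasoning; it can only hand on a longer log.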
FAQ
Q: Does LangGraph handle context passing automatically?
A: LangGraph passes the full State object between nodes, so structural state survives if you define it in the TypedDict. It does not automatically persist to external storage between workflow runs — you need a checkpointer (e.g., SqliteSaver) for that.
Q: Our agents use different models. Does that affect context loss?
A: Yes. Different models have different context windows. If Agent A uses Claude with a 200K context and Agent B uses GPT-4o with a 128K context, the orchestrator may silently truncate the handoff to fit. Always check the target model’s limit when sizing handoff payloads.
Q: How big is too big for an inline handoff message?
A: Keep inline messages under 2,000 tokens. Anything larger should be stored externally and referenced by key. This keeps handoff messages fast to inspect, avoids truncation, and makes logging practical.
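A rough way to enforce that threshold is sketched below. The four-characters-per-token ratio is a crude assumption rather than a real tokenizer, the dict stands in for your shared store, and `pack` / `unpack` are hypothetical helper names.

```python
import uuid

CHARS_PER_TOKEN = 4        # crude estimate; swap in your model's tokenizer
MAX_INLINE_TOKENS = 2000   # threshold from the answer above


def estimate_tokens(text: str) -> int:
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def pack(content: str, store: dict[str, str]) -> dict[str, str]:
    """Inline small content; write large content to the store and return a reference."""
    if estimate_tokens(content) <= MAX_INLINE_TOKENS:
        return {"inline": content}
    key = f"artifact:{uuid.uuid4()}"
    store[key] = content
    return {"ref": key}


def unpack(entry: dict[str, str], store: dict[str, str]) -> str:
    return entry["inline"] if "inline" in entry else store[entry["ref"]]


store: dict[str, str] = {}
small = pack("short note", store)
big_text = "x" * 8001
big = pack(big_text, store)
print(small, list(big))
```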
Q: Can I use a vector store for handoff context?
A: You can, but it introduces retrieval non-determinism: Agent B may not recall the exact constraints Agent A set. For hard constraints (architecture decisions, rejected options, error signatures), use structured JSON in a reliable key-value store. Use vector search only for large reference corpora where fuzzy retrieval is acceptable.
Related
- Agent State Desyncs After Restart
- Agent Skipped a Required Validation Step
- Agent Orchestrator Deadlocks Waiting on Each Other
- Two Parallel Agents Edit the Same File
- Shared Memory Corrupted by Overlapping Agent Writes
- Claude Code Context Broken
- Cursor Composer Context Loss
- Cycle in Agent Call Graph Goes Undetected