You are debugging a LangSmith trace for a Claude Code or LangGraph agent. The agent’s final output says “I deleted the old migration files and ran the schema update,” but scanning the trace you see zero delete_file calls and no database tool invocations. The action happened — the files are gone, the schema changed — but the trace has no record of how. Now you cannot audit what was deleted, cannot replay the run, and cannot determine whether the agent followed the correct procedure. Missing tool calls in a trace are both a debugging nightmare and a compliance problem.
Common causes
1. Tool call executed outside the traced code path
The most common cause. A utility function — maybe a helper that “just runs a quick shell command” — is called directly from Python without going through the agent’s tool-executor layer. It bypasses the tracer entirely. The effect happens, the trace records nothing.
How to spot it: List every function in your codebase that performs side effects (file writes, subprocess calls, HTTP requests, database mutations). For each one, check whether it is invoked through the agent’s tool_executor or directly as a Python function call. Any direct call is invisible to the tracer.
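The inventory described above can be partly automated. The following is a minimal sketch, assuming a hypothetical watch list (`SIDE_EFFECT_CALLS`) that you would extend for your own codebase; it uses Python's standard `ast` module to list where side-effecting calls appear in a source file. It matches dotted names literally, so aliased imports such as `from os import remove` are not detected.

```python
import ast
from typing import List, Optional, Tuple

# Hypothetical watch list: extend it with the side-effecting calls your code uses.
SIDE_EFFECT_CALLS = {
    "os.remove", "os.unlink", "shutil.rmtree",
    "subprocess.run", "subprocess.Popen", "requests.post",
}


def dotted_name(node: ast.AST) -> Optional[str]:
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_side_effect_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, call name) for every watched call in the source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SIDE_EFFECT_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

Each hit is a candidate to review: if the enclosing function is not reached through the tool executor, the call is one the tracer cannot see.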
2. Tool call happens inside an async task that the tracer doesn’t capture
A tool fires an asyncio.create_task() that runs the actual work asynchronously. The tracer wraps the synchronous call stack but doesn’t follow the async task into a separate coroutine. The outer call appears in the trace as an instant return; the actual work is invisible.
How to spot it: Search for asyncio.create_task(), threading.Thread(...).start(), or Executor.submit() calls (for example on a concurrent.futures.ThreadPoolExecutor) inside any function that performs side effects. Fire-and-forget async patterns reliably escape synchronous tracers.
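To make the failure mode concrete, here is a small stdlib-only sketch. The `traced` wrapper and the `events` and `effects` lists are hypothetical stand-ins for a real tracer and real side effects. The tool hands its work to a thread and returns at once, so the recorded `tool_end` event is written before the deletion has happened.

```python
import threading

events = []   # stand-in for the trace
effects = []  # stand-in for real side effects
release = threading.Event()


def background_delete(path):
    # Simulates slow work that outlives the tool call.
    release.wait(timeout=5)
    effects.append(f"deleted {path}")


def delete_tool(path):
    """Fire-and-forget tool: starts the work and returns immediately."""
    worker = threading.Thread(target=background_delete, args=(path,))
    worker.start()
    return worker


def traced(tool, *args):
    """Synchronous tracer: records only what happened by the time the tool returned."""
    events.append(("tool_start", tool.__name__))
    result = tool(*args)
    events.append(("tool_end", tool.__name__, list(effects)))
    return result


worker = traced(delete_tool, "old_migration.sql")
release.set()   # the real work happens only now, after the span has closed
worker.join()
```

After this runs, the `tool_end` event carries an empty snapshot of `effects`, while `effects` itself contains the deletion. The trace shows a call that appears to have done nothing.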
3. Exception swallowed before the tracer records the tool call
The tool call raises an exception mid-execution. A broad except Exception: pass clause swallows it before the tracer’s on_tool_end callback fires. The tracer only records the on_tool_start event (if that), leaving a half-entry or no entry in the trace.
How to spot it: Search for except Exception: pass or except Exception: continue patterns in tool wrappers. If the tracer callback for on_tool_end is inside a try block that can be skipped, incomplete entries will appear.
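The search described above can be done structurally rather than with grep. This is a rough sketch using the standard `ast` module (Python 3.9+ for `ast.unparse`); the function name is made up for illustration. It flags every `except` block whose body consists only of `pass` or `continue`.

```python
import ast
from typing import List, Tuple


def find_swallowed_exceptions(source: str) -> List[Tuple[int, str]]:
    """Return (line, caught type) for except blocks that only pass or continue."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        only_noop = all(isinstance(stmt, (ast.Pass, ast.Continue)) for stmt in node.body)
        if only_noop:
            caught = ast.unparse(node.type) if node.type is not None else "<bare except>"
            hits.append((node.lineno, caught))
    return sorted(hits)
```

Handlers that log or re-raise are not reported, so the output is a short list of places where a tool failure can vanish before the tracer hears about it.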
4. Tool is implemented as a plain function, not registered with the framework
In LangGraph or CrewAI, the agent has access to both registered tools (visible in the trace) and unregistered Python functions the agent can call via code execution. The agent writes Python that calls the unregistered function directly. The framework traces tool invocations, not arbitrary code execution.
How to spot it: Check whether the agent has execute_code or run_python capabilities. Any code execution capability allows the agent to call arbitrary Python without going through the tool registry — and without appearing in the trace.
5. LangSmith / tracing SDK is version-pinned and missing newer tool types
Suppose your observability stack is pinned to LangSmith SDK 0.0.75 while the agent now uses a tool type that was introduced in 0.1.0. The old SDK does not recognize the new tool type and silently drops those events from the trace.
How to spot it: Compare the LangSmith (or OpenTelemetry, or your tracing SDK) version in your requirements.txt against the release notes for the version that added the tool types you use. Version mismatches cause silent event drops.
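A version comparison like this can run at startup or in CI. The sketch below uses `importlib.metadata` from the standard library; the helper names are hypothetical, and the version parser is deliberately naive (leading numeric components only), so a production check would more likely rely on the `packaging` library.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Naive parser: '0.1.50' -> (0, 1, 50). Ignores pre-release and local suffixes."""
    return tuple(int(part) for part in re.findall(r"\d+", text.split("+")[0])[:3])


def check_minimum(package: str, minimum: str) -> Optional[str]:
    """Return a problem description, or None if the installed version is new enough."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    if parse_version(installed) < parse_version(minimum):
        return f"{package}: {installed} is older than required {minimum}"
    return None
```

Using the version numbers from the example above, `parse_version("0.0.75") < parse_version("0.1.0")` is true, so `check_minimum("langsmith", "0.1.0")` would report a problem on an environment still pinned to 0.0.75.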
6. Tracer uses sampling and dropped the tool call span
The production tracer samples at 10% to reduce cost. A low-probability but critical tool call (e.g., a destructive delete operation) gets sampled out of the trace. It executes but leaves no record.
How to spot it: Check the tracer’s sampling configuration. If sampling rate is below 100%, destructive or security-sensitive tool calls will have incomplete trace coverage.
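A quick calculation shows why sampling and destructive calls do not mix. This sketch assumes each run is sampled independently at a fixed rate; the run counts and rates are hypothetical inputs.

```python
def expected_untraced(destructive_runs: int, sample_rate: float) -> float:
    """Expected number of destructive runs that leave no trace."""
    return destructive_runs * (1.0 - sample_rate)


def prob_at_least_one_untraced(destructive_runs: int, sample_rate: float) -> float:
    """Probability that at least one destructive run is missing from the traces."""
    return 1.0 - sample_rate ** destructive_runs


# 200 destructive runs at the 10% sampling rate from the example above
missing_at_10_percent = expected_untraced(200, 0.10)

# The same 200 runs at a seemingly safe 99% sampling rate
risk_at_99_percent = prob_at_least_one_untraced(200, 0.99)
```

Under these assumptions, 10% sampling leaves roughly 180 of 200 destructive runs unrecorded, and even 99% sampling makes it more likely than not that at least one of them is missing.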
Shortest path to fix
Step 1: Instrument all side-effect functions at the source
```python
import functools
import os

from your_tracer import trace_event  # placeholder for your tracing backend


def traced_side_effect(tool_name: str):
    """Emit start/end/error trace events around a side-effecting function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            trace_event("tool_start", tool=tool_name, inputs={"args": args, "kwargs": kwargs})
            try:
                result = fn(*args, **kwargs)
                trace_event("tool_end", tool=tool_name, output=result)
                return result
            except Exception as e:
                # Record the failure, then re-raise so callers still see it
                trace_event("tool_error", tool=tool_name, error=str(e))
                raise
        return wrapper
    return decorator


@traced_side_effect("delete_file")
def delete_file(path: str):
    os.remove(path)
```
Apply this decorator to every function that writes files, runs subprocesses, calls APIs, or mutates databases.
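One caveat: applied to an `async def` function, a plain synchronous wrapper would record `tool_end` as soon as the coroutine object is created, before any work runs. The following sketch shows one way to handle both cases with `inspect.iscoroutinefunction`. The in-memory `TRACE` list and this `trace_event` are stand-ins for a real tracer, and `run_migration` is a made-up example tool.

```python
import asyncio
import functools
import inspect

TRACE = []  # in-memory stand-in for a real tracing backend


def trace_event(kind, **fields):
    TRACE.append({"event": kind, **fields})


def traced_side_effect(tool_name):
    def decorator(fn):
        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                trace_event("tool_start", tool=tool_name)
                try:
                    # Await inside the wrapper so tool_end reflects finished work
                    result = await fn(*args, **kwargs)
                except Exception as exc:
                    trace_event("tool_error", tool=tool_name, error=str(exc))
                    raise
                trace_event("tool_end", tool=tool_name, output=result)
                return result
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            trace_event("tool_start", tool=tool_name)
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                trace_event("tool_error", tool=tool_name, error=str(exc))
                raise
            trace_event("tool_end", tool=tool_name, output=result)
            return result
        return sync_wrapper
    return decorator


@traced_side_effect("run_migration")
async def run_migration(name):
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return f"applied {name}"


result = asyncio.run(run_migration("0042_add_index"))
```

With this variant the `tool_end` event for `run_migration` carries the awaited return value rather than an unawaited coroutine object.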
Step 2: Wrap async tasks to propagate trace context
```python
import asyncio

from opentelemetry import trace as otel_trace

tracer = otel_trace.get_tracer(__name__)


async def traced_async_tool(tool_name: str, coro):
    """Await the coroutine inside a span so the span covers the real work."""
    with tracer.start_as_current_span(tool_name) as span:
        try:
            result = await coro
            span.set_attribute("result", str(result)[:500])
            return result
        except Exception as e:
            span.record_exception(e)
            raise


async def main():
    # Usage: don't fire-and-forget; trace and await.
    # run_migration_coro is your own coroutine function.
    result = await traced_async_tool("run_migration", run_migration_coro())
```
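The same concern applies to threads, the other fire-and-forget pattern listed under cause 2. On standard CPython builds a new `threading.Thread` does not inherit the caller's `contextvars` context by default, so any trace ID stored in a context variable is lost unless you copy the context explicitly. The sketch below is stdlib-only; `trace_id_var` and the helper names are hypothetical.

```python
import contextvars
import threading

# Hypothetical context variable holding the current trace ID
trace_id_var = contextvars.ContextVar("trace_id", default="<no trace>")


def record(results, key):
    """Worker: store whatever trace ID is visible from inside the thread."""
    results[key] = trace_id_var.get()


def run_in_thread(target, *args, propagate):
    if propagate:
        # Copy the caller's context and run the target inside that copy
        ctx = contextvars.copy_context()
        thread = threading.Thread(target=ctx.run, args=(target, *args))
    else:
        thread = threading.Thread(target=target, args=args)
    thread.start()
    thread.join()


results = {}
trace_id_var.set("trace-123")
run_in_thread(record, results, "propagated", propagate=True)
run_in_thread(record, results, "plain", propagate=False)
```

`results["propagated"]` holds `"trace-123"`. The value under `"plain"` depends on the interpreter's thread context settings, which is exactly why explicit propagation is the safer habit.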
Step 3: Never swallow exceptions in tracer callbacks
```python
import logging

logger = logging.getLogger(__name__)


# WRONG
def on_tool_end(self, output, **kwargs):
    try:
        self.log_tool_output(output)
    except Exception:
        pass  # silently drops the trace event


# CORRECT
def on_tool_end(self, output, **kwargs):
    try:
        self.log_tool_output(output)
    except Exception:
        # Never fail silently: at minimum, log (with traceback) so you know tracing failed
        logger.exception("Tracer failed to record tool_end")
```
Step 4: Require all tools to go through the registry
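```python
class SecurityError(Exception):
    """Raised when a tool call does not go through the registry."""


# Prevent arbitrary code execution from bypassing tool tracing
ALLOWED_TOOLS = {"delete_file", "write_file", "run_bash", "call_api"}


def execute_tool(tool_name: str, inputs: dict):
    if tool_name not in ALLOWED_TOOLS:
        raise SecurityError(f"Unregistered tool call blocked: {tool_name!r}")
    # TOOL_REGISTRY maps each allowed name to its traced implementation
    tool_fn = TOOL_REGISTRY[tool_name]
    return tool_fn(**inputs)
```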
If the agent has code-execution capabilities, add a code-execution audit log that records every call to a function in a sensitive module.
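One possible building block for such an audit log is CPython's `sys.addaudithook` (Python 3.8+), which is called for the runtime's built-in audit events regardless of which code path triggered them. The sketch below is a minimal illustration: `AUDIT_LOG` and the choice of `WATCHED_EVENTS` are assumptions, and the hook only sees events that CPython itself raises, not calls to arbitrary functions in your own modules.

```python
import os
import sys
import tempfile

AUDIT_LOG = []  # stand-in for a durable audit sink

# A few of CPython's documented audit event names; extend as needed.
WATCHED_EVENTS = {"os.remove", "os.rename", "shutil.rmtree", "subprocess.Popen"}


def audit_hook(event, args):
    # Keep hooks cheap and exception-free: they run on every audited operation.
    if event in WATCHED_EVENTS:
        AUDIT_LOG.append((event, tuple(str(arg) for arg in args)))


# Note: an audit hook cannot be removed once it has been added.
sys.addaudithook(audit_hook)

# Demonstration: a direct os.remove() call, made outside any tool registry.
fd, path = tempfile.mkstemp()
os.close(fd)
os.remove(path)
```

After the demonstration, `AUDIT_LOG` contains an `os.remove` entry whose first argument is the deleted path, even though no registered tool was involved.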
Step 5: Upgrade the tracing SDK and pin it explicitly
```shell
# Check current version
pip show langsmith | grep Version

# Upgrade to latest
pip install --upgrade langsmith

# Constrain the version range to prevent silent regressions
# requirements.txt:
# langsmith>=0.1.50,<0.2.0
```
After upgrading, replay a representative run and verify that tool call counts in the trace match your expectations.
Step 6: Set sampling to 100% for destructive tool calls
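```python
# OpenTelemetry: use a custom sampler that always samples destructive calls
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ALWAYS_ON, ParentBased, TraceIdRatioBased


class DestructiveAlwaysSampler(ParentBased):
    def should_sample(self, parent_context, trace_id, name, *args, **kwargs):
        # Span names containing a destructive keyword bypass ratio sampling
        if any(kw in name.lower() for kw in ("delete", "drop", "truncate", "destroy")):
            return ALWAYS_ON.should_sample(parent_context, trace_id, name, *args, **kwargs)
        return super().should_sample(parent_context, trace_id, name, *args, **kwargs)


# ParentBased requires a root sampler; everything else is sampled at 10%
provider = TracerProvider(sampler=DestructiveAlwaysSampler(root=TraceIdRatioBased(0.10)))
```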
Prevention
- Apply a tracing decorator to every function that produces side effects — file writes, subprocess calls, API calls, and database mutations.
- Propagate trace context explicitly into async tasks; never fire-and-forget a coroutine that performs side effects.
- Never swallow exceptions in tracer callbacks — log the failure and alert; a silent tracer failure is as bad as no tracer at all.
- Require all side-effecting tool calls to go through the registered tool executor, not as direct Python function calls from agent code.
- Set sampling to 100% for all destructive, security-sensitive, or irreversible tool calls; use reduced sampling only for read-only operations.
- Include the trace ID in every log line so you can correlate log events with trace spans when the trace itself is incomplete.
- Pin tracing SDK versions and test trace completeness after any SDK upgrade.
- Add an automated test that executes a known sequence of tool calls and asserts that all of them appear in the resulting trace.
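The last item in the list can be sketched as follows. Everything here is a toy stand-in: `TRACE` plays the role of the exported trace, the `traced` decorator plays the role of your instrumentation, and `untraced_delete` simulates a helper that bypasses it. The useful part is `missing_from_trace`, which compares the expected tool calls with the recorded ones while respecting how many times each was called.

```python
from collections import Counter

TRACE = []  # stand-in for the spans exported by a real tracer


def traced(tool_name):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            TRACE.append(tool_name)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@traced("write_file")
def write_file(store, path, data):
    store[path] = data


@traced("delete_file")
def delete_file(store, path):
    del store[path]


def untraced_delete(store, path):
    # Simulates a helper that performs the side effect without tracing
    del store[path]


def missing_from_trace(expected, trace):
    """Return the expected tool calls that the trace does not account for."""
    return sorted((Counter(expected) - Counter(trace)).elements())


store = {}
write_file(store, "a.sql", "...")
write_file(store, "b.sql", "...")
delete_file(store, "a.sql")
untraced_delete(store, "b.sql")

expected = ["write_file", "write_file", "delete_file", "delete_file"]
missing = missing_from_trace(expected, TRACE)
```

Here `missing` contains one `delete_file`, the call that bypassed tracing. In a real test suite the assertion would be that `missing` is empty.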
FAQ
Q: Can I reconstruct what a tool call did if it’s missing from the trace?
A: Sometimes — from side effects. Check git history (git log --all --diff-filter=D -- path), database audit logs, and OS-level file access logs (fs_usage on macOS, auditd on Linux). These give you evidence of what happened but not the agent’s intent or inputs.
Q: Does LangSmith capture everything automatically?
A: LangSmith captures all calls that go through LangChain’s tool and chain abstractions. It does not capture direct Python function calls, subprocess invocations, or code executed via exec(). Any code path outside the LangChain abstraction layer is invisible unless you add manual instrumentation.
Q: How do I trace tool calls across multiple services (microservices architecture)?
A: Use distributed tracing with W3C TraceContext propagation. Pass the traceparent header on every HTTP call between services. OpenTelemetry with a Jaeger or Tempo backend is the standard solution. All services in the call graph emit spans linked by the same trace ID.
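For reference, a version-00 `traceparent` value has four dash-separated lowercase hex fields: version, a 32-character trace ID, a 16-character parent span ID, and 2-character flags (`01` means sampled). The sketch below builds and forwards such a header by hand using only the standard library. The function names are hypothetical, and in a real service you would normally let your tracing SDK's propagator do this for you.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: random trace ID and span ID."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(incoming: str) -> str:
    """Build the header for an outgoing call: same trace ID, new span ID."""
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        raise ValueError(f"malformed traceparent: {incoming!r}")
    trace_id, parent_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError("all-zero trace or span IDs are invalid")
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
```

The outgoing header keeps the trace ID `4bf92f3577b34da6a3ce929d0e0e4736`, which is what lets the backend link spans from different services into one trace.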
Q: What is the performance overhead of 100% sampling on destructive tools?
A: Negligible for most workloads. A destructive tool call (file delete, database drop) is inherently expensive; the tracing overhead is typically under 1ms per span. The business cost of an untraced destructive operation far outweighs any performance concern.