Distributed Tracing

Group multiple LLM calls, or an entire agent-to-agent chain, under one distributed trace by propagating the W3C Trace Context traceparent header through the gateway. Every request that shares a trace_id appears as a single trace in your observability backend, so the whole chain is visible in one place.

How the gateway propagates traces

1. Your application sends a request with a traceparent header.
2. The gateway extracts the trace context and creates its HTTP and GenAI spans as children of that trace.
3. The gateway injects traceparent into the upstream provider request, so the application, gateway, and provider share one trace ID.
4. The gateway returns traceparent in the response headers so your application can continue the trace.

Your App (traceparent: 00-<trace_id>-<span_id>-01)
    |
    v
to11 gateway
    |--- Creates HTTP span (child of your trace)
    |--- Creates GenAI span (child of HTTP span)
    |--- Injects traceparent into upstream request
    |
    v
LLM Provider (receives traceparent)
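
The extract-then-create-child step can be illustrated with a small standard-library sketch. This is not the gateway's actual implementation; the names TraceContext, parse_traceparent, and child_of are hypothetical and exist only for illustration. It shows the one invariant that matters: a child span keeps the parent's trace_id and flags and gets a fresh span_id.

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# Version "00", 32 hex trace_id, 16 hex span_id, 2 hex flags (W3C Trace Context).
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class TraceContext:
    """Hypothetical container for the fields of a traceparent header."""
    trace_id: str
    span_id: str
    flags: str = "01"

    def header(self) -> str:
        return f"00-{self.trace_id}-{self.span_id}-{self.flags}"


def parse_traceparent(value: str) -> Optional[TraceContext]:
    """Return the parsed context, or None if the header is malformed."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid under the W3C spec.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return TraceContext(trace_id, span_id, flags)


def child_of(parent: TraceContext) -> TraceContext:
    """New span in the same trace: trace_id and flags carry over, span_id is new."""
    return TraceContext(parent.trace_id, secrets.token_hex(8), parent.flags)


# Mirror the diagram: application span -> HTTP span -> GenAI span.
incoming = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
http_span = child_of(incoming)
genai_span = child_of(http_span)
print(genai_span.header())  # same trace_id as `incoming`, different span_id
```

Which of its spans the gateway places in the upstream header is not specified here, so treat the last line as an assumption.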

Group multiple LLM calls under one trace

If your application uses the OpenTelemetry SDK, wrap multiple gateway calls in a single span. The OTel HTTP instrumentation automatically propagates the traceparent header.

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.requests import RequestsInstrumentor
import requests

# Set up OTel with OTLP export to the same collector as the gateway
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

# Auto-instrument HTTP client to propagate traceparent
RequestsInstrumentor().instrument()

tracer = trace.get_tracer("my-agent")

# All LLM calls inside this span share the same trace_id
with tracer.start_as_current_span("summarize-document"):
    requests.post("https://gw.to11.ai/v1/chat/completions", json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Summarize this document..."}]
    }, headers={
        "x-to11-authorization": "Bearer <your to11 API key>",
        "Authorization": "Bearer <your OpenAI API key>",
    })

    requests.post("https://gw.to11.ai/v1/chat/completions", json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Extract key entities..."}]
    }, headers={
        "x-to11-authorization": "Bearer <your to11 API key>",
        "Authorization": "Bearer <your OpenAI API key>",
    })

Both LLM calls appear under the summarize-document span in your trace viewer: the instrumentation creates an HTTP client span for each request, and the gateway's spans nest beneath it.

Agent-to-agent tracing

When one agent delegates to another, pass the traceparent through to maintain a single end-to-end trace:

Agent A (trace_id=abc123)
  ├─ LLM call 1 → Gateway span (child of abc123)
  ├─ LLM call 2 → Gateway span (child of abc123)
  └─ passes traceparent to Agent B
Agent B (same trace_id=abc123)
  ├─ LLM call 3 → Gateway span (child of abc123)
  └─ LLM call 4 → Gateway span (child of abc123)

All four LLM calls land in the same trace, giving end-to-end visibility across agents.
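
As a rough sketch of the hand-off, assuming the agents exchange a simple message with a headers field (a hypothetical envelope, not a to11 format), Agent A attaches its trace context and Agent B reuses the trace_id for its own gateway calls. The function names are illustrative only. With an OpenTelemetry SDK on both sides, the SDK's propagators would normally do this for you.

```python
import secrets
from typing import Dict


def new_traceparent() -> str:
    """Start a fresh trace: random 32-hex trace_id, 16-hex span_id, sampled flag."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def next_span(traceparent: str) -> str:
    """Keep the trace_id and flags, generate a new span_id for the next operation."""
    version, trace_id, _previous_span_id, flags = traceparent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def agent_a_delegate(task: str, traceparent: str) -> Dict:
    """Agent A: build the hand-off message and carry the trace context along."""
    return {"task": task, "headers": {"traceparent": next_span(traceparent)}}


def agent_b_gateway_headers(message: Dict) -> Dict[str, str]:
    """Agent B: reuse the incoming trace for gateway calls, or start a new one."""
    incoming = message["headers"].get("traceparent") or new_traceparent()
    return {"traceparent": next_span(incoming)}


# Agent A starts the trace and delegates; Agent B stays in the same trace.
root = new_traceparent()
message = agent_a_delegate("extract entities", root)
headers = agent_b_gateway_headers(message)
print(root.split("-")[1] == headers["traceparent"].split("-")[1])
```

Agent B would merge the returned header into the request headers it sends to the gateway, alongside its authorization headers.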

To enrich these traces with session metadata or client-side operation context (tool execution, retrieval, agent steps), see Context Propagation.

Manual traceparent without an OTel SDK

If you don’t use an OpenTelemetry SDK, you can still pass a traceparent header manually. The format is:

traceparent: 00-<32 hex trace_id>-<16 hex span_id>-01

For example:

# Generate a trace ID and span ID
TRACE_ID=$(openssl rand -hex 16)
SPAN_ID=$(openssl rand -hex 8)

# First call
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID}-01" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Step 1..."}]}'

# Second call with the same trace_id (different span_id)
SPAN_ID2=$(openssl rand -hex 8)
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID2}-01" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Step 2..."}]}'

Both requests share the same trace_id and appear in one trace. To chain further calls, read the traceparent header from each gateway response — it carries the gateway’s own span ID — and forward it as the parent of the next request.
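
The chaining described above can be sketched in Python without an OTel SDK. This is a minimal illustration under stated assumptions: chain_calls and the send callable are hypothetical names, and fake_gateway is a stand-in that only imitates the documented behavior of returning a traceparent with the same trace_id and a new span ID.

```python
import secrets
from typing import Callable, Dict, Iterable, List, Mapping

# Hypothetical transport: takes a prompt and request headers, returns response headers.
Send = Callable[[str, Dict[str, str]], Mapping[str, str]]


def chain_calls(send: Send, prompts: Iterable[str], traceparent: str) -> List[str]:
    """Send prompts in order, parenting each call on the previous gateway span.

    Returns the traceparent values that were sent, for inspection.
    """
    sent = []
    for prompt in prompts:
        sent.append(traceparent)
        response_headers = send(prompt, {"traceparent": traceparent})
        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {name.lower(): value for name, value in response_headers.items()}
        # Keep the previous value if the response carries no traceparent.
        traceparent = lowered.get("traceparent", traceparent)
    return sent


def fake_gateway(prompt: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for a real HTTP call: same trace_id, new span_id in the response."""
    version, trace_id, _span_id, flags = headers["traceparent"].split("-")
    return {"Traceparent": f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"}


start = f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"
for header in chain_calls(fake_gateway, ["Step 1...", "Step 2..."], start):
    print(header)
```

In real use, send would wrap an HTTP client call to the gateway and return the response headers; the requests library's Response.headers is already case-insensitive, so the normalization step is only a safeguard for other clients.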
