# Distributed Tracing

> How to group LLM calls and agent-to-agent communication under a single trace.

Group multiple LLM calls, or an entire agent-to-agent chain, under one distributed trace by propagating the [W3C Trace Context](https://www.w3.org/TR/trace-context/) `traceparent` header through the gateway. Every request that shares a `trace_id` appears in the same trace in your observability backend, so you can inspect the whole chain in one place.

## How the gateway propagates traces

1. Your application sends a request with a `traceparent` header.
2. The gateway extracts the trace context and creates its HTTP and GenAI spans as children of that trace.
3. The gateway injects `traceparent` into the upstream provider request, so the application, gateway, and provider share one trace ID.
4. The gateway returns `traceparent` in the response headers so your application can continue the trace.

```text theme={null}
Your App (traceparent: 00-<trace_id>-<span_id>-01)
    |
    v
to11 gateway
    |--- Creates HTTP span (child of your trace)
    |--- Creates GenAI span (child of HTTP span)
    |--- Injects traceparent into upstream request
    |
    v
LLM Provider (receives traceparent)
```

## Group multiple LLM calls under one trace

If your application uses the OpenTelemetry SDK, wrap multiple gateway calls in a single span. The OTel HTTP client instrumentation (`RequestsInstrumentor` in the example below) adds the `traceparent` header to each outgoing request automatically.

```python theme={null}
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.requests import RequestsInstrumentor
import requests

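# Set up OTel with OTLP export to the same collector as the gateway.
# With no arguments, OTLPSpanExporter reads OTEL_EXPORTER_OTLP_ENDPOINT
# and falls back to localhost:4317.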
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

# Auto-instrument HTTP client to propagate traceparent
RequestsInstrumentor().instrument()

tracer = trace.get_tracer("my-agent")

# All LLM calls inside this span share the same trace_id
with tracer.start_as_current_span("summarize-document"):
    requests.post("https://gw.to11.ai/v1/chat/completions", json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Summarize this document..."}]
    }, headers={
        "x-to11-authorization": "Bearer <your to11 API key>",
        "Authorization": "Bearer <your OpenAI API key>",
    })

    requests.post("https://gw.to11.ai/v1/chat/completions", json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Extract key entities..."}]
    }, headers={
        "x-to11-authorization": "Bearer <your to11 API key>",
        "Authorization": "Bearer <your OpenAI API key>",
    })
```

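Both LLM calls appear under the `summarize-document` span in your trace viewer. Each gateway span is a child of the HTTP client span that the instrumentation creates for the request, which in turn is a child of `summarize-document`.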

## Agent-to-agent tracing

When one agent delegates to another, pass the `traceparent` through to maintain a single end-to-end trace:

```text theme={null}
Agent A (trace_id=abc123)
  ├─ LLM call 1 → Gateway span (child of abc123)
  ├─ LLM call 2 → Gateway span (child of abc123)
  └─ passes traceparent to Agent B
Agent B (same trace_id=abc123)
  ├─ LLM call 3 → Gateway span (child of abc123)
  └─ LLM call 4 → Gateway span (child of abc123)
```

All four LLM calls land in the same trace, giving end-to-end visibility across agents.
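
The handoff can be sketched without an SDK. The example below is a minimal illustration, not the gateway's or any library's API: `outgoing_headers` is a hypothetical helper that Agent B could call with the headers it received from Agent A, and it assumes those headers arrive as a plain mapping. It keeps the incoming trace ID and flags, and generates a fresh span ID for Agent B's own gateway calls.

```python
import secrets


def outgoing_headers(incoming_headers):
    """Build headers for Agent B's gateway calls (hypothetical helper).

    Continues Agent A's trace when a well-formed traceparent is present,
    otherwise starts a new trace.
    """
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    parts = lowered.get("traceparent", "").split("-")

    if len(parts) == 4 and len(parts[1]) == 32 and len(parts[2]) == 16:
        # Same trace as Agent A: reuse its trace ID and flags.
        trace_id, flags = parts[1], parts[3]
    else:
        # No usable parent: start a new, sampled trace.
        trace_id, flags = secrets.token_hex(16), "01"

    span_id = secrets.token_hex(8)  # new span ID for this hop
    return {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}


# Agent A's header as received by Agent B (example values only).
from_agent_a = {"Traceparent": "00-" + "ab" * 16 + "-" + "cd" * 8 + "-01"}
print(outgoing_headers(from_agent_a)["traceparent"])
```

If both agents use the OpenTelemetry SDK, the `inject` and `extract` functions in `opentelemetry.propagate` do this for you, and an HTTP server instrumentation in Agent B typically performs the extract step automatically.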

<Info>
  To enrich these traces with session metadata or client-side operation context (tool execution, retrieval, agent steps), see [Context Propagation](/instrument/context-propagation).
</Info>

## Manual traceparent without an OTel SDK

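If you don't use an OpenTelemetry SDK, you can still pass a `traceparent` header manually. The value has four dash-separated fields: the version (`00`), a 32-character hex trace ID, a 16-character hex parent span ID, and the trace flags (`01` means sampled). Use lowercase hex; neither ID may be all zeros. The format is: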

```
traceparent: 00-<32 hex trace_id>-<16 hex span_id>-01
```
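
As a rough sketch of that format in application code, the following builds and validates header values using only the Python standard library. The names `make_traceparent` and `parse_traceparent` are illustrative and not part of any to11 or OpenTelemetry API.

```python
import re
import secrets

# version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def make_traceparent(trace_id=None, sampled=True):
    """Build a traceparent value, reusing trace_id when one is given."""
    trace_id = trace_id or secrets.token_hex(16)  # 16 bytes -> 32 hex chars
    span_id = secrets.token_hex(8)                # 8 bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Return (trace_id, span_id, flags), or None if the value is malformed."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # The W3C spec treats all-zero IDs as invalid.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id, flags


first = make_traceparent()
trace_id, _, _ = parse_traceparent(first)
second = make_traceparent(trace_id)  # same trace, new span ID
print(first)
print(second)
```

Passing the first call's trace ID into later calls is what keeps them in one trace; only the span ID changes per request.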

For example:

```bash theme={null}
# Generate a trace ID and span ID
TRACE_ID=$(openssl rand -hex 16)
SPAN_ID=$(openssl rand -hex 8)

# First call
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID}-01" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Step 1..."}]}'

# Second call with the same trace_id (different span_id)
SPAN_ID2=$(openssl rand -hex 8)
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID2}-01" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Step 2..."}]}'
```

Both requests share the same `trace_id` and appear in one trace. To chain further calls, read the `traceparent` header from each gateway response, which carries the gateway's own span ID, and send it as the `traceparent` of the next request.
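
A minimal sketch of that chaining step follows. `next_traceparent` is a hypothetical helper; it assumes the response headers are available as a mapping of names to values and does nothing beyond a basic shape check on the returned value.

```python
def next_traceparent(response_headers, current):
    """Choose the traceparent for the next request (hypothetical helper).

    Prefers the value returned by the gateway and falls back to the
    value that was sent when the response carries none.
    """
    for name, value in response_headers.items():
        # Header names are case-insensitive.
        if name.lower() == "traceparent" and len(value.split("-")) == 4:
            return value
    return current


# Example values only: what was sent, and what a response might carry.
sent = "00-" + "ab" * 16 + "-" + "11" * 8 + "-01"
response_headers = {"Traceparent": "00-" + "ab" * 16 + "-" + "22" * 8 + "-01"}

parent = next_traceparent(response_headers, sent)
print(parent)  # use this as the traceparent header of the next call
```

With the `requests` library, `response.headers` can be passed in directly as `response_headers`.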