Direct Ingestion

The gateway instruments LLM calls automatically. For custom spans — non-LLM operations, internal pipeline steps, or enriching traces from your own services — you can send OTLP data directly to the to11 OTel Collector.

When to use direct ingestion

  • Custom spans for non-LLM operations — data preprocessing, post-processing, or any application logic you want visible alongside LLM calls.
  • Enriching gateway traces with application-side timing — capture your full RAG pipeline end-to-end, including database queries and retrieval steps.
  • Agent lifecycle spans — operations like create_agent or session management that are better emitted from the client than the gateway.
  • Metrics from your own services — send application metrics alongside gateway metrics for a unified view.

Prerequisites

  • A to11 API key with otel:write scope.
  • An OpenTelemetry SDK in your language of choice.

Exchange an API key for a collector token

The collector authenticates via short-lived JWT tokens. Exchange your API key for a token before configuring your SDK:
curl -X POST http://localhost:4500/v1/ingest-token/exchange \
  -H "Content-Type: application/json" \
  -d '{
    "credential": "your-api-key"
  }'
Response:
{
  "token": "eyJ...",
  "projectId": "proj_abc123",
  "expiresAt": "2026-03-25T17:30:00Z"
}
Tokens expire after 15 minutes. Refresh by calling the exchange endpoint again before expiry.
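The refresh step can be scripted. The following is a minimal Python sketch, not an official client: it posts to the exchange endpoint from the curl example and re-exchanges shortly before the expiresAt time in the response. The CollectorToken class, the two-minute safety margin, and the injectable exchange callable are illustrative assumptions, not part of to11.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Endpoint taken from the curl example; adjust for your deployment.
EXCHANGE_URL = "http://localhost:4500/v1/ingest-token/exchange"


def exchange_token(api_key, url=EXCHANGE_URL):
    """POST the API key to the exchange endpoint and return the parsed JSON body."""
    body = json.dumps({"credential": api_key}).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def parse_expiry(expires_at):
    """Parse a timestamp such as 2026-03-25T17:30:00Z into an aware datetime."""
    return datetime.fromisoformat(expires_at.replace("Z", "+00:00"))


class CollectorToken:
    """Hypothetical helper: caches the token and re-exchanges before it expires."""

    def __init__(self, api_key, exchange=exchange_token, margin=timedelta(minutes=2)):
        self._api_key = api_key
        self._exchange = exchange
        self._margin = margin
        self._token = None
        self._expires_at = None

    def get(self, now=None):
        """Return a token that is valid for at least `margin` from `now`."""
        now = now or datetime.now(timezone.utc)
        if self._token is None or now >= self._expires_at - self._margin:
            data = self._exchange(self._api_key)
            self._token = data["token"]
            self._expires_at = parse_expiry(data["expiresAt"])
        return self._token
```

Calling CollectorToken("your-api-key").get() returns the cached token until it is within the margin of expiry. How a refreshed token reaches an exporter that has already been constructed depends on the SDK and is not covered by this sketch.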

Configure your OTel SDK

The collector accepts OTLP on two endpoints:
Protocol   Endpoint                    Port
gRPC       localhost:4317              4317
HTTP       localhost:4318/v1/traces    4318
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource

resource = Resource.create({
    "service.name": "my-rag-pipeline",
    "project.id": "proj_abc123",  # Required — spans without this are dropped
})

# `token` is the JWT returned by the exchange endpoint in the previous step
exporter = OTLPSpanExporter(
    endpoint="localhost:4317",
    headers={"authorization": f"Bearer {token}"},
    insecure=True,  # TLS not required for local development
)

provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
The project.id resource attribute is required. The collector drops spans that do not carry it. When using JWT authentication, the collector also stamps project.id from the JWT sub claim automatically.
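The example above uses the gRPC exporter. For the HTTP endpoint, a sketch along the following lines should work, assuming the opentelemetry-exporter-otlp-proto-http package is installed. The helper names auth_headers and build_http_exporter are hypothetical. Note that the HTTP exporter is given a full URL, including the scheme and the /v1/traces path, rather than a bare host:port.

```python
def auth_headers(token):
    """Build the bearer header used in the gRPC example for a collector token."""
    return {"authorization": f"Bearer {token}"}


def build_http_exporter(token, endpoint="http://localhost:4318/v1/traces"):
    """Return an OTLP/HTTP span exporter pointed at the collector's HTTP endpoint."""
    # Imported here so this module still loads where the HTTP exporter
    # package is not installed.
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )

    return OTLPSpanExporter(endpoint=endpoint, headers=auth_headers(token))
```

The returned exporter can be passed to BatchSpanProcessor in place of the gRPC exporter; the resource and provider setup stay the same.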

Send spans

tracer = trace.get_tracer("my-rag-pipeline")

with tracer.start_as_current_span("document-preprocessing") as span:
    span.set_attribute("pipeline.stage", "preprocessing")
    span.set_attribute("document.count", 42)
    # ... your processing code ...

Correlate with gateway traces

To place your custom spans in the same trace as gateway LLM calls, propagate the W3C traceparent header. See Distributed Tracing for details.
import requests
from opentelemetry.propagate import inject

headers = {}
inject(headers)  # Adds traceparent to headers dict

# Pass these headers to the gateway
response = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={
        **headers,
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={"model": "gpt-4o", "messages": [...]},
)
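When spans do not line up in one trace, it can help to inspect the traceparent value that was actually sent. The sketch below parses the version 00 format defined by W3C Trace Context (version, trace ID, parent ID, and flags as lowercase hex, separated by hyphens). The function name parse_traceparent is hypothetical, and the check is deliberately strict about version 00 only.

```python
import re

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(value):
    """Split a traceparent header into its fields, raising ValueError if malformed."""
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError(f"invalid traceparent: {value!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

For example, parse_traceparent(headers["traceparent"]) gives the trace ID to search for in your tracing UI. If the key is missing from the dict, a likely cause is that inject() was called with no active span, so it is worth calling it inside a start_as_current_span block.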

Collector pipeline

When spans arrive at the collector, they pass through several stages before reaching storage:
  1. OTLP receivers — Spans enter via gRPC (:4317) or HTTP (:4318).
  2. OIDC auth — The collector validates the JWT and extracts project.id from the sub claim.
  3. Memory limiter — Caps collector memory at 512 MB to protect against burst traffic.
  4. Batch processor — Groups spans for efficient writes (1 s timeout, 2048 batch size).
  5. Filter — Drops any spans that lack a project.id resource attribute.
  6. ClickHouse exporter — Writes to the otel_traces table with a 90-day TTL.
Your App (OTel SDK)
    |
    v
OTLP Receiver (gRPC :4317 / HTTP :4318)
    |
    v
OIDC Auth ──> reject if JWT invalid
    |
    v
Memory Limiter (512 MB)
    |
    v
Batch Processor (2048 spans / 1s)
    |
    v
Filter ──> drop if no project.id
    |
    v
ClickHouse (otel_traces, 90-day TTL)
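Two of these stages can be approximated on the client before any data is sent: the auth stage reads the JWT, and the filter stage requires project.id. The following is a rough preflight sketch, not the collector's logic. It decodes the token payload without verifying the signature, and it assumes the token carries the standard exp claim alongside sub; the function names jwt_claims and preflight are hypothetical.

```python
import base64
import json
import time


def jwt_claims(token):
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if not isinstance(claims, dict):
        raise ValueError("JWT payload is not a JSON object")
    return claims


def preflight(resource_attrs, token, now=None):
    """List problems that would likely get spans rejected or dropped."""
    now = time.time() if now is None else now
    problems = []
    project_id = resource_attrs.get("project.id")
    if not project_id:
        problems.append("resource has no project.id (dropped by the filter stage)")
    try:
        claims = jwt_claims(token)
    except ValueError as exc:
        problems.append(f"token is not a readable JWT: {exc}")
        return problems
    # exp is a standard JWT claim; its presence in to11 tokens is an assumption.
    if "exp" in claims and claims["exp"] <= now:
        problems.append("token has expired (rejected by the auth stage)")
    sub = claims.get("sub")
    if project_id and sub and sub != project_id:
        problems.append("project.id differs from the token's sub claim")
    return problems
```

An empty list from preflight(resource_attributes, token) means none of these checks failed; it does not guarantee that the collector accepts the data, since signature validation happens only on the server.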
For the full observability stack setup (Grafana, Tempo, ClickHouse), see Self-Hosted Observability.