Direct Ingestion
The gateway instruments LLM calls automatically. For custom spans (non-LLM operations, internal pipeline steps, or trace enrichment from your own services), you can send OTLP data directly to the to11 OTel Collector.
When to use direct ingestion
- Custom spans for non-LLM operations: data preprocessing, post-processing, or any application logic you want visible alongside LLM calls.
- Enriching gateway traces with application-side timing: capture your full RAG pipeline end-to-end, including database queries and retrieval steps.
- Agent lifecycle spans: operations like `create_agent` or session management that are better emitted from the client than the gateway.
- Metrics from your own services: send application metrics alongside gateway metrics for a unified view.
Prerequisites
- A to11 API key with `otel:write` scope.
- An OpenTelemetry SDK in your language of choice.
Exchange an API key for a collector token
The collector authenticates via short-lived JWT tokens. Exchange your API key for a token before configuring your SDK.
Tokens expire after 15 minutes. Refresh by calling the exchange endpoint again before expiry.
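The refresh rule can be sketched as a small cache that re-exchanges the token shortly before the 15-minute lifetime runs out. This is a minimal sketch, not the to11 client: the `exchange` callable stands in for whatever HTTP call you make to the exchange endpoint (its URL and request shape are not shown here), and the one-minute safety margin is an assumption.

```python
import time
from typing import Callable, Optional

TOKEN_LIFETIME_S = 15 * 60  # tokens expire after 15 minutes
REFRESH_MARGIN_S = 60       # assumption: refresh one minute before expiry


class CollectorTokenCache:
    """Caches a collector JWT and re-exchanges it shortly before expiry."""

    def __init__(
        self,
        exchange: Callable[[], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Hypothetical hook: `exchange` calls the exchange endpoint with your
        # API key and returns the JWT as a string.
        self._exchange = exchange
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return a token that is still valid, exchanging a new one if needed."""
        age = self._clock() - self._fetched_at
        if self._token is None or age >= TOKEN_LIFETIME_S - REFRESH_MARGIN_S:
            self._token = self._exchange()
            self._fetched_at = self._clock()
        return self._token
```

Calling `cache.get()` each time you build exporter headers keeps the token fresh without a background thread. Injecting the clock makes the expiry logic easy to test.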
Configure your OTel SDK
The collector accepts OTLP on two endpoints:

| Protocol | Endpoint | Port |
|---|---|---|
| gRPC | localhost:4317 | 4317 |
| HTTP | localhost:4318/v1/traces | 4318 |
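Most OpenTelemetry SDKs read their exporter settings from the standard `OTEL_EXPORTER_OTLP_*` environment variables, so one way to point an SDK at the endpoints above is to generate those variables. The helper below is a sketch under two assumptions that the text above does not confirm: that the collector expects the JWT as an `Authorization: Bearer` header, and that the local endpoints use plain `http`. The function name `otlp_env` is illustrative.

```python
from typing import Dict
from urllib.parse import quote


def otlp_env(protocol: str, token: str, host: str = "localhost") -> Dict[str, str]:
    """Return standard OTel SDK environment variables for the chosen transport."""
    if protocol == "grpc":
        env = {
            "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
            "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{host}:4317",
        }
    elif protocol == "http":
        env = {
            "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
            # The traces-specific variable takes the full path, including /v1/traces.
            "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": f"http://{host}:4318/v1/traces",
        }
    else:
        raise ValueError(f"unsupported protocol: {protocol!r}")

    # Assumption: the collector reads the JWT from a bearer Authorization header.
    # Header values in this variable are percent-encoded key=value pairs.
    env["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=" + quote(f"Bearer {token}")
    return env
```

For example, `os.environ.update(otlp_env("http", token))` before initializing the SDK applies the settings to the current process. Because the token expires, a long-running service would need to rebuild the exporter headers after each refresh rather than set them once.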
Send spans
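An OpenTelemetry SDK normally builds and exports spans for you. To show what reaches the HTTP endpoint, the sketch below assembles a single span in the OTLP/JSON encoding and posts it with the standard library only. It assumes the collector accepts JSON-encoded OTLP on `/v1/traces` and a bearer `Authorization` header. The function names and the scope name `direct-ingestion-example` are illustrative.

```python
import json
import secrets
import urllib.request
from typing import Dict, Optional


def build_span_payload(
    name: str,
    service_name: str,
    start_ns: int,
    end_ns: int,
    attributes: Optional[Dict[str, str]] = None,
    trace_id: Optional[str] = None,
    parent_span_id: Optional[str] = None,
) -> dict:
    """Build an OTLP/JSON export request containing one INTERNAL span."""
    span = {
        "traceId": trace_id or secrets.token_hex(16),  # 32 hex characters
        "spanId": secrets.token_hex(8),                # 16 hex characters
        "name": name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        # 64-bit integers are carried as strings in the JSON encoding.
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [
            {"key": k, "value": {"stringValue": v}}
            for k, v in (attributes or {}).items()
        ],
    }
    if parent_span_id:
        span["parentSpanId"] = parent_span_id
    resource = {
        "attributes": [
            {"key": "service.name", "value": {"stringValue": service_name}}
        ]
    }
    scope = {"name": "direct-ingestion-example"}  # illustrative scope name
    return {
        "resourceSpans": [
            {"resource": resource, "scopeSpans": [{"scope": scope, "spans": [span]}]}
        ]
    }


def send_payload(payload: dict, token: str,
                 url: str = "http://localhost:4318/v1/traces") -> int:
    """POST the payload to the collector and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer auth
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A typical call wraps the operation with `time.time_ns()` before and after, then passes both timestamps to `build_span_payload`. In production code, an SDK exporter with batching and retries is usually the better choice.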
Correlate with gateway traces
To place your custom spans in the same trace as gateway LLM calls, propagate the W3C `traceparent` header. See Distributed Tracing for details.
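The `traceparent` header has the form `version-traceid-parentid-flags`, as defined by the W3C Trace Context specification. OpenTelemetry SDKs ship propagators that handle this for you. The sketch below only illustrates the mechanics of reusing an incoming trace ID for your own span, and the helper names are illustrative.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a traceparent header; return None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(trace_id, parent_id, flags)


def child_traceparent(parent: TraceParent) -> str:
    """Build a header for a new span that stays in the parent's trace."""
    new_span_id = secrets.token_hex(8)  # fresh 16-hex-character span ID
    return f"00-{parent.trace_id}-{new_span_id}-{parent.flags}"
```

A span created from the parsed value should use `trace_id` as its own trace ID and `parent_id` as its parent span ID, so it lines up with the other spans in that trace.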
Collector pipeline
When spans arrive at the collector, they pass through several stages before reaching storage:
- OTLP receivers: Spans enter via gRPC (`:4317`) or HTTP (`:4318`).
- OIDC auth: The collector validates the JWT and extracts `project.id` from the `sub` claim.
- Memory limiter: Caps collector memory at 512 MB to protect against burst traffic.
- Batch processor: Groups spans for efficient writes (1 s timeout, 2048 batch size).
- Filter: Drops any spans that lack a `project.id` resource attribute.
- ClickHouse exporter: Writes to the `otel_traces` table with a 90-day TTL.
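To make the batch and filter stages concrete, here is a simplified model of what happens to a stream of spans between the receiver and the exporter. It is an illustration only, not the collector's implementation: it ignores the 1 s timeout and the memory limiter, and it represents each span as a plain dictionary with a `resource` mapping, which is an assumed shape.

```python
from typing import Dict, Iterable, Iterator, List

Span = Dict[str, dict]
BATCH_SIZE = 2048  # batch size from the pipeline description


def batches(spans: Iterable[Span], size: int = BATCH_SIZE) -> Iterator[List[Span]]:
    """Group spans into lists of at most `size` items (timeout not modelled)."""
    batch: List[Span] = []
    for span in spans:
        batch.append(span)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def has_project_id(span: Span) -> bool:
    """True if the span's resource carries a project.id attribute."""
    return "project.id" in span.get("resource", {})


def exportable(spans: Iterable[Span]) -> Iterator[List[Span]]:
    """Yield the batches that would reach the exporter after filtering."""
    for batch in batches(spans):
        kept = [span for span in batch if has_project_id(span)]
        if kept:
            yield kept
```

The practical consequence is that a span without a `project.id` resource attribute never reaches storage, so missing spans are worth checking against this stage first.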
For the full observability stack setup (Grafana, Tempo, ClickHouse), see Self-Hosted Observability.