Self-Hosted Observability
The `docker compose` stack ships a complete observability backend. This guide shows you how to navigate it.
Architecture overview
to11 runs two independent telemetry pipelines. Each serves a different purpose and lands in a different backend.
| Pipeline | What it captures | Storage | Query UI |
|---|---|---|---|
| Application | HTTP request spans, middleware timing, error rates, container logs | Tempo + Loki + Mimir | Grafana (:3001) |
| GenAI | LLM call spans, token usage, content capture, tool calls | ClickHouse | Web app (:3000) or ClickHouse directly |
Services and ports
After running `docker compose -f docker-compose.production.yml --profile observability up -d`, these observability services are available (to pin a specific build, prefix the command with `IMAGE_TAG=sha-abc1234`):
| Service | URL | Purpose |
|---|---|---|
| Grafana | localhost:3001 | Dashboards, trace explorer, log viewer |
| Tempo | localhost:3200 | Trace storage (queried via Grafana) |
| Loki | localhost:3101 | Log aggregation (queried via Grafana) |
| Mimir | localhost:9009 | Metrics storage / Prometheus-compatible |
| Alloy | localhost:12345 | OTel collector + log shipper |
| ClickHouse | localhost:8123 | GenAI telemetry storage |
| OTel GenAI Collector | localhost:4317 / localhost:4318 | Receives GenAI OTLP, writes to ClickHouse |
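As a quick sanity check after startup, the services in the table can be probed from the host. The sketch below is not part of the stack: the ports come from the table, while the paths are each project's usual health or readiness endpoints and are assumed to be exposed unchanged here. The names `ENDPOINTS` and `check_observability` are illustrative.

```shell
# One "name url" pair per line. Ports are from the services table; the paths
# are the standard health endpoints of each project (an assumption for this stack).
ENDPOINTS='grafana http://localhost:3001/api/health
tempo http://localhost:3200/ready
loki http://localhost:3101/ready
mimir http://localhost:9009/ready
alloy http://localhost:12345/-/ready
clickhouse http://localhost:8123/ping'

# Print "ok" or "FAIL" for every endpoint; returns non-zero if any probe failed.
check_observability() {
  failed=0
  while read -r name url; do
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "ok   $name"
    else
      echo "FAIL $name ($url)"
      failed=1
    fi
  done <<EOF
$ENDPOINTS
EOF
  return "$failed"
}
```

Call `check_observability` once the containers have had a few seconds to start; some services report not-ready briefly after boot.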
Exploring platform traces in Grafana
Grafana ships with pre-configured data sources for Tempo, Loki, and Mimir. No setup required.
Open the Trace Explorer
- Open localhost:3001 (anonymous admin access is enabled)
- Click Explore in the left sidebar
- Select Tempo as the data source
Search by service
Use the Search tab to filter traces:
- Service Name: `gateway` or `to11-api`
- Span Name: filter by operation (e.g. `POST /v1/chat/completions`)
- Duration: find slow requests
- Status: filter for errors
TraceQL queries
Switch to the TraceQL tab for more powerful queries.
Reading a trace waterfall
Click any trace to open the waterfall view. You’ll see:
- Root span: the inbound HTTP request to the gateway
- Child spans: middleware processing, upstream provider call, response handling
- Span attributes: HTTP method, status code, URL path, timing details
- Events: errors, warnings, or other logged events within a span
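The span fields listed above (name, status, duration, service) are also what TraceQL filters on. As a minimal sketch, the two queries below can be pasted into the TraceQL tab, or sent to Tempo's search API with curl as shown. The 2s threshold is an arbitrary illustration, and `tempo_search` is a helper name invented for this example.

```shell
# Tempo base URL from the services table.
TEMPO_URL="http://localhost:3200"

# Failed spans emitted by the gateway service.
Q_ERRORS='{ resource.service.name = "gateway" && status = error }'

# Chat completion requests slower than 2 seconds (threshold is illustrative).
Q_SLOW='{ name = "POST /v1/chat/completions" && duration > 2s }'

# Run a TraceQL query through Tempo's search API and print the JSON response.
tempo_search() {
  curl -sG "$TEMPO_URL/api/search" \
    --data-urlencode "q=$1" \
    --data-urlencode "limit=20"
}
```

For example, `tempo_search "$Q_ERRORS"` returns matching trace IDs that can then be opened in Grafana.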
Traces to logs
The Grafana Tempo data source is configured with Traces-to-Logs linking. When viewing a trace:
- Click the Logs for this span button in the trace detail panel
- This jumps to Loki with the `trace_id` pre-filled, showing all log lines emitted during that trace
Exploring logs in Grafana
- In Explore, select the Loki data source
- Use LogQL to query:
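A minimal sketch of what such LogQL queries might look like, run here against Loki's HTTP API rather than the Grafana UI. The `service_name` label is an assumption: the labels Alloy attaches in this stack may differ, so check the label browser in Explore first. `logql_for` and `loki_query` are hypothetical helper names.

```shell
# Loki base URL from the services table.
LOKI_URL="http://localhost:3101"

# Build a LogQL selector for one service, optionally with a substring filter.
# The "service_name" label is assumed, not confirmed for this stack.
logql_for() {
  service=$1
  needle=${2:-}
  if [ -n "$needle" ]; then
    printf '{service_name="%s"} |= "%s"\n' "$service" "$needle"
  else
    printf '{service_name="%s"}\n' "$service"
  fi
}

# Send a query to Loki. With no start/end given, Loki searches the last hour.
loki_query() {
  curl -sG "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=50"
}
```

For example, `loki_query "$(logql_for gateway error)"` fetches recent gateway lines containing "error". Passing a trace ID as the second argument gives a rough equivalent of the Traces-to-Logs jump.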
Exploring metrics in Grafana
- In Explore, select the Mimir data source
- Use PromQL to query span-derived metrics:
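As a hedged sketch, the PromQL below computes a per-service request rate and a p95 latency from span metrics. The metric names (`traces_spanmetrics_calls_total`, `traces_spanmetrics_latency_bucket`) and the `service` label follow Tempo's metrics-generator naming and are assumptions; the names produced by this stack may differ, so browse the metric list in Explore first. The curl helper assumes Mimir's default `/prometheus` API prefix and that no tenant header is required.

```shell
# Mimir base URL from the services table.
MIMIR_URL="http://localhost:9009"

# Requests per second by service over 5 minutes (metric and label names assumed).
Q_RATE='sum by (service) (rate(traces_spanmetrics_calls_total[5m]))'

# 95th percentile span latency by service (histogram name assumed).
Q_P95='histogram_quantile(0.95, sum by (le, service) (rate(traces_spanmetrics_latency_bucket[5m])))'

# Run an instant query against Mimir's Prometheus-compatible API.
mimir_query() {
  curl -sG "$MIMIR_URL/prometheus/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `mimir_query "$Q_RATE"`. If Mimir runs with multi-tenancy enabled, the request would also need an `X-Scope-OrgID` header.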
Querying GenAI telemetry in ClickHouse
GenAI spans (LLM calls, token counts, content capture) go to ClickHouse via the OTel GenAI Collector.
Using the Web UI
The to11 web app at localhost:3000 provides a traces page per project. Navigate to a project and open the Traces tab to see:
- Trace list with provider, model, tokens, cost, and status
- Trace detail view with span waterfall and attributes
Querying ClickHouse directly
Connect to ClickHouse at `localhost:8123` (user: `otel`, password: `otel`):
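Port 8123 is ClickHouse's HTTP interface, so plain curl is enough for ad-hoc queries. The sketch below is illustrative: the `otel_traces` table and its columns assume the default schema of the OpenTelemetry ClickHouse exporter, which this stack may not use. Run `SHOW TABLES` first to find the real names. `ch_query` is a hypothetical helper.

```shell
# ClickHouse HTTP endpoint from the services table.
CLICKHOUSE_URL="http://localhost:8123"

# Send one SQL statement to ClickHouse over HTTP as the otel user.
ch_query() {
  curl -s -u otel:otel "$CLICKHOUSE_URL/" --data-binary "$1"
}

# Ten most recent spans. Table and column names are assumptions based on the
# default OpenTelemetry ClickHouse exporter schema.
Q_RECENT='SELECT Timestamp, ServiceName, SpanName, Duration
FROM otel_traces
ORDER BY Timestamp DESC
LIMIT 10
FORMAT PrettyCompact'
```

Start with `ch_query 'SHOW TABLES'`, then adjust `Q_RECENT` to the tables that actually exist before running `ch_query "$Q_RECENT"`.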
Host-machine development (no Docker gateway)
When running the Rust gateway on your host machine instead of in Docker (for example `cargo run -p gateway`), the infra services still run in Docker. The port mappings differ slightly:
| Signal | Docker-to-Docker endpoint | Native (host) endpoint |
|---|---|---|
| App traces (Alloy) | alloy:4317 | localhost:14317 |
| App telemetry HTTP | alloy:4318 | localhost:14318 |
| GenAI OTLP (Collector) | otel-genai-collector:4317 | localhost:4317 |
Use `docker/gateway/config.dev.toml`, which has these host endpoints pre-configured.
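Before starting the gateway natively, it can help to confirm that the host-mapped OTLP ports from the table are actually listening. This is a small sketch that assumes `nc` (netcat) is installed on the host; `port_open` and `check_host_ports` are names made up for the example.

```shell
# Host-side OTLP ports from the table: Alloy gRPC, Alloy HTTP, GenAI collector gRPC.
HOST_PORTS="14317 14318 4317"

# Succeeds if something accepts TCP connections on localhost:<port>.
# -z probes without sending data; -w 2 gives up after two seconds.
port_open() {
  nc -z -w 2 localhost "$1" >/dev/null 2>&1
}

# Report the state of every host-mapped port.
check_host_ports() {
  for port in $HOST_PORTS; do
    if port_open "$port"; then
      echo "open   $port"
    else
      echo "closed $port"
    fi
  done
}
```

Run `check_host_ports` after the infra containers are up. A closed port usually means the corresponding container is not running or the mapping differs from the table.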
Troubleshooting
No traces appearing in Tempo
- Verify Alloy is running: `curl http://localhost:12345` should return the Alloy UI
- Check that `[telemetry].enabled = true` is set in the gateway config
- Look at Alloy logs: `docker compose -f docker-compose.production.yml logs alloy`
No GenAI spans in ClickHouse
- Verify the OTel GenAI Collector is healthy: `curl http://localhost:13133/`
- Check collector logs: `docker compose -f docker-compose.production.yml logs otel-genai-collector`
- Confirm ClickHouse is accepting writes: `docker compose -f docker-compose.production.yml logs clickhouse`