Architecture
Routing
The gateway resolves each request top-down: a function match wins first, then a named route match, then a direct provider match; otherwise the request passes straight through. Functions are managed by to11, so the two modes you set up yourself are passthrough and named routes. You opt into managed routing as you need it.

| Mode | How you call it | What the gateway does |
|---|---|---|
| Passthrough | A bare model name, or provider::model | Proxies straight through to a provider |
| Managed routing | A named route, called as route::<name> | Applies a routing strategy — direct, weighted split, fallback chain, or A/B experiment — across your connected providers |
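
The resolution order described above can be sketched in a few lines of Python. This is an illustrative sketch, not the gateway's implementation: `GatewayConfig`, `resolve`, and the returned mode labels are hypothetical names, and it assumes a function is matched by name against the request's model string, which the text above does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class GatewayConfig:
    # Hypothetical registries; the gateway's real data model is not documented here.
    functions: set[str] = field(default_factory=set)
    routes: set[str] = field(default_factory=set)
    providers: set[str] = field(default_factory=set)


def resolve(model: str, config: GatewayConfig) -> tuple[str, str]:
    """Return (mode, target) for the model string of an incoming request."""
    # 1. A function match wins first (assumed here to be a match by name).
    if model in config.functions:
        return ("function", model)

    prefix, sep, rest = model.partition("::")

    # 2. Then a named route, called as route::<name>.
    if sep and prefix == "route" and rest in config.routes:
        return ("route", rest)

    # 3. Then a direct provider match, called as provider::model.
    if sep and prefix in config.providers:
        return ("provider", model)

    # 4. Otherwise the request passes straight through.
    return ("passthrough", model)
```

With a config that registers a route named `prod-chat` and a provider named `anthropic`, `resolve("route::prod-chat", config)` returns `("route", "prod-chat")`, while a bare model name that matches nothing falls through to `("passthrough", ...)`.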
Key capabilities
Cross-format routing
The gateway accepts requests in OpenAI, Anthropic, or xAI format and routes them to any connected provider. The response is returned in the caller’s native format, so you can send an OpenAI-format request, route it to Anthropic’s Claude, and get an OpenAI-format response back without changing your application.
Streaming
When the caller’s format matches the upstream provider, streamed chunks are forwarded with zero-copy passthrough. When the formats differ, the stream is normalized and re-serialized into the caller’s format.
Inline security
A security pipeline runs in the request path before anything is forwarded upstream. Today it performs blocklist matching and PII detection; ML-based prompt-injection detection and content moderation are planned. Blocked requests are rejected and never reach the provider.
Telemetry
Every call produces an OpenTelemetry GenAI span that records the model, tokens, latency, finish reason, time-to-first-token for streams, and, optionally, prompt and completion content. See Instrument and Observe.
Next steps
Routing
Passthrough, direct, weighted, fallback, and A/B experiments.
Concepts
Providers, models, targets, and routes.
Environments
Separate production from staging.
Observe
See what each request did in production.