Routing Overview
The gateway resolves every inbound request through a three-layer routing model. Each layer is independently useful and backward compatible; you opt in to higher layers as your deployment needs them.
Three-layer routing model
| Layer | Routing key | Config | Gateway owns |
|---|---|---|---|
| L1: Passthrough | Model name | [providers.*] only | Nothing — proxies directly |
| L2: Managed routing | Model name | [routes.*] + [targets.*] | Credentials + routing strategy |
| L3: Function routing | Function name | [functions.*] | Credentials + routing + task abstraction |
With L1, any provider that declares a models array can be used immediately: the caller supplies their own API key and the gateway forwards it. L2 adds gateway-owned credentials and traffic distribution strategies (single, weighted, fallback) for specific models. L3 introduces a task-level indirection layer where callers request a function name (e.g. summarize) instead of a model name.
Resolution flow
When a request arrives, the gateway checks for an explicit namespace prefix first. If none is present, it resolves top-down through the three layers: function first, then managed route, then provider passthrough. A model can be declared under both [providers.*] and [routes.*]; managed routing takes precedence when configured. Similarly, a function name that collides with a model name resolves to the function first.
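The top-down order described above can be sketched as a small lookup. This is only an illustration: the `CONFIG` dictionary, the `resolve` function, and the idea that routes are matched by name are assumptions made for the example, not the gateway's actual data model.

```python
from typing import Optional, Tuple

# Hypothetical in-memory view of the config tables. The real gateway reads
# [functions.*], [routes.*] and [providers.*] from TOML; this layout is assumed.
CONFIG = {
    "functions": {"summarize"},
    "routes": {"gpt-4o"},
    "providers": {"openai": {"gpt-4o", "gpt-4o-mini", "summarize"}},
}


def resolve(name: str, config: dict = CONFIG) -> Optional[Tuple[str, str]]:
    """Resolve an unprefixed name to (layer, key), or None if nothing matches."""
    # L3 first: a function shadows a model with the same name.
    if name in config["functions"]:
        return ("L3", name)
    # L2 next: a managed route takes precedence over passthrough.
    if name in config["routes"]:
        return ("L2", name)
    # L1 last: the first provider that lists the model.
    for provider, models in config["providers"].items():
        if name in models:
            return ("L1", f"{provider}::{name}")
    return None
```

In this sketch, `summarize` resolves to the function even though a provider also lists it, and `gpt-4o` resolves to the managed route rather than to passthrough.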
Namespace prefixes
Explicit prefixes let callers target a specific routing layer, bypassing top-down resolution.
| Prefix | Layer | Example | Resolution |
|---|---|---|---|
| function::name | L3 | function::summarize | Direct function lookup |
| route::name | L2 | route::balanced-gpt4o | Direct route lookup |
| provider::model | L1 | openai::gpt-4o | Direct provider lookup |
| (no prefix) | Auto | gpt-4o or summarize | Top-down resolution |
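A rough sketch of how the prefixes in the table could be parsed follows. The function name `parse_prefix` and its return shape are hypothetical. The sketch also assumes that any prefix other than `function` or `route` is a provider name, which matches the `openai::gpt-4o` example but is not confirmed beyond it.

```python
from typing import Optional, Tuple


def parse_prefix(name: str) -> Tuple[str, Optional[str], str]:
    """Split a routing key into (layer, provider, key).

    layer is "L3", "L2", "L1", or "auto" when no prefix is present.
    provider is only set for L1 lookups.
    """
    prefix, sep, rest = name.partition("::")
    if not sep:
        # No "::" found: fall back to top-down resolution.
        return ("auto", None, name)
    if prefix == "function":
        return ("L3", None, rest)
    if prefix == "route":
        return ("L2", None, rest)
    # Assumption: anything else is treated as a provider name.
    return ("L1", prefix, rest)
```

The sketch splits on the first `::` only, so how the gateway treats names that contain further `::` separators is left open here.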
Explicit prefixes prevent accidental shadowing. Without a prefix, adding a function named summarize would shadow a provider model with the same name. Prefixes make the routing layer visible to the caller.
Endpoint support
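All three routing layers, and prefix parsing, support the six inference endpoint types. Count tokens and Files are available through L1 passthrough only.
| Endpoint | L1 Passthrough | L2 Managed Route | L3 Function | Prefix Parsing |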
|---|---|---|---|---|
| Chat completions | ✅ | ✅ | ✅ | ✅ |
| Responses / Messages | ✅ | ✅ | ✅ | ✅ |
| Embeddings | ✅ | ✅ | ✅ | ✅ |
| Audio speech | ✅ | ✅ | ✅ | ✅ |
| Audio transcription | ✅ | ✅ | ✅ | ✅ |
| Image generation | ✅ | ✅ | ✅ | ✅ |
| Count tokens | ✅ | — | — | — |
| Files | ✅ | — | — | — |
Functions and routes have an endpoint field that declares which endpoint type they serve. Sending a request to the wrong endpoint returns 400.
Endpoint types
Functions and routes declare which endpoint they serve via the endpoint field:
| Value | HTTP Endpoint | Example models |
|---|---|---|
| chat | /v1/chat/completions | gpt-4o, claude-sonnet-4-6 |
| embeddings | /v1/embeddings | text-embedding-3-small |
| audio_speech | /v1/audio/speech | tts-1, tts-1-hd |
| audio_transcription | /v1/audio/transcriptions | whisper-1 |
| image_generation | /v1/images/generations | dall-e-3 |
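The mapping in the table lends itself to a simple mismatch check. The sketch below is illustrative only: `ENDPOINT_PATHS` copies the table, while `mismatch_status` is a hypothetical helper and not part of the gateway.

```python
from typing import Optional

# Endpoint values and paths, copied from the table above.
ENDPOINT_PATHS = {
    "chat": "/v1/chat/completions",
    "embeddings": "/v1/embeddings",
    "audio_speech": "/v1/audio/speech",
    "audio_transcription": "/v1/audio/transcriptions",
    "image_generation": "/v1/images/generations",
}


def mismatch_status(declared_endpoint: str, request_path: str) -> Optional[int]:
    """Return 400 when the request path does not match the declared endpoint type.

    Returns None when the request is acceptable. Hypothetical helper.
    """
    expected = ENDPOINT_PATHS.get(declared_endpoint)
    if expected != request_path:
        return 400
    return None
```

For example, a function declared with the chat endpoint would be rejected with 400 if it were called through /v1/embeddings.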
The endpoint field is required. The gateway rejects functions and routes without it at startup.
Routing strategies
| Strategy | Description | Available on |
|---|---|---|
| single | Route to one target | Routes, Functions |
| weighted | Random selection by weight | Routes, Functions |
| fallback | Sequential failover with retry | Routes, Functions |
| experiment | A/B test with weighted variants | Functions only |
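As an illustration of "random selection by weight", here is a minimal sketch that uses Python's standard `random.choices`. The `(name, weight)` target tuples and the `pick_weighted` helper are assumptions made for the example; the gateway's own selection code may differ.

```python
import random
from typing import Sequence, Tuple


def pick_weighted(targets: Sequence[Tuple[str, float]], rng=random) -> str:
    """Pick one target name with probability proportional to its weight.

    targets is a sequence of (name, weight) pairs. The shape is hypothetical.
    rng can be the random module or a random.Random instance.
    """
    names = [name for name, _ in targets]
    weights = [weight for _, weight in targets]
    # random.choices returns a list of k picks; take the single pick.
    return rng.choices(names, weights=weights, k=1)[0]


# Example: roughly 80% of requests go to "primary", 20% to "secondary".
target = pick_weighted([("primary", 80), ("secondary", 20)])
```

Nothing in this selection looks at whether a target is healthy, so a failing target keeps being picked at its configured share.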
Utility — no routing
| Endpoint | Purpose |
|---|---|
| GET /v1/models | Synthetic model list from config |
| GET /health | Health check |
Failover and retry
The gateway uses fast sequential failover combined with retry and exponential backoff. The fallback strategy tries targets in declaration order: if a target fails (connection error or 5xx), the next target is attempted immediately. When all targets have been tried, the first target is retried once more as a degraded-mode fallback.
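The attempt order described above (each target once in declaration order, then the first target one more time) can be sketched as follows. `UpstreamError`, `call_with_fallback`, and the `send` callback are hypothetical names. The sketch deliberately leaves out backoff delays, since how they interleave with failover is not specified here.

```python
from typing import Callable, Optional, Sequence


class UpstreamError(Exception):
    """Stands in for a connection error or a 5xx response (hypothetical)."""


def call_with_fallback(targets: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each target in declaration order, then retry the first one once."""
    if not targets:
        raise ValueError("at least one target is required")
    # Each target once, followed by a final degraded-mode retry of the first.
    attempts = list(targets) + [targets[0]]
    last_error: Optional[UpstreamError] = None
    for target in attempts:
        try:
            return send(target)
        except UpstreamError as exc:
            # Move on to the next target immediately.
            last_error = exc
    assert last_error is not None
    raise last_error
```

With targets a and b both failing, the attempts in this sketch are a, b, a, after which the last error is raised to the caller.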
Configure [routing.retry] for automatic retries with exponential backoff.
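To show what an exponential backoff schedule looks like, here is a small sketch. The parameter names (`base_delay`, `factor`, `max_delay`) and their defaults are hypothetical and are not the keys of [routing.retry]; consult the configuration reference for the actual settings.

```python
from typing import List


def backoff_delays(
    max_retries: int,
    base_delay: float = 0.5,
    factor: float = 2.0,
    max_delay: float = 5.0,
) -> List[float]:
    """Return the delay in seconds before each retry.

    The delay grows by `factor` each time and is capped at `max_delay`.
    All parameter names and defaults are illustrative assumptions.
    """
    return [
        min(base_delay * factor ** attempt, max_delay)
        for attempt in range(max_retries)
    ]
```

With these assumed defaults, five retries would wait 0.5, 1, 2, 4, and then 5 seconds, the last value being limited by the cap.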
Scope and limitations
- No model remapping. The model field in the request body is forwarded as-is to the upstream provider. If your target uses a different model name upstream, you must use the upstream name in the target configuration.
- No health-aware weighted routing. The weighted strategy does not consult target health. An unhealthy target continues to receive its proportional share of traffic. Use the fallback strategy for automatic failover.
Next steps
Passthrough
How L1 passthrough mode works.
Simple Routing
Set up gateway-owned credentials for a model.
Providers
Supported providers and format translation.
Configuration
Full TOML reference for all gateway settings.