Routes
A route is a named entry in [routes.*] that maps one or more model names to a set of targets with a selection strategy. When a caller sends a request for a model that matches a route, the gateway selects a target according to the strategy and proxies the request with gateway-owned credentials.
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| endpoint | string | Yes | Endpoint type: chat, embeddings, audio_speech, audio_transcription, or image_generation. |
| models | array | Yes | Model names this route handles. A request whose model field matches any entry is routed here. |
| strategy | string | Yes | Selection strategy: single, weighted, or fallback. |
| targets | array | Yes* | Target names for simple (single-step) routes. Mutually exclusive with steps. |
| steps | array | No | Multi-step fallback chain. Mutually exclusive with targets; only one may be specified. |
| retry | table | No | Per-route retry configuration. Overrides the global [routing.retry] setting. |
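The field rules in the table can be checked mechanically. The following is a minimal sketch, not the gateway's own validation: the Route class and validate_route function are hypothetical names, and treating "neither targets nor steps" as an error is an assumption based on the asterisk on targets.

```python
from dataclasses import dataclass, field

# Allowed values copied from the fields table.
ENDPOINTS = {"chat", "embeddings", "audio_speech", "audio_transcription", "image_generation"}
STRATEGIES = {"single", "weighted", "fallback"}


@dataclass
class Route:
    """Hypothetical in-memory form of one [routes.*] entry."""
    endpoint: str
    models: list
    strategy: str
    targets: list = field(default_factory=list)
    steps: list = field(default_factory=list)


def validate_route(route: Route) -> list:
    """Return a list of problems; an empty list means the route passes these checks."""
    problems = []
    if route.endpoint not in ENDPOINTS:
        problems.append(f"unknown endpoint: {route.endpoint}")
    if route.strategy not in STRATEGIES:
        problems.append(f"unknown strategy: {route.strategy}")
    # targets and steps are mutually exclusive.
    # Assumption: a route with neither is also rejected.
    if bool(route.targets) == bool(route.steps):
        problems.append("specify exactly one of targets or steps")
    return problems
```

A route that sets both targets and steps yields a single problem, while a route with only targets yields none.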
Strategies
Each route declares a strategy that controls how the gateway picks a target.
Single
Weighted
Distributes requests across targets in proportion to each target's weight. Useful for gradual migrations (shift 10% of traffic to a new provider) or cost optimisation (route most traffic to a cheaper endpoint). Weights are relative integers: 70 and 30 produce the same distribution as 7 and 3.
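Relative weights can be illustrated with a short sketch. It is not the gateway's implementation: normalised and pick_weighted are hypothetical helpers, and Python's random.choices stands in for whatever sampler the gateway uses.

```python
import random


def normalised(weights):
    """Convert relative integer weights into probabilities that sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_weighted(weights, rng=random):
    """Pick one target name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# 70/30 and 7/3 describe the same distribution.
assert normalised({"new": 70, "old": 30}) == normalised({"new": 7, "old": 3})
```

Because only the ratio matters, shifting traffic gradually means changing the weights relative to each other, not their absolute size.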
Fallback
The fallback strategy does not maintain health state between requests. Each request starts from the first target in the list. For persistent provider outages, this means the first target is attempted (and fails) on every request before the fallback is reached. This trade-off keeps the routing layer stateless and predictable.
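The stateless behaviour described above can be sketched as a loop that always starts from the first target. This is an illustration under assumptions: route_with_fallback and the send callback are hypothetical, and any raised exception is treated as a failed target.

```python
def route_with_fallback(targets, send):
    """Try targets in list order and return (target, response) for the first success.

    No health state is kept between calls, so every call starts again
    from the first target, even if it failed on the previous request.
    """
    last_error = None
    for target in targets:
        try:
            return target, send(target)
        except Exception as exc:  # assumption: any exception counts as a target failure
            last_error = exc
    raise RuntimeError("all targets failed") from last_error
```

With a primary target that is down, two consecutive requests each attempt the primary first and then the backup, which is the trade-off noted above.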
Multi-step routes
For more sophisticated failover, routes support steps, an ordered list where each step defines its own strategy and targets. Steps execute as a fallback chain: if all targets in step 1 are exhausted, step 2 is tried.
For example, a route can first distribute traffic across openai-east and openai-west by weight. If both fail, it falls back to azure-fallback.
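A step chain like this one can be sketched as nested loops. Several details are assumptions rather than documented behaviour: the dictionary shape of a step, the name and weight keys, and the rule that a weighted step tries its weighted pick first and then its remaining targets.

```python
import random


def candidates(step, rng):
    """Return the order in which one step's targets are attempted."""
    names = [t["name"] for t in step["targets"]]
    if step["strategy"] == "single":
        return names[:1]
    if step["strategy"] == "weighted":
        # Assumption: pick by weight first, then try the remaining targets.
        weights = [t.get("weight", 1) for t in step["targets"]]
        first = rng.choices(names, weights=weights, k=1)[0]
        return [first] + [n for n in names if n != first]
    return names  # fallback: list order


def run_steps(steps, send, rng=None):
    """Run steps as a fallback chain; move to the next step once one is exhausted."""
    rng = rng or random.Random()
    for step in steps:
        for name in candidates(step, rng):
            try:
                return name, send(name)
            except Exception:  # assumption: any exception counts as a target failure
                continue
    raise RuntimeError("all steps exhausted")
```

If send fails for both openai-east and openai-west, run_steps reaches the second step and returns the response from azure-fallback.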
Per-route retry
Each route can override the global retry configuration. When retry is absent from a route, the global [routing.retry] configuration applies. When that is also absent, the defaults are max_retries = 2 and backoff_base_ms = 500.
Retries use exponential backoff: the delay before attempt n is backoff_base_ms * 2^(n-1) milliseconds.
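The backoff formula is easy to check with a few lines of code. This sketch assumes that n counts retries starting at 1; backoff_delay_ms and retry_schedule are hypothetical helper names, and the default arguments mirror the documented defaults.

```python
def backoff_delay_ms(attempt, backoff_base_ms=500):
    """Delay before attempt n: backoff_base_ms * 2^(n-1) milliseconds."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    return backoff_base_ms * 2 ** (attempt - 1)


def retry_schedule(max_retries=2, backoff_base_ms=500):
    """List the delay before each retry, in order."""
    return [backoff_delay_ms(n, backoff_base_ms) for n in range(1, max_retries + 1)]


print(retry_schedule())  # [500, 1000]
```

With the defaults, the two retries wait 500 ms and 1000 ms, so a request that fails every time spends 1.5 seconds in backoff in addition to the time taken by the attempts themselves.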
How routes fit into the routing hierarchy
Routes are the second layer (L2) in the gateway’s routing system.
Multi-endpoint routes
Routes support all endpoint types via the endpoint field. For example, when a request for text-embedding-3-small hits /v1/embeddings, an embeddings route that lists that model matches, and the gateway uses its managed credential instead of the caller’s key.
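Matching on both the endpoint type and the model name can be sketched as a simple lookup. The find_route helper, the dictionary layout, and the route names are hypothetical. Exact string comparison is an assumption, and what the gateway does when no route matches is outside this sketch.

```python
def find_route(routes, endpoint, model):
    """Return the name of the first route whose endpoint and models both match, else None."""
    for name, route in routes.items():
        if route["endpoint"] == endpoint and model in route["models"]:
            return name
    return None


routes = {
    "chat-default": {"endpoint": "chat", "models": ["gpt-4o"]},
    "embed-default": {"endpoint": "embeddings", "models": ["text-embedding-3-small"]},
}

# The embeddings request matches the embeddings route.
assert find_route(routes, "embeddings", "text-embedding-3-small") == "embed-default"
# The chat route for gpt-4o does not match an embeddings request for the same model.
assert find_route(routes, "embeddings", "gpt-4o") is None
```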
Routes match by both model name AND endpoint type. A chat route for gpt-4o does not match embedding requests for the same model.
Next steps
Functions
Named task aliases that decouple caller intent from model choice.
Targets
How targets pair models with gateway-owned credentials.
Routing Overview
Full routing system design: managed vs passthrough, resolution flow.