Weighted Routing
To distribute traffic across providers or API keys, configure a route with the weighted strategy. The gateway selects a target at random, with probability proportional to each target’s weight value.
Use cases
- Cost optimisation — send most traffic to a cheaper endpoint, overflow to a premium one.
- Gradual migration — shift 10 % of traffic to a new provider and increase the share over time.
- Load distribution — spread requests across multiple API keys for the same model to avoid rate limits.
Prerequisites
- At least two providers or two API keys for the same model.
- Each provider declared in a [providers.*] block with the model in its models list.
Full configuration example
The following config splits gpt-4o traffic 70/30 between OpenAI and Azure OpenAI.
How weights work
The gateway uses weighted random selection. Each request picks a target with probability proportional to its weight relative to the total.
Default weight
When weight is omitted from a target, it defaults to 1. Two targets without explicit weights receive a 50/50 distribution.
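The selection rule and the default weight described above can be illustrated with a small Python sketch. This is not the gateway's implementation. The target names (openai, azure), the dictionary shape, and the helper functions are hypothetical and exist only to show the arithmetic: each target's probability is its weight divided by the sum of all weights, and a missing weight is treated as 1.

```python
import random
from collections import Counter

# Per the text above: a target without an explicit weight defaults to 1.
DEFAULT_WEIGHT = 1


def selection_probabilities(targets):
    """Return each target's selection probability: weight / total weight."""
    weights = [t.get("weight", DEFAULT_WEIGHT) for t in targets]
    total = sum(weights)
    return {t["name"]: w / total for t, w in zip(targets, weights)}


def pick_target(targets, rng=random):
    """Pick one target name at random, proportional to its weight."""
    weights = [t.get("weight", DEFAULT_WEIGHT) for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]["name"]


if __name__ == "__main__":
    # Hypothetical targets mirroring a 70/30 split.
    weighted = [
        {"name": "openai", "weight": 70},
        {"name": "azure", "weight": 30},
    ]
    # Hypothetical targets with no explicit weights (both default to 1).
    unweighted = [{"name": "openai"}, {"name": "azure"}]

    print(selection_probabilities(weighted))    # openai 0.7, azure 0.3
    print(selection_probabilities(unweighted))  # openai 0.5, azure 0.5

    # Simulate many requests; the observed share approaches the weights
    # but varies from run to run because selection is random.
    rng = random.Random()
    counts = Counter(pick_target(weighted, rng) for _ in range(10_000))
    print(counts)
```

Because selection is random per request, a small sample can deviate noticeably from the configured ratio. The split only converges toward 70/30 over a large number of requests.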
Testing it
Send a few requests and observe the gateway logs to see which target is selected on each attempt.
Limitations
Next steps
Fallback Routing
Automatic failover when a provider is down.
Routing Overview
Managed vs passthrough routing and the resolution flow.
Configuration
Full TOML reference for all gateway settings.