What a route matches on
A route matches on two things together:- A model name — the value of the
modelfield in the incoming request. - An endpoint kind —
chat,embeddings,audio_speech,audio_transcription, orimage_generation.
gpt-4o does not catch an embeddings request for the same model name.
Strategies
When you create a routing rule, you choose how to11 picks among its targets. Routes offer four strategies:Direct routing
A single target. Every matching request goes to the same model-and-credential pairing. This is what you get when a rule has exactly one target.Weighted split
Two or more targets, chosen at random in proportion to their weights. Useful for a gradual migration (shift a slice of traffic to a new provider) or cost shaping (send most traffic to a cheaper endpoint). In the dashboard you enter each target’s share as a percentage —70 and 30 send roughly seven requests to the first target for every three to the second.
Weighted split chooses purely by weight; it does not track the health of a target between requests. If you want failover when a target is unreachable, use a Fallback chain.
Fallback chain
Two or more targets tried in the order you declare them. to11 sends to the first target; on a connection error or a5xx response it moves to the next, and so on down the chain.
A/B experiment
Two or more model variants, each taking a share of traffic, so you can compare them on live requests. See A/B experiment.Where routes are built
You build routes in the dashboard’s AI Gateway → Routing section, where a rule pairs a model name and endpoint kind with a strategy and its targets. For step-by-step walkthroughs, see the routing how-tos: Direct routing, Weighted split, Fallback chain, and A/B experiment.Retries
When a request fails in a way to11 can retry, the gateway retries internally — by default up to 2 retries with a 500 ms base backoff that grows on each attempt. Retries happen inside the gateway; you do not configure them per request.How routes fit into routing
A route is the managed-routing layer (L2). to11 resolves an incoming request top-down:- Function (L3) — a named task (function) is matched first.
- Route (L2) — then a route, using a stored credential.
- Provider (L1) — then a direct provider lookup, passthrough with the caller’s own key.
404.
To call a route by a stable name and swap the model behind it without changing your application, give the route a routing identifier and invoke it as
route::<name>. See Route by a stable name.Example request
A route is invisible to the caller — you send the same request you would for any model, and to11 matches it to the rule:chat route matches gpt-4o, to11 selects one of its targets and uses the stored credential; otherwise the request falls through to passthrough.
Next steps
Functions
Named task aliases that decouple caller intent from model choice.
Targets
How targets pair a model with a stored credential.
Routing overview
Managed routing, passthrough, and the resolution flow.