A route maps a model name to a set of targets under a selection strategy. When a request arrives for a model that a route matches, to11 picks one of the route’s targets according to the strategy and proxies the request using the stored credential. This is managed routing, as opposed to passthrough with the caller’s own key. You create and edit routes in the dashboard, under Project → AI Gateway → Routing.

What a route matches on

A route matches on two things together:
  • A model name: the value of the model field in the incoming request.
  • An endpoint kind: chat, embeddings, audio_speech, audio_transcription, or image_generation.
Both have to match. A chat route for gpt-4o does not catch an embeddings request for the same model name.
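The two-part match can be pictured as a lookup keyed on both values at once. The following is a minimal Python sketch, not to11’s implementation; the ROUTES table, the match_route helper, and the route label are hypothetical names used only for illustration.

```python
# Hypothetical in-memory route table keyed by (model name, endpoint kind).
ROUTES = {
    ("gpt-4o", "chat"): "chat-route-for-gpt-4o",
}


def match_route(model, endpoint_kind):
    """Return the route for this exact (model, kind) pair, or None."""
    return ROUTES.get((model, endpoint_kind))


chat_match = match_route("gpt-4o", "chat")              # the chat route matches
embeddings_match = match_route("gpt-4o", "embeddings")  # same model, no match
```

Because the key is the pair rather than the model name alone, the embeddings lookup finds nothing even though the model name is identical.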

Strategies

When you create a routing rule, you choose how to11 picks among its targets. Routes offer four strategies:

Direct routing

A single target. Every matching request goes to the same model-and-credential pairing. This is what you get when a rule has exactly one target.

Weighted split

Two or more targets, chosen at random in proportion to their weights. Useful for a gradual migration (shift a slice of traffic to a new provider) or cost shaping (send most traffic to a cheaper endpoint). In the dashboard you enter each target’s share as a percentage — 70 and 30 send roughly seven requests to the first target for every three to the second.
Weighted split chooses purely by weight; it does not track the health of a target between requests. If you want failover when a target is unreachable, use a Fallback chain.
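As an illustration of weight-proportional selection, here is a short Python sketch using the standard library’s random.choices. It assumes nothing about how to11 implements the split; pick_weighted and the target names provider-a and provider-b are hypothetical.

```python
import random


def pick_weighted(targets, rng=random):
    """Pick one target name at random, in proportion to its weight."""
    names = [name for name, _ in targets]
    weights = [weight for _, weight in targets]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical 70/30 split between two targets.
targets = [("provider-a", 70), ("provider-b", 30)]

rng = random.Random(0)  # seeded only to make the illustration repeatable
counts = {"provider-a": 0, "provider-b": 0}
for _ in range(10_000):
    counts[pick_weighted(targets, rng)] += 1
```

Over many requests the counts approach the configured shares, but each individual choice is independent: nothing in the selection looks at whether the chosen target is currently healthy.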

Fallback chain

Two or more targets tried in the order you declare them. to11 sends to the first target; on a connection error or a 5xx response it moves to the next, and so on down the chain.
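The ordered behaviour can be sketched in Python as a loop over the declared targets. This is an assumption-laden illustration: TargetError stands in for a connection error or 5xx response, the targets are plain callables, and raising the last error when every target fails is a choice made for the sketch, since the behaviour at the end of the chain is not described here.

```python
class TargetError(Exception):
    """Hypothetical stand-in for a connection error or a 5xx response."""


def call_with_fallback(targets, request):
    """Try each target in declared order and return the first success."""
    last_error = None
    for send in targets:
        try:
            return send(request)
        except TargetError as err:
            last_error = err  # this target failed, move on to the next one
    # Assumption: surface the last failure once the chain is exhausted.
    raise last_error if last_error else TargetError("no targets configured")


def primary(request):
    raise TargetError("503 from primary")


def secondary(request):
    return f"secondary handled {request}"


result = call_with_fallback([primary, secondary], "hello")
```

Here the first target fails, so the request is handed to the second, which is the order the chain declares.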

A/B experiment

Two or more model variants, each taking a share of traffic, so you can compare them on live requests. See A/B experiment.

Where routes are built

You build routes in the dashboard’s AI Gateway → Routing section, where a rule pairs a model name and endpoint kind with a strategy and its targets. For step-by-step walkthroughs, see the routing how-tos: Direct routing, Weighted split, Fallback chain, and A/B experiment.

Retries

When a request fails in a way to11 can retry, the gateway retries internally — by default up to 2 retries with a 500 ms base backoff that grows on each attempt. Retries happen inside the gateway; you do not configure them per request.
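To make the backoff concrete, the Python sketch below computes a delay schedule from the stated defaults (2 retries, 500 ms base). The growth rule is an assumption: the text says only that the backoff grows on each attempt, so doubling (factor=2) is used purely for illustration, and backoff_delays is a hypothetical helper.

```python
def backoff_delays(max_retries=2, base_ms=500, factor=2):
    """Delay in ms before each retry; factor=2 is an assumed growth rate."""
    return [base_ms * factor ** attempt for attempt in range(max_retries)]


delays = backoff_delays()  # one entry per retry, starting at the 500 ms base
```

Under these assumptions the first retry waits the base delay and the second waits longer; the real schedule may differ, for example by adding jitter.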

How routes fit into routing

A route is the managed-routing layer (L2). to11 resolves an incoming request top-down:
  • Function (L3) — a named task (function) is matched first.
  • Route (L2) — then a route, using a stored credential.
  • Provider (L1) — then a direct provider lookup, passthrough with the caller’s own key.
A model can appear at more than one layer; the highest-priority match wins. A model that no layer matches is unknown to the gateway, and the request returns a 404.
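The top-down resolution can be sketched in Python as a scan over the three layers in priority order. The dictionaries and the resolve helper are hypothetical; the sketch shows only the ordering and the "no match" case, not how to11 stores or evaluates each layer.

```python
def resolve(model, functions, routes, providers):
    """Return (layer, match) for the highest-priority layer that knows the model."""
    layers = (("function", functions), ("route", routes), ("provider", providers))
    for layer, table in layers:
        if model in table:
            return layer, table[model]
    return None  # unknown model: the gateway would answer 404


# Hypothetical configuration: gpt-4o is known to both a route and a provider.
functions = {"summarize": "summarize-function"}
routes = {"gpt-4o": "gpt-4o-chat-route"}
providers = {"gpt-4o": "provider-passthrough", "gpt-4o-mini": "provider-passthrough"}

resolved = resolve("gpt-4o", functions, routes, providers)
```

In this setup gpt-4o resolves at the route layer even though a provider also knows it, because the route layer is checked first.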
To call a route by a stable name and swap the model behind it without changing your application, give the route a routing identifier and invoke it as route::<name>. See Route by a stable name.

Example request

A route is invisible to the caller — you send the same request you would for any model, and to11 matches it to the rule:
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
If a chat route matches gpt-4o, to11 selects one of its targets and uses the stored credential; otherwise the request falls through to the provider layer and is passed through with the caller’s own key.
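For callers working in Python rather than curl, the same request can be assembled with the standard library. This sketch mirrors the URL, headers, and body of the curl example above; build_chat_request is a hypothetical helper, and reading the key and project ID from environment variables with an empty default is an assumption made so the snippet runs without credentials.

```python
import json
import os
import urllib.request


def build_chat_request(model, content):
    """Build (but do not send) the POST request from the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        "https://gw.to11.ai/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-to11-authorization": f"Bearer {os.environ.get('TO11_API_KEY', '')}",
            "x-to11-project-id": os.environ.get("TO11_PROJECT_ID", ""),
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("gpt-4o", "Hello")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Nothing in the request names a route; whether one applies is decided by the gateway from the model field and the endpoint being called.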

Next steps

Functions

Named task aliases that decouple caller intent from model choice.

Targets

How targets pair a model with a stored credential.

Routing overview

Managed routing, passthrough, and the resolution flow.