Every request that reaches the to11 gateway is resolved through a three-layer routing model. Each layer is independently useful, and you opt into the higher ones as your project needs them. You configure all of this in the dashboard under Project → AI Gateway — there is nothing to install and no file to edit.

Three-layer routing model

| Layer | Routing key | What it does | What to11 owns |
|---|---|---|---|
| L1: Passthrough | Model name | Proxies the request straight to the provider that serves the model | Nothing; your credential reaches the provider |
| L2: Managed routing | Model name | Maps a model to a strategy and a set of targets | The stored credential and the routing strategy |
| L3: Function dispatch | Function name | Maps a task name to a route or model | The credential, the routing, and the task abstraction |
L1 is the default: any model on a connected provider can be called by name straight away.
L2 adds a route, which maps a model to a strategy (Direct routing, Weighted split, Fallback chain, or A/B experiment) and a set of targets. Each target carries a stored credential, so callers no longer need to supply a provider key.
L3 is task-level dispatch: a function resolves a task name to a route or model. Functions are managed by to11. To set up task-level routing yourself, create a named route and call it as route::<name>; see Route by a stable name.

Resolution flow

When a request arrives, the gateway first looks for an explicit namespace prefix. If none is present, it resolves top-down through the three layers.
Request (model field)
  |
  v
Explicit prefix? (function::, route::, provider::)
  +-- Yes --> dispatch to the named layer
  +-- No  --> top-down resolution:
       |
       v
     L3: Function match? --> function dispatch
       |  No
       v
     L2: Route match? --> managed route dispatch
       |  No
       v
     L1: Provider match? --> passthrough
       |  No
       v
     404 (unknown model)
Because resolution is top-down, the same model name can be served both by a connected provider and by a route — the route wins when one is configured. A function name that collides with a model name resolves to the function first.
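The top-down order can be sketched in a few lines of Python. This is only an illustration of the precedence described above, not the gateway's implementation; the function name resolve_unprefixed and the three sets standing in for configured functions, routed models, and provider models are hypothetical.

```python
from __future__ import annotations

from typing import Optional


def resolve_unprefixed(
    name: str,
    functions: set[str],        # hypothetical: configured function names (L3)
    routed_models: set[str],    # hypothetical: model names that have a route (L2)
    provider_models: set[str],  # hypothetical: models on connected providers (L1)
) -> Optional[str]:
    """Return the layer that would serve an unprefixed name, or None for a 404."""
    if name in functions:
        return "L3"  # function dispatch is checked first
    if name in routed_models:
        return "L2"  # a configured route shadows plain passthrough
    if name in provider_models:
        return "L1"  # passthrough to the provider that serves the model
    return None      # unknown model


# A model served by both a provider and a route resolves to the route.
print(resolve_unprefixed("gpt-4o", set(), {"gpt-4o"}, {"gpt-4o"}))
```

Checking the layers in a fixed order is what makes shadowing predictable: adding a route or function changes which layer answers, without any change on the caller's side.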

Namespace prefixes

Explicit prefixes let a caller target a specific routing layer and bypass top-down resolution.
| Prefix | Layer | Example | Resolution |
|---|---|---|---|
| function::name | L3 | function::summarize | Direct function lookup |
| route::name | L2 | route::balanced-gpt4o | Direct route lookup |
| provider::model | L1 | openai::gpt-4o | Direct provider lookup (passthrough) |
| (no prefix) | Auto | gpt-4o or summarize | Top-down resolution |
A prefix goes in the model field of the request body:
{ "model": "openai::gpt-4o", "messages": [{ "role": "user", "content": "Hello" }] }
Prefixes make the routing layer explicit and prevent accidental shadowing. Without a prefix, adding a function named summarize would shadow a model with the same name. With a prefix, the caller chooses the layer.
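As a rough illustration of how a model field breaks down, the sketch below splits the value on the first "::" and classifies it by the prefix forms listed above. The helper name split_model_field and the rule that any head other than function or route is a provider name are assumptions made for this example, not documented gateway behavior.

```python
from __future__ import annotations


def split_model_field(model: str) -> tuple[str, str, str]:
    """Split a model field into (layer, provider, name).

    The layer is "L3", "L2", "L1", or "auto" when no prefix is present.
    The provider element is only filled in for provider-prefixed values.
    """
    head, sep, rest = model.partition("::")
    if not sep:
        return ("auto", "", model)  # no prefix: top-down resolution
    if head == "function":
        return ("L3", "", rest)     # direct function lookup
    if head == "route":
        return ("L2", "", rest)     # direct route lookup
    # Assumption: any other head is a provider name, as in openai::gpt-4o.
    return ("L1", head, rest)


for value in ["function::summarize", "route::balanced-gpt4o", "openai::gpt-4o", "summarize"]:
    print(value, "->", split_model_field(value))
```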

Routing strategies

A route selects among its targets using one of these strategies. In the dashboard’s Routing create-rule flow you pick a strategy and fill in its fields.
| Strategy | What it does |
|---|---|
| Direct routing | Send all traffic to one target. |
| Weighted split | Random selection, proportional to each target’s weight. See Weighted split. |
| Fallback chain | Try targets in declared order, failing over on error. See Fallback chain. |
| A/B experiment | Split traffic across model variants and compare them. See A/B experiment. |
You also choose which stored credential each target uses when you add it.
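To make "random selection, proportional to each target's weight" concrete, here is a minimal sketch using Python's standard library. The target names and weights are made up, and pick_weighted is a hypothetical helper; how the gateway draws its random numbers is not specified here.

```python
from __future__ import annotations

import random


def pick_weighted(targets: dict[str, float], rng: random.Random) -> str:
    """Pick one target name at random, with probability proportional to its weight."""
    names = list(targets)
    weights = [targets[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical route: roughly three of every four requests go to "primary".
rng = random.Random(0)
targets = {"primary": 3, "secondary": 1}
sample = [pick_weighted(targets, rng) for _ in range(10_000)]
print(sample.count("primary") / len(sample))
```

Selection of this kind depends only on the weights, so a target that is returning errors is still chosen at its usual rate.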

Failover and retry

The fallback strategy tries targets in declared order: if a target fails with a connection error or a 5xx response, the next target is attempted immediately. Within a single attempt, the gateway retries with exponential backoff — by default up to 2 retries with a 500ms base backoff. A route or function can override this default.
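The interaction between retries and failover can be sketched as two nested loops. This is one reading of the paragraph above, under stated assumptions: retries happen per target before the chain moves on, the delay doubles on each retry starting from the base backoff, and UpstreamError stands in for a connection error or 5xx response. The names call_with_fallback, send, and UpstreamError are hypothetical.

```python
import time
from typing import Callable, Sequence


class UpstreamError(Exception):
    """Hypothetical stand-in for a connection error or a 5xx response."""


def call_with_fallback(
    targets: Sequence[str],
    send: Callable[[str], str],
    max_retries: int = 2,         # default from the text: up to 2 retries
    base_backoff: float = 0.5,    # default from the text: 500ms base backoff
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each target in declared order, retrying with exponential backoff."""
    last_error = None
    for target in targets:
        for attempt in range(max_retries + 1):
            try:
                return send(target)
            except UpstreamError as exc:
                last_error = exc
                if attempt < max_retries:
                    # Assumed schedule: 0.5s, then 1.0s, doubling each retry.
                    sleep(base_backoff * (2 ** attempt))
        # Retries exhausted: move to the next target without further delay.
    if last_error is None:
        raise ValueError("no targets configured")
    raise last_error


def flaky_send(target: str) -> str:
    # Hypothetical sender: the first target always fails.
    if target == "primary":
        raise UpstreamError("503 from primary")
    return f"ok from {target}"


print(call_with_fallback(["primary", "secondary"], flaky_send, sleep=lambda s: None))
```

Passing sleep as a parameter keeps the sketch testable without real delays.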

Scope and limitations

  • No model remapping by default. The model field is forwarded to the upstream provider as sent, unless a target is explicitly configured to map to a different upstream model.
  • Weighted routing is not health-aware. A weighted target keeps receiving its proportional share of traffic even when it is erroring. Use a Fallback chain when you need automatic failover.
To confirm which provider, route, and target actually served a request, open Projects → Traces in Observe — each request records its routing path, strategy, and resolved target.

Next steps

  • Passthrough: How L1 passthrough and BYOK requests resolve.
  • Direct routing: Route a model to a single managed target.
  • Weighted split: Split traffic across targets by weight.
  • Fallback chain: Fail over to the next target on error.