Routing Overview

Every request that reaches the to11 gateway is resolved through a three-layer routing model. Each layer is independently useful, and you opt into the higher ones as your project needs them. You configure all of this in the dashboard under Project → AI Gateway — there is nothing to install and no file to edit.

Three-layer routing model

Layer	Routing key	What it does	What to11 owns
L1: Passthrough	Model name	Proxies the request straight to the provider that serves the model	Nothing — your credential reaches the provider
L2: Managed routing	Model name	Maps a model to a strategy and a set of targets	The stored credential and the routing strategy
L3: Function dispatch	Function name	Maps a task name to a route or model	The credential, the routing, and the task abstraction

L1 is the default: any model on a connected provider can be called by name straight away. L2 adds a route, which maps a model to a strategy (Direct routing, Weighted split, Fallback chain, or A/B experiment) and a set of targets, each carrying a stored credential, so callers no longer need to supply a provider key. L3 is task-level dispatch: a function resolves a task name to a route or model. Functions are managed by to11. To set up task-level routing yourself, create a named route and call it as route::<name>; see Route by a stable name.

Resolution flow

When a request arrives, the gateway first looks for an explicit namespace prefix. If none is present, it resolves top-down through the three layers.

Request (model field)
  |
  v
Explicit prefix? (function::, route::, provider::)
  +-- Yes --> dispatch to the named layer
  +-- No  --> top-down resolution:
       |
       v
     L3: Function match? --> function dispatch
       |  No
       v
     L2: Route match? --> managed route dispatch
       |  No
       v
     L1: Provider match? --> passthrough
       |  No
       v
     404 (unknown model)

Because resolution is top-down, the same model name can be served both by a connected provider and by a route — the route wins when one is configured. A function name that collides with a model name resolves to the function first.
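The resolution order described above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's implementation: the Gateway class, its three registries, and the (layer, name) return value are hypothetical, and treating a failed prefixed lookup as a 404 is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical in-memory view of what a project has configured."""
    functions: set[str] = field(default_factory=set)              # L3 task names
    routes: set[str] = field(default_factory=set)                 # L2 route names
    providers: dict[str, set[str]] = field(default_factory=dict)  # L1: provider -> models

    def resolve(self, model: str) -> tuple[str, str]:
        """Return (layer, name) for the value of the request's model field."""
        prefix, sep, name = model.partition("::")
        if sep:  # explicit prefix: go straight to the named layer
            if prefix == "function" and name in self.functions:
                return ("L3", name)
            if prefix == "route" and name in self.routes:
                return ("L2", name)
            if name in self.providers.get(prefix, set()):
                return ("L1", name)
            # Assumption: a failed direct lookup is an error, not a fall-through.
            raise LookupError(f"404: nothing named {model!r}")
        # No prefix: top-down resolution, L3 -> L2 -> L1.
        if model in self.functions:
            return ("L3", model)
        if model in self.routes:
            return ("L2", model)
        if any(model in models for models in self.providers.values()):
            return ("L1", model)
        raise LookupError(f"404: unknown model {model!r}")
```

In this sketch, a gateway with a route and a provider model both named gpt-4o resolves the bare name to L2, while openai::gpt-4o still reaches L1.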

Namespace prefixes

Explicit prefixes let a caller target a specific routing layer and bypass top-down resolution.

Prefix	Layer	Example	Resolution
`function::name`	L3	`function::summarize`	Direct function lookup
`route::name`	L2	`route::balanced-gpt4o`	Direct route lookup
`provider::model`	L1	`openai::gpt-4o`	Direct provider lookup (passthrough)
(no prefix)	Auto	`gpt-4o` or `summarize`	Top-down resolution

A prefix goes in the model field of the request body:

{ "model": "openai::gpt-4o", "messages": [{ "role": "user", "content": "Hello" }] }

Prefixes make the routing layer explicit and prevent accidental shadowing. Without a prefix, adding a function named summarize would shadow a model with the same name. With a prefix, the caller chooses the layer.
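As a small illustration of how a caller might pin the layer, the sketch below builds request bodies with and without a prefix. The helper names qualify and chat_body are hypothetical, and the body shape simply mirrors the JSON example above; nothing here sends a request.

```python
import json


def qualify(prefix: str, name: str) -> str:
    """Join a namespace prefix and a name, e.g. ("route", "balanced-gpt4o")."""
    return f"{prefix}::{name}"


def chat_body(model: str, prompt: str) -> str:
    """Serialize a request body whose model field carries the routing key."""
    return json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    )


# Unprefixed: resolved top-down, so a function named "summarize" would win.
auto = chat_body("summarize", "Hello")
# Prefixed: the caller chooses the layer explicitly.
pinned = chat_body(qualify("function", "summarize"), "Hello")
```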

Routing strategies

A route selects among its targets using one of these strategies. In the dashboard’s Routing create-rule flow you pick a strategy and fill in its fields.

Strategy	What it does
Direct routing	Send all traffic to one target.
Weighted split	Random selection, proportional to each target’s weight. See Weighted split.
Fallback chain	Try targets in declared order, failing over on error. See Fallback chain.
A/B experiment	Split traffic across model variants and compare them. See A/B experiment.

You also choose which stored credential each target uses when you add it.
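To make the Weighted split row concrete, here is a minimal sketch of weight-proportional random selection using the Python standard library. The Target type and pick_weighted function are hypothetical names for illustration; the gateway's actual selection code is not shown in this document.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str      # hypothetical target label
    weight: float  # relative share of traffic


def pick_weighted(targets: list[Target], rng: random.Random) -> Target:
    """Pick one target at random, proportional to its weight."""
    weights = [t.weight for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]
```

With weights 3 and 1, the first target should receive roughly three quarters of requests over a large sample. The selection looks only at the weights, so a failing target would keep being chosen at the same rate.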

Failover and retry

The Fallback chain strategy tries targets in declared order: if a target fails with a connection error or a 5xx response, the next target is attempted immediately. Within a single attempt, the gateway retries with exponential backoff, by default up to 2 retries with a 500 ms base backoff. A route or function can override this default.
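One way to read the behaviour above is sketched below: each target gets an attempt that includes retries with exponential backoff, and the chain moves to the next target once that attempt is exhausted. This is an interpretation, not the gateway's code. The names UpstreamError, call_with_retry, and fallback_chain are hypothetical, and the doubling factor between retries is an assumption, since only the 500 ms base and the limit of 2 retries are stated.

```python
import time
from typing import Callable, Sequence


class UpstreamError(Exception):
    """Hypothetical stand-in for a connection error or a 5xx response."""


def call_with_retry(call: Callable[[], str], max_retries: int = 2,
                    base_backoff: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Try one target, retrying with exponential backoff on failure."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except UpstreamError:
            if attempt == max_retries:
                raise  # retries exhausted for this target
            # Assumed schedule: 0.5 s, 1.0 s, ... (base doubled per retry).
            sleep(base_backoff * 2 ** attempt)
    raise ValueError("max_retries must be >= 0")


def fallback_chain(targets: Sequence[Callable[[], str]], max_retries: int = 2,
                   base_backoff: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Try targets in declared order, failing over when one is exhausted."""
    last_error: UpstreamError | None = None
    for call in targets:
        try:
            return call_with_retry(call, max_retries, base_backoff, sleep)
        except UpstreamError as exc:
            last_error = exc  # move on to the next target without waiting
    if last_error is None:
        raise ValueError("fallback chain has no targets")
    raise last_error
```

Passing sleep as a parameter keeps the sketch testable without real delays.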

Scope and limitations

No model remapping by default. The model field is forwarded to the upstream provider as sent, unless a target is explicitly configured to map to a different upstream model.
Weighted routing is not health-aware. A weighted target keeps receiving its proportional share of traffic even when it is erroring. Use a Fallback chain when you need automatic failover.

To confirm which provider, route, and target actually served a request, open Projects → Traces in Observe — each request records its routing path, strategy, and resolved target.

Next steps

Passthrough

How L1 passthrough and BYOK requests resolve.

Direct routing

Route a model to a single managed target.

Weighted split

Split traffic across targets by weight.

Fallback chain

Fail over to the next target on error.
