Before you start
Connect the providers and models you want to compare; see Connect a provider.
Create an experiment
- Go to Project → AI Gateway → Routing and start a new routing rule.
- Choose the A/B experiment strategy.
- Add a variant for each option you’re testing. Each variant has:
  - a name to identify it (for example, fast and quality),
  - a model, which is the model that variant uses, and
  - a weight, which is its share of traffic as a percentage.
- Give the rule a routing identifier and save it.
For example, the following rule sends half of summarize traffic to a small, cheap model and half to a larger one:
| Variant | Model | Weight |
|---|---|---|
| fast | gpt-4o-mini | 50% |
| quality | claude-sonnet-4-6 | 50% |
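
To make the weights concrete, here is a minimal Python sketch of a weighted-random pick over the example rule. It is an illustration only, not the gateway's implementation: the VARIANTS mapping and the pick_variant and simulate helpers are hypothetical names, and the only library call it relies on is the standard library's random.choices.

```python
import random
from collections import Counter

# Hypothetical in-memory copy of the example rule:
# variant name -> (model, weight as a percentage).
VARIANTS = {
    "fast": ("gpt-4o-mini", 50),
    "quality": ("claude-sonnet-4-6", 50),
}


def pick_variant(variants, rng):
    """Pick one variant name at random, in proportion to its weight."""
    names = list(variants)
    weights = [variants[name][1] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(variants, requests, seed=0):
    """Count how many of `requests` simulated calls land on each variant."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    return Counter(pick_variant(variants, rng) for _ in range(requests))


if __name__ == "__main__":
    for n in (10, 10_000):
        counts = simulate(VARIANTS, n)
        shares = {name: counts[name] / n for name in VARIANTS}
        print(n, shares)
```

With 50/50 weights, a run of a few requests can land well away from an even split, while a run of thousands should sit close to it.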
Send traffic
Call the rule by its name. The gateway picks a variant per request, in proportion to the weights.
Compare the variants
Each request is traced. In Observe → Traces you can see which model served each request, along with its latency, token usage, and cost, and compare the variants against each other on live traffic.
Selection is weighted-random per request, so the split is statistical, not exact: over enough requests it converges on the weights you set. Variants compare whole models; they don’t override inference parameters like temperature.
Next steps
Weighted split
Split traffic across providers without the experiment framing.
Observe
Compare variants on live requests.
Routing overview
The full routing model and resolution flow.