An A/B experiment splits traffic for one task across two or more model variants and lets you compare them on live requests. Instead of wiring the split into your application, you define it as a routing rule: each variant names a model and takes a share of traffic, and the gateway picks one per request, proportional to the shares.

Before you start

Connect the providers and models you want to compare — see Connect a provider.

Create an experiment

  1. Go to Project → AI Gateway → Routing and start a new routing rule.
  2. Choose the A/B experiment strategy.
  3. Add a variant for each option you’re testing. Each variant has:
    • a name to identify it (for example, fast and quality),
    • a model — the model that variant uses, and
    • a weight — its share of traffic, as a percentage.
  4. Give the rule a routing identifier and save it.
For example, send half of summarize traffic to a small, cheap model and half to a larger one:
Variant   Model               Weight
fast      gpt-4o-mini         50%
quality   claude-sonnet-4-6   50%

Send traffic

Call the rule by its routing identifier, passed in the model field with the route:: prefix. The gateway picks a variant per request, proportional to the weights:
curl https://gw.to11.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID" \
  -d '{
    "model": "route::summarize",
    "messages": [{ "role": "user", "content": "Summarize this quarterly report..." }]
  }'
Your application code doesn’t change as you adjust weights or swap a variant’s model — that’s all in the routing rule.
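The same request can be made from application code. The sketch below is a Python translation of the curl call above, using only the standard library. The helper name build_request and its parameters are illustrative and not part of any to11 SDK. It only builds the request, so you can inspect it before sending.

```python
import json
import urllib.request

# Endpoint and header names are taken from the curl example above.
GATEWAY_URL = "https://gw.to11.ai/v1/chat/completions"


def build_request(api_key: str, project_id: str, route: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example.

    `route` is the rule's routing identifier, e.g. "summarize".
    """
    body = {
        # The route:: prefix tells the gateway to resolve a routing rule
        # rather than a concrete model.
        "model": f"route::{route}",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-to11-authorization": f"Bearer {api_key}",
            "x-to11-project-id": project_id,
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen. The routing identifier is the only routing detail in the call, so changing weights or models in the rule requires no change here.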

Compare the variants

Each request is traced. In Observe → Traces you can see which model served each request, along with its latency, token usage, and cost, and compare the variants against each other on live traffic.
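If you copy trace data out of Observe for offline analysis, a per-model comparison might look like the sketch below. The record shape (model, latency_ms, total_tokens, cost_usd) is a hypothetical stand-in: this page does not describe an export format, so adapt the field names to whatever your data actually contains.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_model(traces):
    """Group trace records by serving model and aggregate the basics.

    Each record is assumed (hypothetically) to be a dict with the keys
    "model", "latency_ms", "total_tokens", and "cost_usd".
    """
    groups = defaultdict(list)
    for record in traces:
        groups[record["model"]].append(record)

    return {
        model: {
            "requests": len(rows),
            "mean_latency_ms": mean(r["latency_ms"] for r in rows),
            "mean_total_tokens": mean(r["total_tokens"] for r in rows),
            "total_cost_usd": sum(r["cost_usd"] for r in rows),
        }
        for model, rows in groups.items()
    }
```

Comparing means alone can mislead when one variant has served only a handful of requests, so check the request count for each model before drawing conclusions.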
Selection is weighted-random per request, so the split is statistical, not exact — over enough requests it converges on the weights you set. Variants compare whole models; they don’t override inference parameters like temperature.
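To get a feel for how quickly a weighted-random split converges, you can simulate it locally. This is a sketch of weighted-random selection in general, using Python's random.choices. It is not the gateway's actual implementation; the function names are illustrative, and the weights mirror the 50/50 example above.

```python
import random
from collections import Counter

# Variant names and weights from the example table (percentages).
VARIANTS = {"fast": 50, "quality": 50}


def pick_variant(weights, rng):
    """Pick one variant name at random, proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights, n_requests, seed=0):
    """Return the observed share of traffic per variant over n_requests picks."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    counts = Counter(pick_variant(weights, rng) for _ in range(n_requests))
    return {name: counts[name] / n_requests for name in weights}


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, simulate(VARIANTS, n))
```

With small request counts the observed shares can sit well away from the configured weights, and they typically tighten as the sample grows. This is why a short experiment can look lopsided even when the rule is configured correctly.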

Next steps

Weighted split

Split traffic across providers without the experiment framing.

Observe

Compare variants on live requests.

Routing overview

The full routing model and resolution flow.