Deploy - to11

The to11 gateway is a managed LLM proxy. It sits between your application and your model providers with minimal added overhead, routing requests, applying inline security, translating between provider formats, and emitting OpenTelemetry traces. You point your SDK at it — there’s nothing to deploy or run yourself. You connect providers and define routing in the dashboard’s AI Gateway section.

Architecture

Your app / SDK
    |  OpenAI, Anthropic, or xAI format
    v
+------------------------------------------+
|  to11 Gateway (managed)                  |
|                                          |
|  - Routing (passthrough, managed)        |
|  - Cross-format detection & translation  |
|  - Inline security pipeline              |
|  - Streaming (fast path + normalized)    |
|  - GenAI telemetry + trace propagation   |
+--------------------+---------------------+
                     |
                     v
        Your LLM providers (OpenAI, Anthropic, xAI, ...)

Routing

The gateway resolves each request top-down: a function match wins first, then a named route match, then a direct provider match; otherwise the request passes straight through. Functions are managed by to11, so the two modes you set up yourself are passthrough and named routes. You opt into managed routing as you need it.

Mode	How you call it	What the gateway does
Passthrough	A bare model name, or `provider::model`	Proxies straight through to a provider
Managed routing	A named route, called as `route::<name>`	Applies a routing strategy — direct, weighted split, fallback chain, or A/B experiment — across your connected providers

See the Routing overview for the full resolution flow, and Concepts for providers, models, targets, and routes.

Key capabilities

Cross-format routing

The gateway accepts requests in OpenAI, Anthropic, or xAI format and routes them to any connected provider. The response is returned in the caller’s native format — so you can send an OpenAI-format request, route it to Anthropic’s Claude, and get an OpenAI-format response back without changing your application.

Streaming

When the caller’s format matches the upstream provider, streamed chunks are forwarded with zero-copy passthrough. When the formats differ, the stream is normalized and re-serialized into the caller’s format.

Inline security

A security pipeline runs in the request path before anything is forwarded upstream: blocklist matching and PII detection today, with ML-based prompt-injection and content moderation coming. Blocked requests are rejected and never reach the provider.

Telemetry

Every call produces an OpenTelemetry GenAI span — model, tokens, latency, finish reason, time-to-first-token for streams, and optional prompt and completion content. See Instrument and Observe.

Next steps

Routing

Passthrough, direct, weighted, fallback, and A/B experiments.

Concepts

Providers, models, targets, and routes.

Environments

Separate production from staging.

Observe

See what each request did in production.

​Architecture

​Routing

​Key capabilities

​Cross-format routing

​Streaming

​Inline security

​Telemetry

​Next steps

Routing

Concepts

Environments

Observe

Architecture

Routing

Key capabilities

Cross-format routing

Streaming

Inline security

Telemetry

Next steps