Error Handling

The gateway sits between your SDK and the upstream LLM provider. When the upstream returns an error, the gateway translates it so that the response your SDK receives is consistent with the SDK’s expectations — even when the upstream’s native error shape is different from the SDK’s. This page covers what the gateway guarantees about error responses, how cross-provider translation works, and how to inspect the gateway’s classification headers for richer error handling.

Guarantees at a glance

Surface	What you see
`x-to11-error-code` header	Stable cross-provider classification — same value regardless of upstream or SDK.
`x-to11-upstream-provider` header	Provider slug for the upstream that returned the error.
Response body (cross-provider routes)	Translated into the surface SDK’s native error envelope so SDK retry logic triggers.
Response body (same-provider routes)	Forwarded verbatim from the upstream.
Response body (native passthrough surfaces)	Forwarded byte-for-byte.
Response body (5xx errors)	Message redacted to `provider returned status N`.
HTTP status	Verbatim from the upstream, except that a generic upstream 5xx is surfaced as 502 Bad Gateway.

Headers always classify

Every error response carries two stable headers regardless of which provider, which SDK surface, or which routing strategy produced it:

x-to11-error-code: rate_limit
x-to11-upstream-provider: anthropic

The x-to11-error-code taxonomy is the canonical way to classify gateway errors. Values are stable across all combinations of upstream provider and surface SDK:

`x-to11-error-code`	Meaning
`auth`	Credentials missing, invalid, or revoked (401).
`forbidden`	Authenticated but not permitted (403).
`bad_request`	Malformed request, including context-length and other 4xx conditions.
`quota`	Account-level quota exhausted; retry will not clear.
`rate_limit`	Throttled — retry with backoff will likely succeed.
`overloaded`	Upstream under capacity pressure; retry with backoff.
`content_policy`	Upstream content / safety filter blocked the request or response.
`model_not_found`	Model name did not resolve at the upstream (404).
`org_verification_required`	Gated-model verification flow required.
`upstream`	Generic 5xx upstream condition or network fault (body redacted).
`feature_disabled`	The requested routing surface is disabled for this project (403).

Build retry, alerting, and metrics dashboards on these headers — not on response body text. The header value is the same whether you called the gateway through the OpenAI SDK against an Anthropic upstream, the Anthropic SDK against a Bedrock upstream, or any other cross-provider route.
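
One way to keep retry and alerting code honest about the taxonomy is to model it as an enum and derive the retry decision from it. The sketch below is illustrative: the names `ErrorCode`, `RETRYABLE`, `parse_error_code`, and `is_retryable` are hypothetical, and treating every code other than `rate_limit` and `overloaded` as non-retryable is an assumption rather than documented gateway behavior.

```python
from enum import Enum


class ErrorCode(str, Enum):
    """Values of the x-to11-error-code header, as listed in the table above."""
    AUTH = "auth"
    FORBIDDEN = "forbidden"
    BAD_REQUEST = "bad_request"
    QUOTA = "quota"
    RATE_LIMIT = "rate_limit"
    OVERLOADED = "overloaded"
    CONTENT_POLICY = "content_policy"
    MODEL_NOT_FOUND = "model_not_found"
    ORG_VERIFICATION_REQUIRED = "org_verification_required"
    UPSTREAM = "upstream"
    FEATURE_DISABLED = "feature_disabled"


# Only the two codes the table describes as clearing with backoff.
# Everything else is treated as non-retryable here (an assumption).
RETRYABLE = frozenset({ErrorCode.RATE_LIMIT, ErrorCode.OVERLOADED})


def parse_error_code(headers):
    """Return the ErrorCode from a header mapping, or None if absent or unknown.

    `headers` is assumed to be a mapping with lowercase keys (httpx headers
    are case-insensitive, a plain dict is not).
    """
    raw = headers.get("x-to11-error-code")
    if raw is None:
        return None
    try:
        return ErrorCode(raw.strip().lower())
    except ValueError:
        return None  # a value this client does not know about yet


def is_retryable(headers):
    """True only for classifications that backoff is expected to clear."""
    return parse_error_code(headers) in RETRYABLE
```

Returning `None` for an unknown value, rather than raising, keeps older clients working if new classifications appear later.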

Bodies on cross-provider routes match the surface SDK

When your SDK’s wire format does not match the upstream provider (for example, OpenAI SDK → Anthropic upstream), the gateway lifts the upstream’s native error envelope into its internal domain model and lowers it into the SDK’s surface envelope. The net effect: your SDK’s built-in retry policy fires correctly on cross-provider errors that would otherwise surface as a generic gateway-shaped envelope.

The classic example is Anthropic’s 529 overloaded_error. Through an OpenAI SDK client, the gateway translates it to OpenAI’s rate_limit_error / rate_limit_exceeded body shape so the official OpenAI SDK’s automatic retry loop activates:

Upstream (Anthropic):
  HTTP/1.1 529
  { "type": "error",
    "error": { "type": "overloaded_error", "message": "Overloaded" } }

Gateway response (OpenAI surface):
  HTTP/1.1 529
  Retry-After: 30
  x-to11-error-code: overloaded
  x-to11-upstream-provider: anthropic
  { "error": { "type": "rate_limit_error",
               "code": "rate_limit_exceeded",
               "message": "Overloaded" } }

This works in both directions and across every supported provider. The classification (x-to11-error-code) stays the same no matter which provider produced the error or which SDK surface you called through.
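
The lift-then-lower idea can be pictured with a toy model. This is not the gateway's implementation: the function names and the two lookup tables are hypothetical, and the only mapping included is the one documented above (Anthropic `overloaded_error` to OpenAI `rate_limit_error` / `rate_limit_exceeded`).

```python
# Illustrative only: each table holds the single mapping this page documents.
ANTHROPIC_TO_CLASS = {"overloaded_error": "overloaded"}
CLASS_TO_OPENAI = {"overloaded": ("rate_limit_error", "rate_limit_exceeded")}


def lift_anthropic(body):
    """Lift an Anthropic error envelope into (classification, message)."""
    err = body.get("error", {})
    return ANTHROPIC_TO_CLASS.get(err.get("type")), err.get("message", "")


def lower_to_openai(classification, message):
    """Lower a classification into an OpenAI-shaped envelope, or None if unmapped."""
    mapped = CLASS_TO_OPENAI.get(classification)
    if mapped is None:
        return None
    error_type, code = mapped
    return {"error": {"type": error_type, "code": code, "message": message}}


upstream = {
    "type": "error",
    "error": {"type": "overloaded_error", "message": "Overloaded"},
}
classification, message = lift_anthropic(upstream)
translated = lower_to_openai(classification, message)
```

The intermediate classification is the value that ends up in `x-to11-error-code`, which is why the header stays stable while the body shape changes with the SDK surface.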

Retry-After is threaded through

When the upstream response carries a Retry-After header (integer-seconds form), the gateway re-emits it on the translated response. SDKs that honor Retry-After (the official OpenAI and Anthropic SDKs both do) back off accordingly. When the upstream omits the header, the gateway does not synthesize a default: Retry-After is absent on the gateway response too, and the SDK falls back to its own backoff policy.
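
If you write your own HTTP client rather than relying on an SDK, the same rule can be sketched as follows: use Retry-After when it is present in integer-seconds form, otherwise fall back to a local policy. The function name `retry_delay` and the fallback (exponential backoff with full jitter, `base` and `cap` in seconds) are assumptions chosen for illustration, not something the gateway prescribes.

```python
import random


def retry_delay(headers, attempt, base=0.5, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based).

    `headers` is assumed to be a mapping with lowercase keys.
    """
    raw = headers.get("retry-after")
    try:
        seconds = int(raw)
    except (TypeError, ValueError):
        seconds = None  # header absent, or not in integer-seconds form
    if seconds is not None and seconds >= 0:
        return float(seconds)
    # Fallback: exponential backoff with full jitter, bounded by `cap`.
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

For example, `retry_delay({"retry-after": "30"}, attempt=0)` yields 30 seconds, while an empty header mapping yields a random delay bounded by the backoff curve.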

Same-provider routes remain verbatim

When the surface SDK and the upstream provider match (OpenAI SDK → OpenAI upstream, Anthropic SDK → Anthropic upstream), the gateway forwards the upstream error body verbatim. This preserves any provider-specific signal — header fields, param indicators, request IDs — that the SDK might inspect beyond the standard envelope shape. The classification headers (x-to11-error-code, x-to11-upstream-provider) are still attached. The HTTP status is still verbatim.

Native passthrough remains byte-for-byte

Provider-native passthrough routes forward the upstream response byte-for-byte on both the success and error paths. These give you bit-identical compatibility with a provider’s native API, at the cost of cross-provider translation.

5xx bodies are redacted

For an upstream 5xx, the response body’s message is rewritten to the conservative form `provider returned status N`. The upstream’s full message text never reaches the client, even if it would have carried useful debugging signal. That signal is preserved on the trace for the request, which you can inspect in the to11 dashboard (Projects → Traces).

A generic upstream 5xx is surfaced to the client as 502 Bad Gateway. The one exception is capacity overload, which is surfaced as 529 (the status Anthropic uses) and classified as overloaded. Either way the x-to11-error-code header reaches the client, so your SDK can classify and retry without depending on body text.

Examples

OpenAI SDK retrying through an Anthropic upstream overload

The official OpenAI Python SDK retries automatically on rate_limit_error bodies with Retry-After headers. With the gateway in front of an Anthropic upstream, an Anthropic 529 overloaded_error is translated to OpenAI’s rate_limit_exceeded body shape, so the retry loop just works:

from openai import OpenAI

# Gateway is configured to route this model to an Anthropic upstream.
# The SDK reads its API key from OPENAI_API_KEY unless api_key= is passed.
client = OpenAI(base_url="https://gw.to11.ai/v1")

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}],
    # The SDK will automatically retry on rate_limit_exceeded responses.
)

Behind the scenes the gateway is doing:

Anthropic 529 overloaded_error
  → classified as "overloaded" (retry-after: 30s)
  → OpenAI-shape { error: { type: "rate_limit_error",
                            code: "rate_limit_exceeded",
                            message: "Overloaded" } }
  + Retry-After: 30
  + x-to11-error-code: overloaded
  + x-to11-upstream-provider: anthropic

The OpenAI SDK sees rate_limit_exceeded + Retry-After, sleeps the requested interval, and retries — even though the upstream was Anthropic.
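
Once the SDK's own retries are exhausted it raises an exception, and the gateway's headers are still reachable from it. The helper below is a hypothetical sketch, duck-typed so that it runs without the SDK installed; it assumes the exception exposes the HTTP response as `.response` with a `.headers` mapping, as `openai.APIStatusError` does.

```python
from types import SimpleNamespace


def gateway_classification(exc):
    """Return (error_code, upstream_provider) from an SDK status error.

    Returns (None, None) when the exception carries no HTTP response.
    """
    response = getattr(exc, "response", None)
    headers = getattr(response, "headers", None)
    if headers is None:
        return None, None
    return (
        headers.get("x-to11-error-code"),
        headers.get("x-to11-upstream-provider"),
    )


# Stand-in for an exception raised by the SDK after its retries ran out.
fake_exc = SimpleNamespace(
    response=SimpleNamespace(
        headers={
            "x-to11-error-code": "overloaded",
            "x-to11-upstream-provider": "anthropic",
        }
    )
)
code, provider = gateway_classification(fake_exc)
```

In application code this would typically sit in an `except openai.APIStatusError as exc:` block around the `client.chat.completions.create(...)` call.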

Inspecting `x-to11-error-code` for richer classification

On the OpenAI surface, the response body’s error.code is intentionally lossy: quota exhaustion, generic rate limiting, and server-side overload all map to rate_limit_exceeded so that SDK retry behavior works uniformly. When you need to tell quota exhaustion (retry will not clear) apart from rate limiting (retry with backoff clears) or overload, read the x-to11-error-code header, which is where the precise classification lives:

import httpx

resp = httpx.post(
    "https://gw.to11.ai/v1/chat/completions",
    json={
        "model": "claude-sonnet-4-6",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    headers={
        "x-to11-authorization": "Bearer <your-to11-api-key>",
        "x-to11-project-id": "<your-project-id>",
    },
)

if resp.status_code >= 400:
    code = resp.headers.get("x-to11-error-code")
    upstream = resp.headers.get("x-to11-upstream-provider")
    if code == "quota":
        # Don't retry — account-level exhaustion.
        raise RuntimeError(f"Quota exhausted on {upstream}")
    elif code in ("rate_limit", "overloaded"):
        # Backoff and retry per Retry-After.
        ...
    elif code == "auth":
        # Credentials are wrong — fail fast.
        ...
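
The retry branch left as `...` above could be filled in along these lines. This is a minimal sketch under stated assumptions: `call_with_retry` is a hypothetical helper, `send` is any zero-argument callable returning an object with `status_code` and `headers` (an `httpx.Response` fits), and the 1s, 2s, 4s fallback when Retry-After is absent is an arbitrary choice.

```python
import time
from types import SimpleNamespace

RETRYABLE_CODES = ("rate_limit", "overloaded")


def call_with_retry(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it succeeds, fails non-retryably, or attempts run out.

    Always returns the last response so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        resp = send()
        code = resp.headers.get("x-to11-error-code")
        retryable = resp.status_code >= 400 and code in RETRYABLE_CODES
        if not retryable or attempt == max_attempts - 1:
            return resp
        retry_after = resp.headers.get("retry-after", "")
        # Prefer the gateway-forwarded Retry-After; otherwise 1s, 2s, 4s, ...
        delay = int(retry_after) if retry_after.isdecimal() else 2 ** attempt
        sleep(delay)


# Demo with canned responses instead of a live gateway call.
responses = iter([
    SimpleNamespace(
        status_code=529,
        headers={"x-to11-error-code": "overloaded", "retry-after": "2"},
    ),
    SimpleNamespace(status_code=200, headers={}),
])
waits = []
final = call_with_retry(lambda: next(responses), sleep=waits.append)
```

With httpx, `send` could be a lambda wrapping the `httpx.post(...)` call shown above. Injecting `sleep` keeps the loop testable without real delays.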

Unrecognized envelopes

When an upstream returns an error shape the gateway does not recognize (an empty body, malformed JSON, or a new upstream error kind), it falls back to a generic error body rather than a translated one. The x-to11-error-code header still classifies the error correctly, so retry and alerting logic built on the header keeps working regardless.
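
On the client side, this suggests treating the body as best-effort detail and never as the source of truth. A defensive parser might look like the sketch below; `extract_error_message` is a hypothetical helper, and the only body shape it assumes is the `error.message` field that appears in the envelopes shown earlier on this page.

```python
import json


def extract_error_message(body_text):
    """Best-effort message from an error body; None when the shape is not recognized."""
    try:
        body = json.loads(body_text)
    except (TypeError, ValueError):
        return None  # empty body, malformed JSON, or not a string at all
    if not isinstance(body, dict):
        return None
    error = body.get("error")
    if isinstance(error, dict) and isinstance(error.get("message"), str):
        return error["message"]
    return None
```

A `None` result only means there is no message to log; the retry decision should still come from the `x-to11-error-code` header.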

Mid-stream SSE error frames

When an upstream returns 200 OK for a streaming request and then emits an error partway through the stream, the gateway delivers that error as an SSE error frame in the shape your SDK expects, and marks the stream aborted so no false success terminator follows:

OpenAI surface — data: {"error":{...}} (the OpenAI SDK parses the top-level error key into the appropriate exception class).
Anthropic surface — event: error followed by data: {"type":"error","error":{...}} (the Anthropic SDK dispatches by SSE event name).

Same-provider and native-passthrough routes forward the upstream’s error frame verbatim; cross-provider routes translate it into the surface SDK’s shape, exactly as on the status-time path. The HTTP status line stays at 200 OK — the error lives in the response body, not on the status line.
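
If you consume the stream without an SDK, a status check alone will miss these errors, so the frames themselves need inspecting. The sketch below scans raw SSE lines for either frame shape described above. `find_stream_error` is a hypothetical helper, and it relies on one assumption: ordinary data frames carry no top-level `error` object.

```python
import json


def find_stream_error(lines):
    """Return the first error object found in an iterable of SSE lines, or None."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # "event:" lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        try:
            data = json.loads(payload)
        except ValueError:
            continue  # non-JSON data line, such as a stream terminator
        # Both surfaces put the error under a top-level "error" key:
        #   OpenAI:    data: {"error": {...}}
        #   Anthropic: event: error / data: {"type": "error", "error": {...}}
        if isinstance(data, dict) and isinstance(data.get("error"), dict):
            return data["error"]
    return None
```

The function accepts any iterable of text lines, so it can be fed from a line iterator over a streaming HTTP response or from a list in a test.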

Related

API Reference: endpoints, status codes, request/response shapes.
Telemetry: how error classification shows up in spans and metrics.
Streaming: error-handling on the streaming path.