The gateway is reachable at https://gw.to11.ai/v1. It exposes chat ingress endpoints, model discovery, embeddings, token counting, reranking, image generation, audio (transcription + TTS), and file management. Each chat endpoint accepts requests in its native SDK format, normalizes them internally, and responds in the caller’s format.
Send x-to11-authorization and x-to11-project-id on every request (see Authentication). For brevity, most examples on this page show only the upstream provider credential, Authorization: Bearer <provider-key> (or x-api-key for Anthropic-style requests), and omit the two x-to11-* headers. Drop the provider credential entirely when it is managed in the dashboard.

Endpoints

Endpoint                      Method  Description
/v1/responses                 POST    Canonical chat ingress (OpenAI / xAI / Anthropic shapes)
/v1/chat/completions          POST    OpenAI / xAI chat completions
/v1/messages                  POST    Anthropic messages API
/v1/models                    GET     List all configured models
/v1/models/{model_id}         GET     Get single model info
/v1/embeddings                POST    Embeddings (OpenAI-compatible providers only)
/v1/messages/count_tokens     POST    Token counting (Anthropic only)
/v1/rerank                    POST    Rerank (Cohere and other rerank-capable providers)
/v1/images/generations        POST    Image generation (OpenAI-compatible providers only)
/v1/audio/transcriptions      POST    Audio transcription (OpenAI, multipart/form-data)
/v1/audio/speech              POST    Text-to-speech (OpenAI-compatible providers only)
/v1/files                     POST    File upload (OpenAI-compatible providers only)
/v1/files                     GET     List files
/v1/files/{file_id}           GET     Retrieve file metadata
/v1/files/{file_id}/content   GET     Download file content
/v1/files/{file_id}           DELETE  Delete a file
The model field determines which upstream provider handles the request. In addition to plain model names (e.g. gpt-4o), the model field supports namespace prefixes (separated by ::) for explicit routing:
  • function::name — route to a named function defined in your to11 project’s routing
  • route::name — route to a named route
  • provider::model — route to a specific provider (e.g. anthropic::claude-sonnet-4-6)
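As a client-side illustration of the prefix rules above, the following sketch splits a model string on the first :: separator. The helper name parse_model_field and the ModelRef type are hypothetical, and the gateway's own parsing may differ (for example in how it treats unknown prefixes).

```python
from typing import NamedTuple, Optional

# Namespaces named in the list above; any other prefix is read as provider::model.
ROUTING_NAMESPACES = {"function", "route"}


class ModelRef(NamedTuple):
    kind: str                 # "plain", "function", "route", or "provider"
    namespace: Optional[str]  # text before "::", or None for plain names
    name: str                 # text after "::", or the whole string


def parse_model_field(model: str) -> ModelRef:
    """Split a model string on the first '::' (hypothetical helper)."""
    prefix, sep, rest = model.partition("::")
    if not sep:
        return ModelRef("plain", None, model)
    if prefix in ROUTING_NAMESPACES:
        return ModelRef(prefix, prefix, rest)
    return ModelRef("provider", prefix, rest)
```

For example, parse_model_field("anthropic::claude-sonnet-4-6") yields kind "provider" with namespace "anthropic", while "gpt-4o" is reported as a plain model name.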
Provider-native passthrough routes also exist under /v1/bedrock/*, /v1/vertex/*, and /v1/mistral/* for advanced provider-specific APIs; those forward request bodies to the upstream verbatim and are gated to providers that support them.

Headers

Authentication

Gateway requests use two independent credentials:
  1. Your to11 platform key authenticates the request to to11 and is sent in the x-to11-authorization header. The x-to11-project-id header selects which project’s routing and provider configuration applies.
    x-to11-authorization: Bearer <your-to11-api-key>
    x-to11-project-id: <your-project-id>
    
  2. The upstream provider credential. Either the project’s provider credentials are configured in the dashboard (project → Gateway), in which case you send nothing extra; or you pass your own provider key (BYOK) in the standard Authorization: Bearer <provider-key> header (or x-api-key for Anthropic-style requests). The gateway forwards that key to the upstream provider.
The standard Authorization header carries the upstream provider key, not your to11 key. Put the to11 key in x-to11-authorization. When provider credentials are managed in the dashboard, you can omit Authorization entirely.
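The split between the two credentials can be sketched as a small header builder. This is a minimal sketch: the function name build_headers and its parameters are assumptions made for illustration, and only the header names come from the text above.

```python
from typing import Dict, Optional


def build_headers(to11_api_key: str,
                  project_id: str,
                  provider_key: Optional[str] = None,
                  anthropic_style: bool = False) -> Dict[str, str]:
    """Assemble request headers for a gateway call (hypothetical helper)."""
    headers = {
        # Platform credential and project selection: sent on every request.
        "x-to11-authorization": f"Bearer {to11_api_key}",
        "x-to11-project-id": project_id,
        "Content-Type": "application/json",
    }
    # BYOK only: omit the provider key when it is managed in the dashboard.
    if provider_key is not None:
        if anthropic_style:
            headers["x-api-key"] = provider_key
        else:
            headers["Authorization"] = f"Bearer {provider_key}"
    return headers
```

Calling build_headers(key, project) with no provider key corresponds to the dashboard-managed case, where the Authorization header is left out entirely.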

SDK detection

The gateway auto-detects the caller’s SDK format from the endpoint. Override with:
x-genai-sdk: openai | anthropic | xai

Provider hints

Force routing to a specific provider:
x-genai-provider: openai | anthropic | xai
Or use a provider::model prefix in the model field:
{ "model": "anthropic::claude-sonnet-4-6", "messages": [...] }
If both header and prefix are present, they must agree or the request is rejected.
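A client can catch a mismatch before sending the request. The sketch below is an assumption-laden illustration: provider_hint_conflict is a hypothetical helper, it compares the two hints as exact strings, and it expects the header key in the lowercase spelling shown above. How the gateway itself compares them is not specified here.

```python
from typing import Dict, Optional


def provider_hint_conflict(model: str, headers: Dict[str, str]) -> Optional[str]:
    """Return a description of the mismatch, or None if the hints agree."""
    header_hint = headers.get("x-genai-provider")
    prefix, sep, _ = model.partition("::")
    # function:: and route:: are routing namespaces, not provider hints.
    is_provider_prefix = bool(sep) and prefix not in ("function", "route")
    prefix_hint = prefix if is_provider_prefix else None
    if header_hint and prefix_hint and header_hint != prefix_hint:
        return f"header says {header_hint!r} but model prefix says {prefix_hint!r}"
    return None
```

A non-None result indicates a request that, per the rule above, the gateway would be expected to reject.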

Chat completions

OpenAI format

curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 256
  }'
Response:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Paris." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17 }
}

Anthropic format

curl https://gw.to11.ai/v1/messages \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Response:
{
  "id": "msg-abc123",
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "text", "text": "Hello!" }],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 10, "output_tokens": 5 }
}

Cross-format routing

Send an OpenAI-format request to an Anthropic model — the gateway translates the request and returns an OpenAI-format response:
curl https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
System messages are correctly forwarded: the gateway extracts {"role": "system"} messages and passes them as Anthropic’s top-level system field.
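To make the system-message handling concrete, here is a rough sketch of that kind of translation. It is not the gateway's implementation: the function name, the default_max_tokens fallback, and joining multiple system messages with blank lines are all assumptions, and it only handles plain string content.

```python
from typing import Any, Dict, List


def openai_to_anthropic_body(body: Dict[str, Any],
                             default_max_tokens: int = 1024) -> Dict[str, Any]:
    """Move system messages into a top-level 'system' field (illustrative only)."""
    system_parts: List[str] = []
    messages: List[Dict[str, Any]] = []
    for msg in body["messages"]:
        if msg["role"] == "system":
            system_parts.append(msg["content"])  # assumes string content
        else:
            messages.append(msg)
    out: Dict[str, Any] = {
        "model": body["model"],
        "messages": messages,
        # Anthropic's messages API requires max_tokens; the fallback is an assumption.
        "max_tokens": body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        out["system"] = "\n\n".join(system_parts)
    return out
```

The remaining user and assistant messages keep their original order, and no system field is emitted when the request had no system message.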

Streaming

Add "stream": true to any request to receive Server-Sent Events:
curl -N https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to 3"}],
    "stream": true
  }'
The gateway streams in the caller’s native SSE format. See Streaming for details on fast-path vs normalized-path behavior.
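For a caller using the OpenAI format, consuming the stream might look like the sketch below. It assumes the standard OpenAI chunk shape (choices[].delta.content) and the data: [DONE] terminator; iter_openai_deltas is a hypothetical helper that takes already-decoded text lines from whatever HTTP client is in use.

```python
import json
from typing import Iterable, Iterator


def iter_openai_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (illustrative only)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the assistant message. Anthropic-format streams use different event types and would need a separate parser.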

Structured output

The response_format field is supported for OpenAI models:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "List 3 colors as JSON"}],
  "response_format": { "type": "json_object" }
}
Supported types: text, json_object, json_schema. Anthropic models do not support json_object or json_schema — requests with these formats routed to Anthropic return 400 Bad Request.

Tool calls

The tools field is forwarded to the upstream provider. For Anthropic models, tool definitions are automatically translated to the input_schema format:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }]
}
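The tool translation described above can be pictured roughly as follows. This sketch only illustrates the shape change between the OpenAI function-tool format and Anthropic's name / description / input_schema format; the function name and the empty-schema fallback are assumptions, not the gateway's actual code.

```python
from typing import Any, Dict, List


def openai_tools_to_anthropic(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert OpenAI function tools to Anthropic tool definitions (illustrative)."""
    converted: List[Dict[str, Any]] = []
    for tool in tools:
        if tool.get("type") != "function":
            continue  # this sketch only maps function tools
        fn = tool["function"]
        converted.append({
            "name": fn["name"],
            "description": fn.get("description", ""),
            # OpenAI's "parameters" JSON Schema becomes "input_schema".
            "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
        })
    return converted
```

Applied to the get_weather definition above, the result is a single tool whose input_schema is the original parameters object, unchanged.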

Models

List all configured models or retrieve a single model. The list is built from your project’s gateway configuration — no upstream provider call is made.
# List all models
curl https://gw.to11.ai/v1/models \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"
Response:
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "created": 0, "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "created": 0, "owned_by": "anthropic" }
  ]
}
# Get a single model
curl https://gw.to11.ai/v1/models/gpt-4o \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"

Embeddings

Passthrough proxy to OpenAI-compatible providers. The request is routed by model name. Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "hello world"}'
Input text is scanned by the security pipeline (PII + blocklist) when security is enabled. Embedding model names must be listed in the provider’s models config array.

Token counting

Passthrough proxy to Anthropic’s /messages/count_tokens endpoint. Only Anthropic models are supported — OpenAI models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/messages/count_tokens \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "How many tokens is this?"}]
  }'
Response:
{"input_tokens": 14}

Image generation

Passthrough proxy to the image generation endpoint of OpenAI-compatible providers (for example, DALL-E models). Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/images/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "dall-e-3", "prompt": "a white cat", "n": 1, "size": "1024x1024"}'
The prompt field is scanned by the security pipeline when enabled. Image model names must be listed in the provider’s models config array.

Audio transcription

Passthrough proxy to OpenAI’s Whisper endpoint. The request must be multipart/form-data containing an audio file. No security scanning is applied (binary audio input).
curl -X POST https://gw.to11.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@recording.mp3 \
  -F model=whisper-1

Audio speech (TTS)

Passthrough proxy to OpenAI’s text-to-speech endpoint. Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/audio/speech \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello world", "voice": "alloy"}' \
  --output speech.mp3
The input field is scanned by the security pipeline when enabled.

Files

Passthrough proxy to OpenAI’s file management endpoints. Supports upload (multipart), list, retrieve, download, and delete. Anthropic models return 400 Bad Request. No security scanning is applied.
# Upload a file
curl -X POST https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose=assistants \
  -F file=@data.jsonl

# List files
curl https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Retrieve file metadata
curl https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Download file content
curl https://gw.to11.ai/v1/files/file-abc123/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  --output downloaded.jsonl

# Delete a file
curl -X DELETE https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Response headers

The gateway adds these headers so clients can classify and correlate responses without parsing the body:
Header                    When                              Value
x-to11-error-code         Every error response              Stable to11 error classification (e.g. rate_limit, auth, model_not_found)
x-to11-upstream-provider  Errors that reached an upstream   Provider slug that produced the error (e.g. openai, anthropic)
Retry-After               Upstream 429 that carried it      Seconds to wait before retrying
x-to11-cache              When response caching is enabled  hit or miss
x-to11-cache-age          On a cache hit                    Age of the cached entry, in seconds
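One way a client might collect these headers into a single value is sketched below. The GatewayMeta type and read_gateway_headers function are hypothetical; the sketch treats header names case-insensitively and returns None for any header that is absent or not numeric where a number is expected.

```python
from typing import Callable, Mapping, NamedTuple, Optional


class GatewayMeta(NamedTuple):
    error_code: Optional[str]
    upstream_provider: Optional[str]
    retry_after_s: Optional[float]
    cache: Optional[str]          # "hit" or "miss" when caching is enabled
    cache_age_s: Optional[int]


def read_gateway_headers(headers: Mapping[str, str]) -> GatewayMeta:
    """Extract the gateway's response headers (hypothetical helper)."""
    h = {k.lower(): v for k, v in headers.items()}

    def num(name: str, cast: Callable[[str], float]):
        value = h.get(name)
        if value is None:
            return None
        try:
            return cast(value)
        except ValueError:
            return None  # e.g. a Retry-After given as an HTTP date

    return GatewayMeta(
        error_code=h.get("x-to11-error-code"),
        upstream_provider=h.get("x-to11-upstream-provider"),
        retry_after_s=num("retry-after", float),
        cache=h.get("x-to11-cache"),
        cache_age_s=num("x-to11-cache-age", int),
    )
```

A caller could then branch on meta.error_code and sleep for meta.retry_after_s when it is set, without parsing the response body.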

Error responses

Scenario                                      Status
Unknown model                                 404 Not Found
Input guardrail violation (PII or blocklist)  400 Bad Request
Output guardrail violation (non-streaming)    422 Unprocessable Entity
Ambiguous model name                          409 Conflict
Unsupported response_format for provider      400 Bad Request
Upstream provider 4xx                         Same status, passed through (e.g. 401, 404, 429)
Upstream provider 5xx                         502 Bad Gateway (message redacted)
Errors are returned as a JSON object with a top-level error key, in the caller’s SDK shape:
{ "error": { "type": "...", "code": "...", "message": "<description>" } }
The stable classification lives in the x-to11-error-code header rather than the body text. See Error handling for the full error-code taxonomy, cross-provider translation, and retry guidance.
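Since the stable classification is carried in a header and the body is meant for humans, a client-side error parser might look like this sketch. GatewayError and parse_gateway_error are hypothetical names, and the sketch tolerates bodies that are not JSON or lack the error object.

```python
import json
from typing import Mapping, NamedTuple, Optional


class GatewayError(NamedTuple):
    status: int
    code: Optional[str]     # from x-to11-error-code: use this for branching
    message: Optional[str]  # from the body: use this for logs and display


def parse_gateway_error(status: int,
                        headers: Mapping[str, str],
                        body: str) -> GatewayError:
    """Combine status, header classification, and body message (illustrative)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    message = None
    try:
        err = json.loads(body).get("error")
        if isinstance(err, dict):
            message = err.get("message")
    except (ValueError, AttributeError):
        pass  # non-JSON body, or JSON that is not an object
    return GatewayError(status, lowered.get("x-to11-error-code"), message)
```

Branching on the code field rather than on message text keeps client logic independent of the caller's SDK shape.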