API Reference

The gateway is reachable at https://gw.to11.ai/v1. It exposes chat ingress endpoints, model discovery, embeddings, token counting, reranking, image generation, audio (transcription and text-to-speech), and file management. Each chat endpoint accepts requests in its native SDK format, normalizes them internally, and responds in the caller’s format.

Send x-to11-authorization and x-to11-project-id on every request (see Authentication). For brevity, most examples on this page show only the upstream provider credential — Authorization: Bearer (or x-api-key for Anthropic-style requests) — and omit the two x-to11-* headers. Drop the provider credential entirely when it is managed in the dashboard.

Endpoints

Endpoint	Method	Description
`/v1/responses`	POST	Canonical chat ingress (OpenAI / xAI / Anthropic shapes)
`/v1/chat/completions`	POST	OpenAI / xAI chat completions
`/v1/messages`	POST	Anthropic messages API
`/v1/models`	GET	List all configured models
`/v1/models/{model_id}`	GET	Get single model info
`/v1/embeddings`	POST	Embeddings (OpenAI-compatible providers only)
`/v1/messages/count_tokens`	POST	Token counting (Anthropic only)
`/v1/rerank`	POST	Rerank (Cohere and other rerank-capable providers)
`/v1/images/generations`	POST	Image generation (OpenAI-compatible providers only)
`/v1/audio/transcriptions`	POST	Audio transcription (OpenAI, multipart/form-data)
`/v1/audio/speech`	POST	Text-to-speech (OpenAI-compatible providers only)
`/v1/files`	POST	File upload (OpenAI-compatible providers only)
`/v1/files`	GET	List files
`/v1/files/{file_id}`	GET	Retrieve file metadata
`/v1/files/{file_id}/content`	GET	Download file content
`/v1/files/{file_id}`	DELETE	Delete a file

The model field determines which upstream provider handles the request. In addition to plain model names (e.g. gpt-4o), the model field supports namespace prefixes (separated by ::) for explicit routing:

function::name — route to a named function defined in your to11 project’s routing
route::name — route to a named route
provider::model — route to a specific provider (e.g. anthropic::claude-sonnet-4-6)
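As a rough client-side illustration of the prefix syntax above, the sketch below splits a model string on the first `::`. The function name `parse_model_field` and the returned tuple shape are hypothetical, and the sketch mirrors only the documented syntax; how the gateway itself resolves prefixes is not specified here.

```python
from typing import Optional, Tuple


def parse_model_field(model: str) -> Tuple[str, Optional[str], str]:
    """Split a model string into (kind, provider, name).

    kind is "function", "route", "provider", or "plain".
    Hypothetical helper: it follows the documented syntax only and is
    not the gateway's own parser.
    """
    prefix, sep, rest = model.partition("::")
    if not sep:
        # No namespace prefix: a plain model name such as "gpt-4o".
        return ("plain", None, model)
    if prefix in ("function", "route"):
        return (prefix, None, rest)
    # Any other prefix is treated as a provider slug (an assumption).
    return ("provider", prefix, rest)
```

For example, `parse_model_field("anthropic::claude-sonnet-4-6")` yields the kind `"provider"` with provider `"anthropic"`.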

Provider-native passthrough routes also exist under /v1/bedrock/*, /v1/vertex/*, and /v1/mistral/* for advanced provider-specific APIs; those forward request bodies to the upstream verbatim and are gated to providers that support them.

Headers

Authentication

Gateway requests use two independent credentials:

Your to11 platform key authenticates the request to to11 and is sent in the x-to11-authorization header. The x-to11-project-id header selects which project’s routing and provider configuration applies.
```
x-to11-authorization: Bearer <your-to11-api-key>
x-to11-project-id: <your-project-id>
```
The upstream provider credential. If the project’s provider credentials are configured in the dashboard (project → Gateway), send nothing extra. Otherwise, pass your own provider key (BYOK) in the standard Authorization: Bearer <provider-key> header (or x-api-key for Anthropic-style requests); the gateway forwards that key to the upstream provider.

The standard Authorization header carries the upstream provider key, not your to11 key. Put the to11 key in x-to11-authorization. When provider credentials are managed in the dashboard, you can omit Authorization entirely.
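The split between the two credentials can be sketched as a small header builder. This is a minimal sketch, not an official client: the function name `build_gateway_headers` and its parameters are hypothetical, and only the header names come from this page.

```python
from typing import Dict, Optional


def build_gateway_headers(
    to11_key: str,
    project_id: str,
    provider_key: Optional[str] = None,
    anthropic_style: bool = False,
) -> Dict[str, str]:
    """Assemble request headers for the gateway (hypothetical helper)."""
    headers = {
        # The to11 platform key and project id go on every request.
        "x-to11-authorization": f"Bearer {to11_key}",
        "x-to11-project-id": project_id,
        "Content-Type": "application/json",
    }
    if provider_key is not None:
        # BYOK: the standard header carries the upstream provider key.
        if anthropic_style:
            headers["x-api-key"] = provider_key
        else:
            headers["Authorization"] = f"Bearer {provider_key}"
    # With dashboard-managed credentials, no provider header is added.
    return headers
```

Passing no `provider_key` corresponds to the dashboard-managed case, where the provider headers are omitted entirely.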

SDK detection

The gateway auto-detects the caller’s SDK format from the endpoint. Override with:

x-genai-sdk: openai | anthropic | xai

Provider hints

Force routing to a specific provider:

x-genai-provider: openai | anthropic | xai

Or use a provider::model prefix in the model field:

{ "model": "anthropic::claude-sonnet-4-6", "messages": [...] }

If both header and prefix are present, they must agree or the request is rejected.
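A client can catch this disagreement before sending the request. The check below is a sketch under assumptions: the function name `provider_hint_conflict` is hypothetical, and the comparison is an exact string match, which may be stricter or looser than what the gateway applies.

```python
from typing import Optional


def provider_hint_conflict(model: str, header_provider: Optional[str]) -> bool:
    """Return True when a provider:: prefix and x-genai-provider disagree.

    Hypothetical pre-flight check; exact string comparison is an assumption.
    """
    prefix, sep, _ = model.partition("::")
    if not sep or header_provider is None:
        return False  # only one hint (or none) is present
    if prefix in ("function", "route"):
        return False  # these prefixes do not name a provider
    return prefix != header_provider
```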

Chat completions

OpenAI format

curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 256
  }'

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Paris." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17 }
}
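The same request can be prepared with Python's standard library. This is a sketch rather than an official client: `build_chat_request` and its parameters are hypothetical names, and the request is only constructed here, not sent.

```python
import json
import urllib.request

GATEWAY_URL = "https://gw.to11.ai/v1/chat/completions"


def build_chat_request(to11_key, project_id, provider_key, prompt):
    """Build (but do not send) an OpenAI-format chat request."""
    body = {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    # Because data is set, urllib issues this request as a POST.
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-to11-authorization": f"Bearer {to11_key}",
            "x-to11-project-id": project_id,
            "Authorization": f"Bearer {provider_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.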

Anthropic format

curl https://gw.to11.ai/v1/messages \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Response:

{
  "id": "msg-abc123",
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "text", "text": "Hello!" }],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 10, "output_tokens": 5 }
}

Cross-format routing

Send an OpenAI-format request to an Anthropic model — the gateway translates the request and returns an OpenAI-format response:

curl https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

System messages are preserved: the gateway extracts {"role": "system"} messages and passes them in Anthropic’s top-level system field.
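The system-message handling described above can be pictured with the sketch below. It is an illustration, not the gateway's code: the function name `to_anthropic_payload` is hypothetical, joining several system messages with a blank line is an assumption, and the translation of other fields is omitted.

```python
from typing import Any, Dict, List


def to_anthropic_payload(openai_body: Dict[str, Any]) -> Dict[str, Any]:
    """Move system messages into a top-level "system" field (illustrative)."""
    system_parts: List[str] = []
    messages: List[Dict[str, Any]] = []
    for msg in openai_body["messages"]:
        if msg["role"] == "system":
            system_parts.append(msg["content"])
        else:
            messages.append(msg)

    payload: Dict[str, Any] = {"model": openai_body["model"], "messages": messages}
    if system_parts:
        # Assumption: multiple system messages are joined with a blank line.
        payload["system"] = "\n\n".join(system_parts)
    if "max_tokens" in openai_body:
        payload["max_tokens"] = openai_body["max_tokens"]
    return payload
```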

Streaming

Add "stream": true to any request to receive Server-Sent Events:

curl -N https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to 3"}],
    "stream": true
  }'

The gateway streams in the caller’s native SSE format. See Streaming for details on fast-path vs normalized-path behavior.
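For an OpenAI-format caller, the stream can be consumed by reading `data:` lines until the `[DONE]` sentinel. The parser below is a sketch that assumes the usual OpenAI chat-completions chunk shape (`choices[].delta.content`); the function name `iter_openai_sse_text` is hypothetical, and Anthropic-format streams use different event payloads that are not handled here.

```python
import json
from typing import Iterable, Iterator


def iter_openai_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Any iterable of decoded lines works as input, for example the lines of an HTTP response body read incrementally.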

Structured output

The response_format field is supported for OpenAI models:

{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "List 3 colors as JSON"}],
  "response_format": { "type": "json_object" }
}

Supported types: text, json_object, json_schema. Anthropic models do not support json_object or json_schema — requests with these formats routed to Anthropic return 400 Bad Request.
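A client that routes across providers may want to flag the documented rejection case before sending. The sketch below encodes only the rule stated above; the function name `documented_as_rejected` is hypothetical, and a `False` result means "not documented as rejected", not a guarantee of support.

```python
from typing import Any, Dict


def documented_as_rejected(provider: str, response_format: Dict[str, Any]) -> bool:
    """True for the one case this page documents as a 400 Bad Request."""
    # Assumption: a missing "type" is treated like plain text output.
    kind = response_format.get("type", "text")
    return provider == "anthropic" and kind in ("json_object", "json_schema")
```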

Tool calls

The tools field is forwarded to the upstream provider. For Anthropic models, tool definitions are automatically translated to Anthropic’s input_schema format. An OpenAI-format request with a tool definition looks like this:

{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }]
}
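The translation to Anthropic's tool shape amounts to lifting the function fields to the top level and renaming `parameters` to `input_schema`. The sketch below illustrates that mapping; the function name `openai_tool_to_anthropic` is hypothetical, and the empty-object default for a missing `parameters` field is an assumption, not documented gateway behavior.

```python
from typing import Any, Dict


def openai_tool_to_anthropic(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one OpenAI function tool into Anthropic's tool shape."""
    fn = tool["function"]
    converted: Dict[str, Any] = {
        "name": fn["name"],
        # Assumption: tools without parameters get an empty object schema.
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }
    if "description" in fn:
        converted["description"] = fn["description"]
    return converted
```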

Models

List all configured models or retrieve a single model. The list is built from your project’s gateway configuration — no upstream provider call is made.

# List all models
curl https://gw.to11.ai/v1/models \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"

{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "created": 0, "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "created": 0, "owned_by": "anthropic" }
  ]
}

# Get a single model
curl https://gw.to11.ai/v1/models/gpt-4o \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"
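Once decoded, the list response is easy to index by provider. The helper below is a small sketch that relies only on the `data`, `id`, and `owned_by` fields shown in the sample response; the name `model_ids_by_owner` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def model_ids_by_owner(listing: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group model ids from a /v1/models list response by owned_by."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in listing.get("data", []):
        grouped[entry["owned_by"]].append(entry["id"])
    return dict(grouped)
```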

Embeddings

Passthrough proxy to OpenAI-compatible providers. The request is routed by model name. Anthropic models return 400 Bad Request.

curl -X POST https://gw.to11.ai/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "hello world"}'

Input text is scanned by the security pipeline (PII + blocklist) when security is enabled. Embedding model names must be listed in the provider’s models config array.

Token counting

Passthrough proxy to Anthropic’s /messages/count_tokens endpoint. Only Anthropic models are supported — OpenAI models return 400 Bad Request.

curl -X POST https://gw.to11.ai/v1/messages/count_tokens \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "How many tokens is this?"}]
  }'

Response: {"input_tokens": 14}

Image generation

Passthrough proxy to the image generation endpoint of OpenAI-compatible providers (for example, DALL-E models). Anthropic models return 400 Bad Request.

curl -X POST https://gw.to11.ai/v1/images/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "dall-e-3", "prompt": "a white cat", "n": 1, "size": "1024x1024"}'

The prompt field is scanned by the security pipeline when enabled. Image model names must be listed in the provider’s models config array.

Audio transcription

Passthrough proxy to OpenAI’s Whisper endpoint. The request must be multipart/form-data containing an audio file. No security scanning is applied (binary audio input).

curl -X POST https://gw.to11.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@recording.mp3 \
  -F model=whisper-1

Audio speech (TTS)

Passthrough proxy to OpenAI’s text-to-speech endpoint. Anthropic models return 400 Bad Request.

curl -X POST https://gw.to11.ai/v1/audio/speech \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello world", "voice": "alloy"}' \
  --output speech.mp3

The input field is scanned by the security pipeline when enabled.

Files

Passthrough proxy to OpenAI’s file management endpoints. Supports upload (multipart), list, retrieve, download, and delete. Anthropic models return 400 Bad Request. No security scanning is applied.

# Upload a file
curl -X POST https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose=assistants \
  -F file=@data.jsonl

# List files
curl https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Retrieve file metadata
curl https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Download file content
curl https://gw.to11.ai/v1/files/file-abc123/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  --output downloaded.jsonl

# Delete a file
curl -X DELETE https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Response headers

The gateway adds these headers so clients can classify and correlate responses without parsing the body:

Header	When	Value
`x-to11-error-code`	Every error response	Stable to11 error classification (e.g. `rate_limit`, `auth`, `model_not_found`)
`x-to11-upstream-provider`	Errors that reached an upstream	Provider slug that produced the error (e.g. `openai`, `anthropic`)
`Retry-After`	Upstream 429 that carried it	Seconds to wait before retrying
`x-to11-cache`	When response caching is enabled	`hit` or `miss`
`x-to11-cache-age`	On a cache hit	Age of the cached entry, in seconds
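A client might collect these headers into one structure before deciding how to react. The sketch below reads only the headers listed in the table; the function name `summarize_gateway_headers` and the keys of the returned dictionary are hypothetical.

```python
from typing import Any, Dict, Mapping, Optional


def _to_float(value: Optional[str]) -> Optional[float]:
    """Parse a numeric header value, returning None if absent or non-numeric."""
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None


def summarize_gateway_headers(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Extract the gateway's classification headers (case-insensitive)."""
    h = {k.lower(): v for k, v in headers.items()}
    return {
        "error_code": h.get("x-to11-error-code"),
        "upstream_provider": h.get("x-to11-upstream-provider"),
        "retry_after_s": _to_float(h.get("retry-after")),
        # None means the cache header was absent (caching not enabled).
        "cache_hit": (h["x-to11-cache"] == "hit") if "x-to11-cache" in h else None,
        "cache_age_s": _to_float(h.get("x-to11-cache-age")),
    }
```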

Error responses

Scenario	Status
Unknown model	`404 Not Found`
Input guardrail violation (PII or blocklist)	`400 Bad Request`
Output guardrail violation (non-streaming)	`422 Unprocessable Entity`
Ambiguous model name	`409 Conflict`
Unsupported `response_format` for provider	`400 Bad Request`
Upstream provider 4xx	Same status, passed through (e.g. `401`, `404`, `429`)
Upstream provider 5xx	`502 Bad Gateway` (message redacted)

Errors are returned as a JSON object with a top-level error key, in the caller’s SDK shape:

{ "error": { "type": "...", "code": "...", "message": "<description>" } }

The stable classification lives in the x-to11-error-code header rather than the body text. See Error handling for the full error-code taxonomy, cross-provider translation, and retry guidance.
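Putting the header and body together, error handling on the client side might look like the sketch below. The `GatewayError` class and `parse_gateway_error` function are hypothetical names; the sketch takes the classification from the header and falls back to the raw body when it is not the documented JSON shape.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class GatewayError:
    status: int
    code: Optional[str]  # value of x-to11-error-code, if present
    message: str


def parse_gateway_error(status: int, headers: Mapping[str, str], body: str) -> GatewayError:
    """Combine status, classification header, and body message."""
    code = next(
        (v for k, v in headers.items() if k.lower() == "x-to11-error-code"), None
    )
    try:
        err = json.loads(body).get("error", {})
        message = err.get("message", "") if isinstance(err, dict) else str(err)
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object: keep the raw text.
        message = body
    return GatewayError(status=status, code=code, message=message)
```

Branching on `code` rather than on `message` keeps client logic stable if the body wording changes.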
