The gateway is reachable at https://gw.to11.ai/v1. It exposes chat ingress endpoints, model discovery, embeddings, reranking, token counting, image generation, audio (transcription and TTS), and file management. Each chat endpoint accepts requests in its native SDK format, normalizes them internally, and responds in the caller’s format.
Send x-to11-authorization and x-to11-project-id on every request (see Authentication). For brevity, most examples on this page show only the upstream provider credential — Authorization: Bearer (or x-api-key for Anthropic-style requests) — and omit the two x-to11-* headers. Drop the provider credential entirely when it is managed in the dashboard.
Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/responses | POST | Canonical chat ingress (OpenAI / xAI / Anthropic shapes) |
| /v1/chat/completions | POST | OpenAI / xAI chat completions |
| /v1/messages | POST | Anthropic messages API |
| /v1/models | GET | List all configured models |
| /v1/models/{model_id} | GET | Get single model info |
| /v1/embeddings | POST | Embeddings (OpenAI-compatible providers only) |
| /v1/messages/count_tokens | POST | Token counting (Anthropic only) |
| /v1/rerank | POST | Rerank (Cohere and other rerank-capable providers) |
| /v1/images/generations | POST | Image generation (OpenAI-compatible providers only) |
| /v1/audio/transcriptions | POST | Audio transcription (OpenAI, multipart/form-data) |
| /v1/audio/speech | POST | Text-to-speech (OpenAI-compatible providers only) |
| /v1/files | POST | File upload (OpenAI-compatible providers only) |
| /v1/files | GET | List files |
| /v1/files/{file_id} | GET | Retrieve file metadata |
| /v1/files/{file_id}/content | GET | Download file content |
| /v1/files/{file_id} | DELETE | Delete a file |
The model field determines which upstream provider handles the request. In addition to plain model names (e.g. gpt-4o), the model field supports namespace prefixes (separated by ::) for explicit routing:
- function::name routes to a named function defined in your to11 project’s routing
- route::name routes to a named route
- provider::model routes to a specific provider (e.g. anthropic::claude-sonnet-4-6)
Provider-native passthrough routes also exist under /v1/bedrock/*, /v1/vertex/*, and /v1/mistral/* for advanced provider-specific APIs; those forward request bodies to the upstream verbatim and are gated to providers that support them.
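To make the prefix rules concrete, here is a minimal client-side sketch that splits a model string on the first ::. The ModelRef type and parse_model_ref function are hypothetical names, and treating every prefix other than function or route as a provider slug is an assumption of this sketch, not documented gateway behavior.

```python
from typing import NamedTuple, Optional

# Namespace prefixes named on this page; anything else is assumed to be a provider slug.
KNOWN_NAMESPACES = {"function", "route"}


class ModelRef(NamedTuple):
    kind: str              # "plain", "function", "route", or "provider"
    prefix: Optional[str]  # text before "::", if any
    name: str              # text after "::", or the whole string for plain names


def parse_model_ref(model: str) -> ModelRef:
    """Split a model field value into its routing prefix and name."""
    prefix, sep, rest = model.partition("::")
    if not sep:
        return ModelRef("plain", None, model)
    if prefix in KNOWN_NAMESPACES:
        return ModelRef(prefix, prefix, rest)
    # Assumption: any other prefix names a provider (e.g. "anthropic").
    return ModelRef("provider", prefix, rest)


if __name__ == "__main__":
    for value in ("gpt-4o", "route::fast", "anthropic::claude-sonnet-4-6"):
        print(value, "->", parse_model_ref(value))
```

A check like this can help catch typos in prefixes before a request is sent; the gateway remains the authority on what is actually routable.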
Authentication
Gateway requests use two independent credentials:
- Your to11 platform key authenticates the request to to11 and is sent in the x-to11-authorization header. The x-to11-project-id header selects which project’s routing and provider configuration applies.
  x-to11-authorization: Bearer <your-to11-api-key>
  x-to11-project-id: <your-project-id>
- The upstream provider credential. Either the project’s provider credentials are configured in the dashboard (project → Gateway), in which case you send nothing extra; or you pass your own provider key (BYOK) in the standard Authorization: Bearer <provider-key> header (or x-api-key for Anthropic-style requests). The gateway forwards that key to the upstream provider.
The standard Authorization header carries the upstream provider key, not your to11 key. Put the to11 key in x-to11-authorization. When provider credentials are managed in the dashboard, you can omit Authorization entirely.
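The two-credential scheme can be sketched as a small header builder. The function name build_gateway_headers and its parameters are hypothetical; only the header names come from this page.

```python
from typing import Dict, Optional


def build_gateway_headers(
    to11_api_key: str,
    project_id: str,
    provider_key: Optional[str] = None,
    anthropic_style: bool = False,
) -> Dict[str, str]:
    """Assemble request headers for a gateway call.

    provider_key is the BYOK upstream credential; leave it as None when the
    provider credentials are managed in the dashboard.
    """
    headers = {
        "x-to11-authorization": f"Bearer {to11_api_key}",
        "x-to11-project-id": project_id,
        "Content-Type": "application/json",
    }
    if provider_key is not None:
        if anthropic_style:
            # Anthropic-style requests carry the provider key in x-api-key.
            headers["x-api-key"] = provider_key
        else:
            headers["Authorization"] = f"Bearer {provider_key}"
    return headers


if __name__ == "__main__":
    print(build_gateway_headers("to11-key", "proj-123", provider_key="sk-example"))
```

Keeping the to11 key and the provider key in separate arguments makes it harder to put the to11 key into Authorization by mistake.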
SDK detection
The gateway auto-detects the caller’s SDK format from the endpoint. Override with:
x-genai-sdk: openai | anthropic | xai
Provider hints
Force routing to a specific provider:
x-genai-provider: openai | anthropic | xai
Or use a provider::model prefix in the model field:
{ "model": "anthropic::claude-sonnet-4-6", "messages": [...] }
If both header and prefix are present, they must agree or the request is rejected.
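Because a mismatch between the header and the prefix is rejected, a client may want to check agreement before sending. The sketch below is one way to do that locally; resolve_provider_hint is a hypothetical helper, and it assumes that function:: and route:: prefixes are not provider hints.

```python
from typing import Optional

# Prefixes that select a function or route rather than a provider.
NON_PROVIDER_PREFIXES = {"function", "route"}


def resolve_provider_hint(model: str, header_provider: Optional[str]) -> Optional[str]:
    """Return the effective provider hint, or raise if header and prefix disagree."""
    prefix, sep, _ = model.partition("::")
    prefix_provider = prefix if sep and prefix not in NON_PROVIDER_PREFIXES else None
    if header_provider and prefix_provider and header_provider != prefix_provider:
        raise ValueError(
            f"x-genai-provider={header_provider!r} conflicts with "
            f"model prefix {prefix_provider!r}"
        )
    return header_provider or prefix_provider


if __name__ == "__main__":
    print(resolve_provider_hint("anthropic::claude-sonnet-4-6", "anthropic"))
```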
Chat completions
curl https://gw.to11.ai/v1/chat/completions \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 256
  }'
Response:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Paris." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17 }
}
curl https://gw.to11.ai/v1/messages \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Response:
{
  "id": "msg-abc123",
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "text", "text": "Hello!" }],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 10, "output_tokens": 5 }
}
Send an OpenAI-format request to an Anthropic model — the gateway translates the request and returns an OpenAI-format response:
curl https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
System messages are preserved in translation: the gateway extracts {"role": "system"} messages from the messages array and passes them in Anthropic’s top-level system field.
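The system-message handling can be illustrated with a rough sketch of the same transformation done by hand. This is not the gateway's implementation: the function names are hypothetical, message content is assumed to be plain strings, joining multiple system messages with blank lines is an assumption, and the default max_tokens value is an arbitrary placeholder (Anthropic's Messages API requires the field).

```python
from typing import List, Optional, Tuple


def split_system_messages(messages: List[dict]) -> Tuple[Optional[str], List[dict]]:
    """Separate system messages from the rest of an OpenAI-style message list."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Assumption: several system messages are joined with a blank line.
    system = "\n\n".join(system_parts) if system_parts else None
    return system, rest


def to_anthropic_body(openai_body: dict, default_max_tokens: int = 1024) -> dict:
    """Build an Anthropic-style body from an OpenAI-style chat body (illustrative only)."""
    system, messages = split_system_messages(openai_body["messages"])
    body = {
        "model": openai_body["model"],
        "messages": messages,
        # Placeholder default; pick a value that suits your use case.
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
    }
    if system is not None:
        body["system"] = system
    return body


if __name__ == "__main__":
    request = {
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "system", "content": "Be brief."},
            {"role": "user", "content": "Hello!"},
        ],
    }
    print(to_anthropic_body(request))
```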
Streaming
Add "stream": true to any request to receive Server-Sent Events:
curl -N https://gw.to11.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to 3"}],
    "stream": true
  }'
The gateway streams in the caller’s native SSE format. See Streaming for details on fast-path vs normalized-path behavior.
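For callers using the OpenAI chat completions shape, a stream can be consumed roughly as follows. The sketch assumes the standard OpenAI SSE framing (data: lines carrying JSON chunks with choices[].delta.content, terminated by data: [DONE]); iter_openai_deltas is a hypothetical helper and works on already-decoded text lines, so it can be fed from any HTTP client.

```python
import json
from typing import Iterable, Iterator


def iter_openai_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
        "",
        'data: {"choices":[{"index":0,"delta":{"content":"1, "}}]}',
        'data: {"choices":[{"index":0,"delta":{"content":"2, 3"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_openai_deltas(sample)))
```

Anthropic-style streams use a different event vocabulary, so a caller on /v1/messages would need a different parser.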
Structured output
The response_format field is supported for OpenAI models:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "List 3 colors as JSON"}],
  "response_format": { "type": "json_object" }
}
Supported types: text, json_object, json_schema. Anthropic models do not support json_object or json_schema — requests with these formats routed to Anthropic return 400 Bad Request.
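A client can avoid a round trip that ends in 400 by checking response_format locally. The following sketch encodes only the rules stated above; response_format_problem is a hypothetical helper, and the provider argument is assumed to be a slug such as "openai" or "anthropic" that the caller already knows.

```python
from typing import Optional

SUPPORTED_TYPES = {"text", "json_object", "json_schema"}
# Types this page says are rejected when routed to Anthropic.
ANTHROPIC_REJECTED = {"json_object", "json_schema"}


def response_format_problem(provider: str, body: dict) -> Optional[str]:
    """Return a human-readable problem, or None if the request looks acceptable."""
    fmt = body.get("response_format")
    if fmt is None:
        return None
    fmt_type = fmt.get("type")
    if fmt_type not in SUPPORTED_TYPES:
        return f"unsupported response_format type: {fmt_type!r}"
    if provider == "anthropic" and fmt_type in ANTHROPIC_REJECTED:
        return f"{fmt_type} is not supported for Anthropic models"
    return None


if __name__ == "__main__":
    body = {"model": "claude-sonnet-4-6", "response_format": {"type": "json_object"}}
    print(response_format_problem("anthropic", body))
```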
Tools
The tools field is forwarded to the upstream provider. For Anthropic models, tool definitions are automatically translated to the input_schema format:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }]
}
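For readers unfamiliar with the two tool shapes, the translation looks roughly like this: an OpenAI function tool nests its JSON Schema under function.parameters, while an Anthropic tool is a flat object with name, description, and input_schema. The helper below is a sketch of that mapping, not the gateway's code; the empty-object fallback schema is an assumption for tools that declare no parameters.

```python
def openai_tool_to_anthropic(tool: dict) -> dict:
    """Map one OpenAI-style function tool to an Anthropic-style tool definition."""
    if tool.get("type") != "function":
        raise ValueError(f"unsupported tool type: {tool.get('type')!r}")
    fn = tool["function"]
    converted = {
        "name": fn["name"],
        # Assumption: tools without parameters get an empty object schema.
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }
    if "description" in fn:
        converted["description"] = fn["description"]
    return converted


if __name__ == "__main__":
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }
    print(openai_tool_to_anthropic(weather_tool))
```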
Models
List all configured models or retrieve a single model. The list is built from your project’s gateway configuration — no upstream provider call is made.
# List all models
curl https://gw.to11.ai/v1/models \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "created": 0, "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "created": 0, "owned_by": "anthropic" }
  ]
}
# Get a single model
curl https://gw.to11.ai/v1/models/gpt-4o \
  -H "x-to11-authorization: Bearer $TO11_API_KEY" \
  -H "x-to11-project-id: $TO11_PROJECT_ID"
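As a small usage sketch, the list response can be grouped by owned_by to see which providers a project has configured. models_by_owner is a hypothetical helper and relies only on the fields shown in the sample response above.

```python
from collections import defaultdict
from typing import Dict, List


def models_by_owner(list_response: dict) -> Dict[str, List[str]]:
    """Group model ids from a /v1/models response by their owned_by value."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for model in list_response.get("data", []):
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)


if __name__ == "__main__":
    sample = {
        "object": "list",
        "data": [
            {"id": "gpt-4o", "object": "model", "created": 0, "owned_by": "openai"},
            {"id": "claude-sonnet-4-6", "object": "model", "created": 0, "owned_by": "anthropic"},
        ],
    }
    print(models_by_owner(sample))
```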
Embeddings
Passthrough proxy to OpenAI-compatible providers. The request is routed by model name. Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "hello world"}'
Input text is scanned by the security pipeline (PII + blocklist) when security is enabled. Embedding model names must be listed in the provider’s models config array.
Token counting
Passthrough proxy to Anthropic’s /messages/count_tokens endpoint. Only Anthropic models are supported — OpenAI models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/messages/count_tokens \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "How many tokens is this?"}]
  }'
Response: {"input_tokens": 14}
Image generation
Passthrough proxy to the image generation endpoint of OpenAI-compatible providers (for example, DALL-E models). Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/images/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "dall-e-3", "prompt": "a white cat", "n": 1, "size": "1024x1024"}'
The prompt field is scanned by the security pipeline when enabled. Image model names must be listed in the provider’s models config array.
Audio transcription
Passthrough proxy to OpenAI’s Whisper endpoint. The request must be multipart/form-data containing an audio file. No security scanning is applied (binary audio input).
curl -X POST https://gw.to11.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@recording.mp3 \
  -F model=whisper-1
Audio speech (TTS)
Passthrough proxy to OpenAI’s text-to-speech endpoint. Anthropic models return 400 Bad Request.
curl -X POST https://gw.to11.ai/v1/audio/speech \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello world", "voice": "alloy"}' \
  --output speech.mp3
The input field is scanned by the security pipeline when enabled.
Files
Passthrough proxy to OpenAI’s file management endpoints. Supports upload (multipart), list, retrieve, download, and delete. Anthropic models return 400 Bad Request. No security scanning is applied.
# Upload a file
curl -X POST https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose=assistants \
  -F file=@data.jsonl
# List files
curl https://gw.to11.ai/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY"
# Retrieve file metadata
curl https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"
# Download file content
curl https://gw.to11.ai/v1/files/file-abc123/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  --output downloaded.jsonl
# Delete a file
curl -X DELETE https://gw.to11.ai/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"
Response headers
The gateway adds these headers so clients can classify and correlate responses without parsing the body:
| Header | When | Value |
|---|---|---|
| x-to11-error-code | Every error response | Stable to11 error classification (e.g. rate_limit, auth, model_not_found) |
| x-to11-upstream-provider | Errors that reached an upstream | Provider slug that produced the error (e.g. openai, anthropic) |
| Retry-After | Upstream 429 that carried it | Seconds to wait before retrying |
| x-to11-cache | When response caching is enabled | hit or miss |
| x-to11-cache-age | On a cache hit | Age of the cached entry, in seconds |
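The headers in the table can be read into a small structure without touching the response body. The sketch below assumes a plain mapping of header names to values; GatewayInfo and read_gateway_headers are hypothetical names, and a non-numeric Retry-After is simply treated as absent.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class GatewayInfo:
    error_code: Optional[str] = None
    upstream_provider: Optional[str] = None
    retry_after_s: Optional[float] = None
    cache_hit: Optional[bool] = None   # None when caching headers are absent
    cache_age_s: Optional[int] = None


def _to_number(value: Optional[str], cast: Callable):
    """Convert a header value with cast, returning None if missing or malformed."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def read_gateway_headers(headers: Mapping[str, str]) -> GatewayInfo:
    """Extract the gateway metadata headers, matching names case-insensitively."""
    h = {k.lower(): v for k, v in headers.items()}
    cache = h.get("x-to11-cache")
    return GatewayInfo(
        error_code=h.get("x-to11-error-code"),
        upstream_provider=h.get("x-to11-upstream-provider"),
        retry_after_s=_to_number(h.get("retry-after"), float),
        cache_hit=None if cache is None else cache.strip().lower() == "hit",
        cache_age_s=_to_number(h.get("x-to11-cache-age"), int),
    )


if __name__ == "__main__":
    print(read_gateway_headers({"x-to11-cache": "hit", "x-to11-cache-age": "42"}))
```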
Error responses
| Scenario | Status |
|---|---|
| Unknown model | 404 Not Found |
| Input guardrail violation (PII or blocklist) | 400 Bad Request |
| Output guardrail violation (non-streaming) | 422 Unprocessable Entity |
| Ambiguous model name | 409 Conflict |
| Unsupported response_format for provider | 400 Bad Request |
| Upstream provider 4xx | Same status, passed through (e.g. 401, 404, 429) |
| Upstream provider 5xx | 502 Bad Gateway (message redacted) |
Errors are returned as a JSON object with a top-level error key, in the caller’s SDK shape:
{ "error": { "type": "...", "code": "...", "message": "<description>" } }
The stable classification lives in the x-to11-error-code header rather than the body text. See Error handling for the full error-code taxonomy, cross-provider translation, and retry guidance.
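One way to apply this on the client is to build an exception from the status, headers, and body, preferring the header for the classification. GatewayError and error_from_response are hypothetical names; falling back to the body's code or type field and truncating an unparseable body to 200 characters are choices made for this sketch.

```python
import json
from typing import Mapping


class GatewayError(Exception):
    """Client-side representation of a gateway error response."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def error_from_response(status: int, headers: Mapping[str, str], body_text: str) -> GatewayError:
    """Build a GatewayError, taking the stable code from x-to11-error-code when present."""
    h = {k.lower(): v for k, v in headers.items()}
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None  # body was not JSON
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    # Header first; the body fields are only a fallback in this sketch.
    code = h.get("x-to11-error-code") or err.get("code") or err.get("type") or "unknown"
    message = err.get("message") or body_text[:200]
    return GatewayError(status, str(code), str(message))


if __name__ == "__main__":
    body = '{"error": {"type": "not_found", "code": "x", "message": "no such model"}}'
    print(error_from_response(404, {"x-to11-error-code": "model_not_found"}, body))
```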