Documentation Index

Fetch the complete documentation index at: https://to11.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.

API Reference

The gateway exposes chat ingress endpoints, model discovery, embeddings, token counting, image generation, audio (transcription + TTS), and file management. Each chat endpoint accepts requests in its native SDK format, normalises them internally, and responds in the caller’s format.

Health check

GET /health  ->  200 OK

Endpoints

Endpoint                     Method  Description
/v1/chat/completions         POST    OpenAI / xAI chat completions
/v1/messages                 POST    Anthropic messages API
/v1/responses                POST    Canonical gateway endpoint (OpenAI / xAI / Anthropic)
/v1/models                   GET     List all configured models
/v1/models/{model_id}        GET     Get single model info
/v1/embeddings               POST    Embeddings (OpenAI-compatible providers only)
/v1/messages/count_tokens    POST    Token counting (Anthropic only)
/v1/images/generations       POST    Image generation (OpenAI-compatible providers only)
/v1/audio/transcriptions     POST    Audio transcription (OpenAI, multipart/form-data)
/v1/audio/speech             POST    Text-to-speech (OpenAI-compatible providers only)
/v1/files                    POST    File upload (OpenAI-compatible providers only)
/v1/files                    GET     List files
/v1/files/{file_id}          GET     Retrieve file metadata
/v1/files/{file_id}/content  GET     Download file content
/v1/files/{file_id}          DELETE  Delete a file
The model field determines which upstream provider handles the request. In addition to plain model names (e.g. gpt-4o), the model field supports namespace prefixes for explicit routing:
  • function::name — route to a named function defined in your gateway config
  • route::name — route to a named route
  • provider::model — route to a specific provider (e.g. anthropic::claude-sonnet-4-6)

Headers

Authorization

The gateway passes your API key through to the upstream provider. Use the Authorization header as you normally would with the provider:
Authorization: Bearer sk-...

SDK detection

The gateway auto-detects the caller’s SDK format from the endpoint. Override with:
x-genai-sdk: openai | anthropic | xai

Provider hints

Force routing to a specific provider:
x-genai-provider: openai | anthropic | xai
Or use a provider::model prefix in the model field:
{ "model": "anthropic::claude-sonnet-4-6", "messages": [...] }
If both header and prefix are present, they must agree or the request is rejected.

Chat completions

OpenAI format

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 256
  }'
Response:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Paris." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17 }
}

Anthropic format

curl http://localhost:4000/v1/messages \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Response:
{
  "id": "msg-abc123",
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "text", "text": "Hello!" }],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 10, "output_tokens": 5 }
}

Cross-format routing

Send an OpenAI-format request to an Anthropic model — the gateway translates the request and returns an OpenAI-format response:
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
System messages are correctly forwarded: the gateway extracts {"role": "system"} messages and passes them as Anthropic’s top-level system field.
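The system-message hoisting can be pictured with this minimal sketch (a hypothetical helper, not the gateway's internals; joining multiple system messages with blank lines is an assumption):

```python
def to_anthropic_body(openai_messages: list[dict]) -> dict:
    """Lift OpenAI-style {"role": "system"} messages into Anthropic's
    top-level system field, leaving user/assistant turns in messages."""
    system_parts = [m["content"] for m in openai_messages if m["role"] == "system"]
    body = {"messages": [m for m in openai_messages if m["role"] != "system"]}
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    return body
```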

Streaming

Add "stream": true to any request to receive Server-Sent Events:
curl -N http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Count to 3"}],
    "stream": true
  }'
The gateway streams in the caller’s native SSE format. See Streaming for details on fast-path vs normalised-path behaviour.
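On the client side, each SSE event arrives as a data: line carrying a JSON chunk, and OpenAI-style streams terminate with data: [DONE]. A minimal parser for such lines might look like this (illustrative only; real clients should prefer the provider SDK or an SSE library):

```python
import json

def parse_sse_lines(lines):
    """Yield decoded JSON chunks from raw SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and event:/comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```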

Structured output

The response_format field is supported for OpenAI models:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "List 3 colors as JSON"}],
  "response_format": { "type": "json_object" }
}
Supported types: text, json_object, json_schema. Anthropic models do not support json_object or json_schema — requests with these formats routed to Anthropic return 400 Bad Request.

Tool calls

The tools field is forwarded to the upstream provider. For Anthropic models, tool definitions are automatically translated to the input_schema format:
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }]
}

Models

List all configured models or retrieve a single model. These are synthetic responses built from the gateway config — no upstream call is made. No authentication required.
# List all models
curl http://localhost:4000/v1/models
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "created": 0, "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "created": 0, "owned_by": "anthropic" }
  ]
}
# Get a single model
curl http://localhost:4000/v1/models/gpt-4o
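Because the response is synthetic, it can be assembled directly from configuration. A sketch of that assembly, assuming a simple provider-to-models mapping (the gateway's actual config schema may differ):

```python
def build_models_response(config: dict[str, list[str]]) -> dict:
    """Build an OpenAI-style /v1/models list from a provider -> models map."""
    return {
        "object": "list",
        "data": [
            {"id": model_id, "object": "model", "created": 0, "owned_by": provider}
            for provider, models in config.items()
            for model_id in models
        ],
    }
```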

Embeddings

Passthrough proxy to OpenAI-compatible providers. The request is routed by model name. Anthropic models return 400 Bad Request.
curl -X POST http://localhost:4000/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "hello world"}'
Input text is scanned by the security pipeline (PII + blocklist) when security is enabled. Embedding model names must be listed in the provider’s models config array.

Token counting

Passthrough proxy to Anthropic’s /messages/count_tokens endpoint. Only Anthropic models are supported — OpenAI models return 400 Bad Request.
curl -X POST http://localhost:4000/v1/messages/count_tokens \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "How many tokens is this?"}]
  }'
Response: {"input_tokens": 14}

Image generation

Passthrough proxy to OpenAI-compatible providers’ DALL-E endpoint. Anthropic models return 400 Bad Request.
curl -X POST http://localhost:4000/v1/images/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "dall-e-3", "prompt": "a white cat", "n": 1, "size": "1024x1024"}'
The prompt field is scanned by the security pipeline when enabled. Image model names must be listed in the provider’s models config array.

Audio transcription

Passthrough proxy to OpenAI’s Whisper endpoint. The request must be multipart/form-data containing an audio file. No security scanning is applied (binary audio input).
curl -X POST http://localhost:4000/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@recording.mp3 \
  -F model=whisper-1

Audio speech (TTS)

Passthrough proxy to OpenAI’s text-to-speech endpoint. Anthropic models return 400 Bad Request.
curl -X POST http://localhost:4000/v1/audio/speech \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello world", "voice": "alloy"}' \
  --output speech.mp3
The input field is scanned by the security pipeline when enabled.

Files

Passthrough proxy to OpenAI’s file management endpoints. Supports upload (multipart), list, retrieve, download, and delete. Anthropic models return 400 Bad Request. No security scanning is applied.
# Upload a file
curl -X POST http://localhost:4000/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose=assistants \
  -F file=@data.jsonl

# List files
curl http://localhost:4000/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Retrieve file metadata
curl http://localhost:4000/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Download file content
curl http://localhost:4000/v1/files/file-abc123/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  --output downloaded.jsonl

# Delete a file
curl -X DELETE http://localhost:4000/v1/files/file-abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Error responses

Scenario                                                        Status
Unknown model                                                   404 Not Found
Security violation (input PII, blocklist, or output guardrail)  400 Bad Request
Provider returned an error                                      502 Bad Gateway
Unsupported response_format for provider                        400 Bad Request
Error body format:
{ "error": { "message": "<description>" } }