Models

Model overrides are optional entries in [models.*] that let you adjust settings for individual models without changing the provider-wide configuration. They exist to handle the cases where a single model’s requirements diverge from the provider norm.

Definition

Each model override is a named TOML table under the [models] section. The table key must match a model name that appears in a provider’s models list.
[models.o3]
timeout_ms = 120000

[models.o4-mini]
timeout_ms = 60000

Supported fields

| Field | Type | Description |
| --- | --- | --- |
| timeout_ms | integer | Request timeout in milliseconds for this specific model. |
Currently timeout_ms is the only supported override. Additional per-model settings may be added in future releases.

When to use model overrides

The primary use case is reasoning models. Models like o3 and o4-mini routinely take 60 to 120 seconds to respond, well beyond the 30-second default that works for most chat completion models. Rather than raising the timeout for the entire OpenAI provider (which would delay error detection for faster models like gpt-4o), you can set a targeted override for only the slow models.

Timeout resolution chain

The gateway resolves timeouts by checking four levels in priority order. The first value found wins:
[models.o3] timeout_ms            <-- model-specific override
  --> [providers.openai] timeout_ms   <-- provider default
    --> [defaults.provider] timeout_ms  <-- global default section
      --> 30 000 ms                     <-- hardcoded fallback
A complete example putting it all together:
[defaults.provider]
timeout_ms = 30000

[providers.openai]
base_url = "https://api.openai.com/v1"
models   = ["gpt-4o", "gpt-4o-mini", "o3", "o4-mini"]
# timeout_ms omitted — falls through to defaults.provider (30s)

[models.o3]
timeout_ms = 120000    # 2 minutes for reasoning

[models.o4-mini]
timeout_ms = 60000     # 1 minute for lighter reasoning
With this configuration, gpt-4o and gpt-4o-mini time out after 30 seconds, o4-mini after 60 seconds, and o3 after 120 seconds.
Model overrides apply to every request for that model regardless of whether the request uses passthrough routing, managed routing, or function routing. The override is keyed on the model name, not the route or target.

Next steps

Providers

Provider configuration and credential resolution.

Configuration Reference

Full TOML configuration reference.