Models
Model overrides are optional entries in [models.*] that let you adjust settings for individual models without changing the provider-wide configuration. They exist to handle the cases where a single model's requirements diverge from the provider norm.
Definition
Each model override is a named TOML table under the [models] section. The table key must match a model name that appears in a provider's models list.
Supported fields
| Field | Type | Description |
|---|---|---|
| timeout_ms | integer | Request timeout in milliseconds for this specific model. |
timeout_ms is currently the only supported override field. Additional per-model settings may be added in future releases.
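Given the rules above (a [models.*] table keyed by a model name from a provider's models list), a per-model override might look like the following sketch. The [providers.openai] table and its fields are illustrative assumptions; only [models.*] and timeout_ms are documented in this section.

```toml
# Provider-wide configuration (shape assumed for illustration).
[providers.openai]
models = ["gpt-4o", "gpt-4o-mini", "o3"]

# Per-model override: the table key must match a name in the
# provider's models list above.
[models.o3]
timeout_ms = 120000  # 120 seconds for this model only
```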
When to use model overrides
The primary use case is reasoning models. Models like o3 and o4-mini routinely take 60 to 120 seconds to respond, well beyond the 30-second default that works for most chat completion models. Rather than raising the timeout for the entire OpenAI provider (which would delay error detection for faster models like gpt-4o), you set a targeted override.
Timeout resolution chain
The gateway resolves timeouts by checking four levels in priority order; the first value found wins. For example: gpt-4o and gpt-4o-mini time out after 30 seconds, o4-mini after 60 seconds, and o3 after 120 seconds.
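The example timeouts above could be produced by a configuration like this sketch. Only the [models.*] tables and timeout_ms are documented here; the fallback behavior for models without an override follows the 30-second default mentioned earlier.

```toml
[models.o4-mini]
timeout_ms = 60000   # 60 seconds

[models.o3]
timeout_ms = 120000  # 120 seconds

# gpt-4o and gpt-4o-mini have no override, so they fall back
# through the resolution chain to the 30-second default.
```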
Model overrides apply to every request for that model regardless of whether the request uses passthrough routing, managed routing, or function routing. The override is keyed on the model name, not the route or target.
Next steps
Providers: provider configuration and credential resolution.
Configuration Reference: full TOML configuration reference.