Configuration Reference

The gateway loads its configuration from a TOML file. By default it reads config/gateway.toml relative to the working directory. All sections except [server] are optional. For an explanation of routing concepts, see Routing. For provider-specific details, see Providers.

Full config reference

[server]
host = "127.0.0.1"
port = 4000

[defaults.provider]
timeout_ms = 30000

[providers.openai]
base_url = "https://api.openai.com/v1"
models = ["gpt-4o", "gpt-4o-mini", "o3"]
timeout_ms = 30000
# credential defaults to "env::OPENAI_API_KEY" for the openai provider

[providers.anthropic]
base_url = "https://api.anthropic.com/v1"
credential = "env::ANTHROPIC_API_KEY"
models = ["claude-sonnet-4-6", "claude-haiku-4-5"]
timeout_ms = 60000

[models.o3]
timeout_ms = 120000

[targets.openai-primary]
model = "gpt-4o"
weight = 70
credential = "env::MANAGED_OPENAI_KEY"

[targets.openai-fallback]
model = "gpt-4o"
weight = 30
credential = "env::MANAGED_OPENAI_KEY_2"

[routes.balanced-gpt4o]
endpoint = "chat"
models = ["gpt-4o"]
strategy = "weighted"
targets = ["openai-primary", "openai-fallback"]

[functions.summarize]
endpoint = "chat"
strategy = "fallback"
models = ["gpt-4o", "claude-sonnet-4-6"]

[functions.summarize.retry]
max_retries = 4
backoff_base_ms = 1000

[routing.retry]
max_retries = 2
backoff_base_ms = 500

[security]
enabled = true
blocklist = ["ignore previous instructions", "dan mode"]

# [tenant_routing]
# enabled = true
# redis_url = "redis://localhost:6379"
# lru_capacity = 1000
# anti_entropy_interval_secs = 300
# api_fallback_url = "http://localhost:4500/internal/routing/configs"
# api_fallback_timeout_ms = 5000
# api_fallback_auth_token = ""
# lru_idle_timeout_secs = 3600
# private_key_path = "/etc/gateway/x25519.key"

[telemetry]
enabled = true
otlp_endpoint = "http://localhost:4317"
service_name = "gateway"
environment = "production"
sample_ratio = 0.1
log_level = "info"

[genai_telemetry]
enabled = false
otlp_endpoint = "http://localhost:4317"
service_name = "gateway-genai-telemetry"
environment = "production"
sample_ratio = 1.0
metrics_enabled = true
metrics_export_interval_secs = 30
metrics_export_timeout_secs = 10
capture_content = true
otlp_bearer_token = ""

Section reference

`[server]`

Network binding for the gateway process.

Field	Type	Default	Description
`host`	string	`"127.0.0.1"`	Bind address
`port`	integer	`4000`	Listen port

`[defaults.provider]`

Global defaults applied to every provider unless the provider overrides them explicitly.

Field	Type	Default	Description
`timeout_ms`	integer	—	Default request timeout in milliseconds for all providers
`auth_type`	string	—	Default authentication method: `bearer`, `api_key_header`, or `query_param`

`[providers.*]`

Each named table registers a provider and the models it serves. The table key is the provider identifier (e.g., openai, anthropic, groq).

Field	Type	Required	Description
`base_url`	string	Yes	Provider API base URL
`models`	array	Yes	Model names this provider handles
`credential`	string	No	Credential location (e.g., `"env::OPENAI_API_KEY"`, `"none"`). When omitted, uses the per-provider convention default.
`timeout_ms`	integer	No	Request timeout in milliseconds. Overrides `[defaults.provider].timeout_ms`.
`auth_type`	string	No	Authentication method: `bearer` (default), `api_key_header`, or `query_param`. Overrides `[defaults.provider].auth_type`.
`auth_header_name`	string	No	Custom header name when `auth_type` is not standard (e.g., `"api-key"` for Azure OpenAI)

When credential is omitted, the gateway uses a per-provider convention based on the table key. For example, the openai provider defaults to env::OPENAI_API_KEY, and the anthropic provider defaults to env::ANTHROPIC_API_KEY. Set credential = "none" to disable credential resolution entirely for a provider.
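The resolution rules above can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the function name `resolve_credential` is hypothetical, and the generalized `env::<KEY>_API_KEY` pattern is an assumption extrapolated from the two documented defaults (openai and anthropic).

```python
import os


def resolve_credential(provider_key, credential=None, environ=os.environ):
    """Return the secret for a provider, or None when credentials are disabled."""
    if credential is None:
        # Assumption: the convention default is env::<PROVIDER_KEY>_API_KEY.
        # Only the openai and anthropic defaults are documented.
        credential = f"env::{provider_key.upper().replace('-', '_')}_API_KEY"
    if credential == "none":
        # "none" disables credential resolution for this provider.
        return None
    scheme, sep, name = credential.partition("::")
    if scheme != "env" or not sep or not name:
        raise ValueError(f"unsupported credential location: {credential!r}")
    if name not in environ:
        raise KeyError(f"environment variable {name} is not set")
    return environ[name]
```

For example, `resolve_credential("openai")` reads `OPENAI_API_KEY` from the environment, while `resolve_credential("local", "none")` returns `None` without touching the environment.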

`[models.*]`

Per-model overrides. The table key is the model name as it appears in a provider’s models array.

Field	Type	Default	Description
`timeout_ms`	integer	—	Request timeout in milliseconds. Overrides the provider timeout.

The timeout resolution order is: [models.*].timeout_ms > [providers.*].timeout_ms > [defaults.provider].timeout_ms > hardcoded default (30 000 ms).

`[targets.*]`

Each named table declares a managed routing target — a specific model paired with gateway-owned credentials. The table key is the target identifier referenced by routes and functions.

Field	Type	Required	Description
`model`	string	Yes	Upstream model name (must exist in a provider’s `models` array)
`credential`	string	No	Credential location for this target. Overrides the provider-level credential.
`weight`	integer	No	Weight for the `weighted` strategy (default: 1)
`timeout_ms`	integer	No	Request timeout in milliseconds. Overrides the provider timeout for this target.

When credential is set to an "env::VAR_NAME" location, the environment variable must be present at startup. The gateway fails fast on missing managed credentials rather than discovering the problem at request time.
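A pre-deployment check along the same lines can catch missing variables before the gateway does. The sketch below is an assumption about how such a check might look; `missing_target_credentials` is a hypothetical helper, not part of the gateway.

```python
import os


def missing_target_credentials(targets, environ=os.environ):
    """Return (target, variable) pairs whose env:: credential is unset."""
    missing = []
    for name, target in targets.items():
        credential = target.get("credential", "")
        if credential.startswith("env::"):
            variable = credential[len("env::"):]
            if variable not in environ:
                missing.append((name, variable))
    return missing


TARGETS = {
    "openai-primary": {"model": "gpt-4o", "credential": "env::MANAGED_OPENAI_KEY"},
    "openai-fallback": {"model": "gpt-4o", "credential": "env::MANAGED_OPENAI_KEY_2"},
}
```

Calling `missing_target_credentials(TARGETS)` in a deploy script and exiting non-zero on a non-empty result mirrors the fail-fast behaviour described above.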

`[routes.*]`

Each named table defines a managed route that maps one or more model names to a set of targets with a selection strategy. The table key is the route identifier.

Field	Type	Required	Description
`endpoint`	string	Yes	Endpoint type: `chat`, `embeddings`, `audio_speech`, `audio_transcription`, or `image_generation`
`models`	array	Yes	Model names this route handles
`strategy`	string	Yes	Selection strategy: `single`, `weighted`, or `fallback`
`targets`	array	Yes*	Target names for simple routes. Required unless `steps` is provided.
`steps`	array	No	Multi-step route definition (overrides `targets`). Each step has its own `strategy` and `targets`.

When steps is present, each step defines its own strategy and targets. Steps execute as an ordered fallback chain. The top-level targets field is ignored when steps is provided.

`[routes.*.retry]`

Per-route retry configuration. When present, overrides the global [routing.retry] settings for this route.

Field	Type	Default	Description
`max_retries`	integer	`2`	Maximum retry attempts
`backoff_base_ms`	integer	`500`	Base backoff in milliseconds

`[functions.*]`

Each named table defines a function — a named routing alias that maps a task name to a strategy and set of models or targets. The table key is the function name (e.g., summarize, extract-entities). Callers invoke a function by sending model: "function::summarize" or, when no provider or route shares the name, simply model: "summarize" (resolved via top-down lookup).

Field	Type	Required	Description
`endpoint`	string	Yes	Endpoint type: `chat`, `embeddings`, `audio_speech`, `audio_transcription`, or `image_generation`
`strategy`	string	Yes	Selection strategy: `single`, `weighted`, `fallback`, or `experiment`
`models`	array	No	Inline model shorthand — model names resolved through configured providers
`targets`	array	No	Named target references (for weights, custom credentials, or timeouts)
`steps`	array	No	Multi-step definition. Each step has its own `strategy` and `targets`.

models and targets are mutually exclusive on a function. When models is used, the gateway creates ephemeral targets internally by resolving each model name through the provider configuration. Each inline model must exist in exactly one provider’s models array.
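Both constraints can be checked before deployment. The following is a minimal sketch; `validate_function` is a hypothetical helper and the error messages are illustrative, not the gateway's own.

```python
PROVIDERS = {
    "openai": {"models": ["gpt-4o", "gpt-4o-mini", "o3"]},
    "anthropic": {"models": ["claude-sonnet-4-6", "claude-haiku-4-5"]},
}


def validate_function(name, function, providers):
    """Return a list of validation errors for one [functions.*] table."""
    errors = []
    if "models" in function and "targets" in function:
        errors.append(f"functions.{name}: 'models' and 'targets' are mutually exclusive")
    for model in function.get("models", []):
        # Each inline model must be listed by exactly one provider.
        owners = [key for key, provider in providers.items()
                  if model in provider.get("models", [])]
        if len(owners) != 1:
            errors.append(
                f"functions.{name}: model {model!r} is listed by "
                f"{len(owners)} providers, expected exactly 1"
            )
    return errors
```

With the providers from the full example, the `summarize` function (`models = ["gpt-4o", "claude-sonnet-4-6"]`) produces no errors.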

`[functions.*.variants.*]`

When strategy = "experiment", variants define the A/B test arms. Each named table under variants is a variant configuration.

Field	Type	Required	Description
`model`	string	Yes*	Model name for this variant. Required unless `target` is set. Mutually exclusive with `target`.
`target`	string	No	Named target reference. Mutually exclusive with `model`.
`weight`	integer	Yes	Relative weight for random selection. Must be > 0.
(params)	varies	No	Additional key-value pairs override request parameters (e.g., `temperature`, `max_tokens`).

Params are validated against an allowlist per endpoint type. Unknown params pass through with a startup warning. Protected fields (model, messages, input, file, prompt, stream) cannot be set as params.

`EndpointType` enum

The endpoint field on functions and routes accepts one of:

Value	HTTP Endpoint
`chat`	`/v1/chat/completions`, `/v1/responses`, `/v1/messages`
`embeddings`	`/v1/embeddings`
`audio_speech`	`/v1/audio/speech`
`audio_transcription`	`/v1/audio/transcriptions`
`image_generation`	`/v1/images/generations`

`[functions.*.retry]`

Per-function retry configuration. When present, overrides the global [routing.retry] settings for this function.

Field	Type	Default	Description
`max_retries`	integer	`2`	Maximum retry attempts
`backoff_base_ms`	integer	`500`	Base backoff in milliseconds

`[routing]`

Global routing behaviour settings. This section is optional — when omitted, all sub-sections use their defaults.

Field	Type	Default	Description
`[routing.retry]`	table	absent	Global retry configuration for managed routes and functions
`[routing.circuit_breaker]`	table	—	Deprecated. Accepted for backward compatibility but ignored at runtime.

`[routing.retry]`

Global retry settings applied to all managed routes and functions that do not declare their own [routes.*.retry] or [functions.*.retry] block.

Field	Type	Default	Description
`max_retries`	integer	`2`	Maximum retry attempts
`backoff_base_ms`	integer	`500`	Base backoff in milliseconds

The retry layer wraps every managed-route attempt with exponential backoff (powered by the backon crate).
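To get a feel for the delays these settings produce, the sketch below computes a retry schedule. It assumes a doubling factor with no jitter and a field-by-field merge of per-route or per-function overrides onto the global values. Neither detail is specified in this reference, and both helper names are hypothetical.

```python
DEFAULT_RETRY = {"max_retries": 2, "backoff_base_ms": 500}


def effective_retry(global_retry=None, override=None):
    """Merge defaults, [routing.retry], and a per-route/function override."""
    merged = dict(DEFAULT_RETRY)
    merged.update(global_retry or {})
    merged.update(override or {})
    return merged


def backoff_schedule_ms(retry):
    """Delay before each retry, assuming the delay doubles every attempt."""
    base = retry["backoff_base_ms"]
    return [base * 2 ** attempt for attempt in range(retry["max_retries"])]


# Global defaults: two retries.
print(backoff_schedule_ms(effective_retry()))
# [functions.summarize.retry] from the full example: four retries from 1000 ms.
print(backoff_schedule_ms(effective_retry(
    {"max_retries": 2, "backoff_base_ms": 500},
    {"max_retries": 4, "backoff_base_ms": 1000},
)))
```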

`[routing.circuit_breaker]` (deprecated)

The [routing.circuit_breaker] section is accepted for backward compatibility but ignored at runtime. The circuit breaker was removed in PR #232 and replaced by stateless fast sequential failover via FallbackStrategy combined with retry and exponential backoff. Use [routing.retry] with the fallback strategy instead. If enabled = true is set, a deprecation warning is logged at startup.

Field	Type	Default	Description
`enabled`	bool	`false`	Deprecated. Accepted but ignored.
`failure_threshold`	integer	`5`	Deprecated. Accepted but ignored.
`recovery_timeout_ms`	integer	`30000`	Deprecated. Accepted but ignored.

`[security]`

Inline security pipeline for input guardrails.

Field	Type	Default	Description
`enabled`	bool	`false`	Enable the inline security pipeline
`blocklist`	array	`[]`	Blocked terms for case-insensitive substring matching

When enabled, the security pipeline runs PII detection (always on) and blocklist matching on every request before it reaches the upstream provider. See Security Overview for details.
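Case-insensitive substring matching can be sketched in a few lines. This covers only the blocklist step, not PII detection, and `find_blocked_terms` is a hypothetical name rather than the gateway's API.

```python
def find_blocked_terms(text, blocklist):
    """Return the blocklist terms that occur in text, ignoring case."""
    lowered = text.casefold()
    return [term for term in blocklist if term.casefold() in lowered]


BLOCKLIST = ["ignore previous instructions", "dan mode"]
print(find_blocked_terms("Please IGNORE previous instructions now", BLOCKLIST))
```

Because matching is by substring, a term also matches inside longer words, so short blocklist entries can produce false positives.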

`[telemetry]`

Application-level telemetry (HTTP spans, latency). These traces go to your application observability stack (e.g., Tempo via Alloy).

Field	Type	Default	Description
`enabled`	bool	`true`	Enable application OTLP export
`otlp_endpoint`	string	`"http://localhost:4317"`	gRPC OTLP endpoint
`service_name`	string	`"gateway"`	OpenTelemetry service name
`environment`	string	`"production"`	Deployment environment tag
`sample_ratio`	float	`0.1`	Trace sampling ratio (0.0 to 1.0)
`log_level`	string	`"info"`	Log level filter

`[genai_telemetry]`

Dedicated GenAI provider telemetry sink. These traces and metrics go to the OTel Collector and on to ClickHouse.

Field	Type	Default	Description
`enabled`	bool	`false`	Enable GenAI OTLP export
`otlp_endpoint`	string	`"http://localhost:4317"`	gRPC OTLP endpoint
`service_name`	string	`"gateway-genai-telemetry"`	Service name for GenAI spans
`environment`	string	`"production"`	Deployment environment tag
`sample_ratio`	float	`1.0`	Trace sampling ratio (0.0 to 1.0)
`metrics_enabled`	bool	`true`	Enable histogram metrics export
`metrics_export_interval_secs`	integer	`30`	Metrics push interval in seconds
`metrics_export_timeout_secs`	integer	`10`	Metrics push timeout in seconds
`capture_content`	bool	`true`	Record prompt/completion text in spans
`otlp_bearer_token`	string	`""`	Bearer token for gRPC `Authorization` metadata

`[auth]`

Optional per-request platform authentication for customer-hosted gateways. When enabled, the gateway exchanges a platform credential for a short-lived collector JWT and exports GenAI telemetry with that request-scoped token.

Field	Type	Required	Default	Description
`enabled`	bool	No	`false`	Enable platform auth exchange
`exchange_url`	string	Yes*	—	Token exchange endpoint URL. Required when `enabled = true`.
`api_token`	string	No	—	Fallback API token when `x-to11-authorization` header is absent
`project_id`	string	No	—	Fallback project ID when `x-to11-project-id` header is absent
`cache_skew_seconds`	integer	No	`60`	Token cache skew tolerance in seconds
`exchange_timeout_ms`	integer	No	`2000`	HTTP timeout for the token exchange request in milliseconds

`[tenant_routing]`

Per-tenant routing configuration for platform mode (self-hosted or managed SaaS). When enabled, each project gets its own routing config loaded from Redis with API fallback. This section is only relevant when running in multi-tenant platform mode.

Field	Type	Required	Default	Description
`enabled`	bool	No	`false`	Enable per-tenant routing from DB/Redis. When `false`, the gateway uses static TOML config.
`redis_url`	string	No	`"redis://localhost:6379"`	Redis/Valkey connection URL for loading config snapshots
`lru_capacity`	integer	No	`1000`	Maximum number of tenant configs cached in memory
`anti_entropy_interval_secs`	integer	No	`300`	Interval between anti-entropy sweeps (seconds). Must be > `0`; invalid values fall back to the default.
`api_fallback_url`	string	No	—	Base URL for the API fallback endpoint (e.g., `http://api:4500/internal/routing/configs`)
`api_fallback_timeout_ms`	integer	No	`5000`	Timeout for API fallback requests in milliseconds
`api_fallback_auth_token`	string	No	—	Bearer token for authenticating API fallback requests. Sent as `Authorization: Bearer <token>`.
`lru_idle_timeout_secs`	integer	No	`3600`	Evict cached configs idle longer than this (seconds)
`private_key_path`	string	No	—	Path to 32-byte X25519 private key for decrypting Redis envelope-encrypted snapshots. When absent, the gateway expects plaintext JSON in Redis (dev mode only).

The api_fallback_auth_token should be a service-scoped API key with routing:read permission. The API fallback endpoint (GET /internal/routing/configs/:projectId) requires this permission. In production, additionally restrict the endpoint at the network level (VPC, security groups) to gateway instances only.

When private_key_path is not set, the gateway operates in plaintext mode — config snapshots in Redis are not encrypted. This is acceptable for local development but must not be used in production where Redis may contain provider API keys.
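The lookup path implied by these settings (in-memory cache bounded by `lru_capacity`, then Redis, then the API fallback) can be sketched as below. The class and its loader callbacks are hypothetical. Idle-timeout eviction, anti-entropy sweeps, and decryption are left out, and the real gateway may order or combine these steps differently.

```python
from collections import OrderedDict


class TenantConfigCache:
    """LRU cache in front of two loaders: Redis first, then the API fallback."""

    def __init__(self, capacity, load_from_redis, load_from_api):
        self.capacity = capacity
        self.load_from_redis = load_from_redis
        self.load_from_api = load_from_api
        self.entries = OrderedDict()

    def get(self, project_id):
        if project_id in self.entries:
            self.entries.move_to_end(project_id)  # mark as most recently used
            return self.entries[project_id]
        config = self.load_from_redis(project_id)
        if config is None:
            config = self.load_from_api(project_id)
        if config is None:
            raise KeyError(project_id)
        self.entries[project_id] = config
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return config
```

Each loader is expected to return a config object or `None` when the project is unknown to that source.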

`[cache]`

Response caching backed by Valkey (Redis-compatible). When enabled, identical LLM requests can be served from the cache without calling the upstream provider.

Field	Type	Required	Default	Description
`enabled`	bool	No	`false`	Enable response caching
`url`	string	No	`"redis://localhost:6379"`	Valkey/Redis connection URL
`default_mode`	string	No	`"auto"`	Cache mode: `auto`, `always`, or `never`
`ttl_seconds`	integer	No	`604800`	Default TTL in seconds (1 week)
`max_ttl_seconds`	integer	No	`604800`	Maximum allowed TTL — caps per-request overrides
`max_entry_size_bytes`	integer	No	`1048576`	Maximum cached response size in bytes (1 MiB)
`pool_size`	integer	No	`8`	Connection pool size
`encrypt`	bool	No	`true`	AES-256-GCM encryption at rest
`encrypt_salt`	string	No	`""`	Server salt for HKDF key derivation. Required when `encrypt = true`.

When encrypt is true, encrypt_salt must be a non-empty string. The gateway refuses to start if encryption is enabled without a salt.
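The salt requirement and the TTL cap can be expressed as two small checks. This is a sketch under assumptions: the salt check is applied only when `enabled = true`, and a per-request TTL is simply clamped to `max_ttl_seconds`. Both function names are hypothetical.

```python
WEEK_SECONDS = 604_800


def validate_cache_config(cache):
    """Raise if caching is enabled with encryption on but no salt."""
    if not cache.get("enabled", False):
        return  # assumption: the check only applies when caching is enabled
    if cache.get("encrypt", True) and not cache.get("encrypt_salt", ""):
        raise ValueError("[cache] encrypt = true requires a non-empty encrypt_salt")


def effective_ttl_seconds(cache, requested=None):
    """Use the requested TTL (or the default), capped at max_ttl_seconds."""
    default_ttl = cache.get("ttl_seconds", WEEK_SECONDS)
    max_ttl = cache.get("max_ttl_seconds", WEEK_SECONDS)
    ttl = requested if requested is not None else default_ttl
    return min(ttl, max_ttl)
```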

Config file location

Override the default path with the GATEWAY_CONFIG environment variable:

GATEWAY_CONFIG=/path/to/config.toml cargo run --release -p gateway
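Tooling that needs to find the same file as the gateway can mirror this lookup. The helper below is a hypothetical sketch, not gateway code.

```python
import os
from pathlib import Path

DEFAULT_CONFIG_PATH = "config/gateway.toml"  # relative to the working directory


def config_path(environ=os.environ):
    """Return GATEWAY_CONFIG if set, otherwise the default path."""
    return Path(environ.get("GATEWAY_CONFIG", DEFAULT_CONFIG_PATH))
```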

Environment variable overrides

Environment variables take precedence over TOML values.

Application telemetry

Env var	Overrides
`GATEWAY_TELEMETRY_ENABLED`	`[telemetry] enabled`
`OTEL_EXPORTER_OTLP_ENDPOINT`	`[telemetry] otlp_endpoint`
`SERVICE_NAME`	`[telemetry] service_name`
`ENVIRONMENT`	`[telemetry] environment`
`OTEL_TRACES_SAMPLER_RATIO`	`[telemetry] sample_ratio`
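The precedence rule can be sketched as a merge of environment values over the parsed `[telemetry]` table. The variable-to-field mapping comes from the table above. The helper name and the boolean parsing (`"1"` or `"true"`) are assumptions, since the accepted spellings are not documented here.

```python
import os


def parse_bool(raw):
    # Assumption: "1" and "true" (any case) mean enabled; anything else is false.
    return raw.strip().lower() in {"1", "true"}


TELEMETRY_ENV = {
    "GATEWAY_TELEMETRY_ENABLED": ("enabled", parse_bool),
    "OTEL_EXPORTER_OTLP_ENDPOINT": ("otlp_endpoint", str),
    "SERVICE_NAME": ("service_name", str),
    "ENVIRONMENT": ("environment", str),
    "OTEL_TRACES_SAMPLER_RATIO": ("sample_ratio", float),
}


def apply_env_overrides(section, mapping, environ=os.environ):
    """Return a copy of a TOML section with environment values applied on top."""
    merged = dict(section)
    for variable, (field, parse) in mapping.items():
        if variable in environ:
            merged[field] = parse(environ[variable])
    return merged
```

For instance, with `sample_ratio = 0.1` in the file and `OTEL_TRACES_SAMPLER_RATIO=0.5` in the environment, the merged value is `0.5`.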

Tenant routing

Env var	Overrides
`GATEWAY_TENANT_ROUTING_ENABLED`	`[tenant_routing] enabled`
`GATEWAY_TENANT_ROUTING_REDIS_URL`	`[tenant_routing] redis_url`
`GATEWAY_TENANT_ROUTING_LRU_CAPACITY`	`[tenant_routing] lru_capacity`
`GATEWAY_TENANT_ROUTING_ANTI_ENTROPY_INTERVAL_SECS`	`[tenant_routing] anti_entropy_interval_secs`
`GATEWAY_TENANT_ROUTING_API_FALLBACK_URL`	`[tenant_routing] api_fallback_url`
`GATEWAY_TENANT_ROUTING_API_FALLBACK_TIMEOUT_MS`	`[tenant_routing] api_fallback_timeout_ms`
`GATEWAY_TENANT_ROUTING_API_FALLBACK_AUTH_TOKEN`	`[tenant_routing] api_fallback_auth_token`
`GATEWAY_TENANT_ROUTING_LRU_IDLE_TIMEOUT_SECS`	`[tenant_routing] lru_idle_timeout_secs`
`GATEWAY_TENANT_ROUTING_PRIVATE_KEY_PATH`	`[tenant_routing] private_key_path`

GenAI telemetry

Env var	Overrides
`GATEWAY_GENAI_TELEMETRY_ENABLED`	`[genai_telemetry] enabled`
`GATEWAY_GENAI_OTLP_ENDPOINT`	`[genai_telemetry] otlp_endpoint`
`GATEWAY_GENAI_SERVICE_NAME`	`[genai_telemetry] service_name`
`GATEWAY_GENAI_ENVIRONMENT`	`[genai_telemetry] environment`
`GATEWAY_GENAI_SAMPLE_RATIO`	`[genai_telemetry] sample_ratio`
`GATEWAY_GENAI_CAPTURE_CONTENT`	`[genai_telemetry] capture_content`
`GATEWAY_GENAI_OTLP_BEARER_TOKEN`	`[genai_telemetry] otlp_bearer_token`

Logging

Control log verbosity with the RUST_LOG environment variable:

RUST_LOG=debug cargo run -p gateway
RUST_LOG=gateway_core=trace,info cargo run -p gateway
