

Function Routing

To decouple task names from model names, define functions in your gateway configuration. Your application sends model: "function::summarize" instead of model: "gpt-4o", and the gateway resolves the function to the configured targets. Swapping models becomes a config change — no application code needs to change.

Prerequisites

  • At least one provider configured with one or more models.

Option A — Inline models shorthand

The simplest approach references model names directly. The gateway creates internal targets for each model automatically.
[providers.openai]
base_url = "https://api.openai.com/v1"
credential = "env::OPENAI_API_KEY"
models = ["gpt-4o", "gpt-4o-mini"]

[providers.anthropic]
base_url = "https://api.anthropic.com/v1"
credential = "env::ANTHROPIC_API_KEY"
models = ["claude-sonnet-4-6"]

[functions.summarize]
endpoint = "chat"
strategy = "fallback"
models = ["gpt-4o", "claude-sonnet-4-6"]
The gateway creates ephemeral targets from the models list. With the fallback strategy, it tries gpt-4o first. If that fails, it falls back to claude-sonnet-4-6. No [targets.*] blocks are needed.
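The fallback behaviour above can be sketched as a loop over targets: try each in order, return the first success, and surface an error only once every target has failed. This is an illustrative sketch, not the gateway's actual internals; the `Target` shape and `dispatchFallback` name are hypothetical.

```typescript
// Hypothetical sketch of the "fallback" strategy: try each target in
// order and return the first successful result.
type Target = { model: string; call: (prompt: string) => string };

function dispatchFallback(targets: Target[], prompt: string): string {
  const errors: string[] = [];
  for (const target of targets) {
    try {
      return target.call(prompt); // first success wins
    } catch (err) {
      errors.push(`${target.model}: ${err}`); // record failure, try the next target
    }
  }
  // All targets exhausted: return an error instead of falling through
  throw new Error(`all targets failed: ${errors.join("; ")}`);
}
```

With the config above, a failure from gpt-4o would mean claude-sonnet-4-6 is tried next.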

Option B — Named targets

When you need per-target weights, custom credentials, or specific timeout overrides, reference named targets instead.
[providers.openai]
base_url = "https://api.openai.com/v1"
credential = "env::OPENAI_API_KEY"
models = ["gpt-4o"]

[targets.openai-primary]
model = "gpt-4o"
weight = 70
credential = "env::MANAGED_OPENAI_KEY"

[targets.openai-secondary]
model = "gpt-4o"
weight = 30
credential = "env::MANAGED_OPENAI_KEY_2"

[functions.extract]
endpoint = "chat"
strategy = "weighted"
targets = ["openai-primary", "openai-secondary"]
This distributes 70% of extract traffic to one API key and 30% to another.
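Weighted selection of this kind is typically implemented by scaling a uniform random draw by the total weight and walking the target list. A minimal sketch, with hypothetical names and the random draw passed in explicitly to keep it deterministic:

```typescript
// Illustrative weighted pick: with weights 70 and 30, a uniform draw
// r in [0, 1) lands on the first target roughly 70% of the time.
type Weighted = { name: string; weight: number };

function pickWeighted(targets: Weighted[], r: number): string {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  let threshold = r * total; // scale the draw to the total weight
  for (const t of targets) {
    threshold -= t.weight;
    if (threshold < 0) return t.name;
  }
  return targets[targets.length - 1].name; // guard against rounding at r -> 1
}
```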

Calling a function

Use the function:: prefix in the model field to route to a function.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "function::summarize",
    "messages": [{"role": "user", "content": "Summarise this article..."}]
  }'
Unprefixed names also work: the gateway resolves them top-down, checking functions first, then routes, then providers. If a function named summarize exists, it matches before any route or provider model with the same name. The explicit function:: prefix is recommended because it makes the routing layer visible and avoids surprises.

Resolution order

When the gateway receives an unprefixed model name, it resolves top-down through three layers.
model = "summarize" (unprefixed)
  |
  v
Layer 3: Does a function named "summarize" exist?
  +-- Yes --> function dispatch
  |
  +-- No
        |
        v
      Layer 2: Does a route match "summarize"?
        +-- Yes --> managed route dispatch
        |
        +-- No
              |
              v
            Layer 1: Is "summarize" in a provider's models list?
              +-- Yes --> passthrough
              +-- No  --> 404 Unknown Model
Explicit prefixes bypass the cascade: function::summarize goes directly to function dispatch without checking routes or providers.
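The cascade can be expressed as a small lookup function. This sketch mirrors the documented order only; the set-based lookups and names are illustrative, not the gateway's real data structures.

```typescript
// Sketch of the three-layer resolution cascade: function, then route,
// then provider passthrough, with explicit prefixes bypassing the walk.
type Resolution = "function" | "route" | "passthrough" | "unknown";

function resolveModel(
  name: string,
  functions: Set<string>,
  routes: Set<string>,
  providerModels: Set<string>,
): Resolution {
  // Explicit prefix bypasses the cascade entirely
  if (name.startsWith("function::")) {
    return functions.has(name.slice("function::".length)) ? "function" : "unknown";
  }
  if (functions.has(name)) return "function";          // Layer 3
  if (routes.has(name)) return "route";                // Layer 2
  if (providerModels.has(name)) return "passthrough";  // Layer 1
  return "unknown"; // 404 Unknown Model
}
```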

Per-function retry

Override the global retry settings for a specific function.
[functions.summarize]
endpoint = "chat"
strategy = "fallback"
models = ["gpt-4o", "claude-sonnet-4-6"]

[functions.summarize.retry]
max_retries = 4
backoff_base_ms = 1000
This applies only to the summarize function. Other functions and routes use the global [routing.retry] settings.
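With a doubling backoff curve, a common scheme for settings like these, max_retries = 4 and backoff_base_ms = 1000 would yield delays of 1, 2, 4, and 8 seconds. The sketch below assumes that doubling scheme; the gateway's exact backoff curve may differ.

```typescript
// Hypothetical exponential-backoff schedule for the retry settings
// above (max_retries = 4, backoff_base_ms = 1000), assuming doubling.
function backoffSchedule(maxRetries: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(baseMs * 2 ** attempt); // 1000, 2000, 4000, 8000
  }
  return delays;
}
```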

Validation rules

A function must use exactly one of models, targets, or steps. The gateway rejects the configuration at load time if multiple fields are present. Each inline model name in the models list must exist in at least one provider’s models array. Functions also support multi-step fallback chains via steps — see Functions (concept) for details.

No fallthrough on failure

If a function matches but all its targets are exhausted (every target has failed and retries are spent), the gateway returns an error to the caller. It does not fall through to Layer 2 routes or Layer 1 passthrough. This behaviour is consistent with managed route handling.

Multi-endpoint functions

Functions work for all endpoint types. Set the endpoint field to declare which HTTP endpoint the function serves.

Embedding function

[functions.embed]
endpoint = "embeddings"
strategy = "single"
models = ["text-embedding-3-small"]
const embedding = await client.embeddings.create({
  model: "function::embed",
  input: "Search query text",
});

Image generation function

[functions.generate-image]
endpoint = "image_generation"
strategy = "single"
models = ["dall-e-3"]
const image = await client.images.generate({
  model: "function::generate-image",
  prompt: "A sunset over mountains",
});

Audio speech function

[functions.speak]
endpoint = "audio_speech"
strategy = "single"
models = ["tts-1"]
const speech = await client.audio.speech.create({
  model: "function::speak",
  input: "Hello, welcome to our platform.",
  voice: "alloy",
});

Audio transcription function

[functions.transcribe]
endpoint = "audio_transcription"
strategy = "single"
models = ["whisper-1"]
Audio transcription functions support model routing and model rewrite, but not variant param overrides. The multipart/form-data body format makes param injection fragile with binary audio data. Operators who need different language or response_format values should create separate functions.

Endpoint mismatch

If you send a request to the wrong endpoint, the gateway returns 400:
function "embed": endpoint mismatch — declared as embeddings, called from chat
This happens when model: "function::embed" is sent to /v1/chat/completions but the function has endpoint = "embeddings". Use the correct HTTP endpoint for the function’s declared type.
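Conceptually, the check compares the endpoint implied by the HTTP path against the function's declared endpoint. The path-to-endpoint mapping below is an assumption based on the OpenAI-style paths shown in this page's examples; the validation itself is a sketch.

```typescript
// Illustrative endpoint-mismatch check: the HTTP path a request arrives
// on must match the function's declared endpoint.
const endpointForPath: Record<string, string> = {
  "/v1/chat/completions": "chat",
  "/v1/embeddings": "embeddings",
  "/v1/images/generations": "image_generation",
  "/v1/audio/speech": "audio_speech",
  "/v1/audio/transcriptions": "audio_transcription",
};

function checkEndpoint(fnName: string, declared: string, path: string): void {
  const called = endpointForPath[path];
  if (called !== declared) {
    // Mirrors the documented error format
    throw new Error(
      `function "${fnName}": endpoint mismatch — declared as ${declared}, called from ${called}`,
    );
  }
}
```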

Variant params

When using strategy = "experiment", each variant can override request parameters. Params are flat key-value pairs at the same level as model and weight:
[functions.summarize.variants.fast]
model = "gpt-4o-mini"
weight = 50
temperature = 0.2
max_tokens = 500
See Experiment Routing for full details and examples.
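Because params sit flat alongside model and weight, applying a variant amounts to treating every key except weight as a request override. A sketch under that assumption (the merge logic and names are illustrative, not the gateway's code):

```typescript
// Illustrative variant application: weight only steers selection; the
// model and any remaining flat params override the incoming request.
type Variant = { model: string; weight: number; [param: string]: unknown };

function applyVariant(request: Record<string, unknown>, variant: Variant) {
  const { weight, ...overrides } = variant; // weight never reaches the provider
  return { ...request, ...overrides };      // model + params override the request
}
```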

Known fields per endpoint

Endpoint            Validated params
chat                temperature, max_tokens, top_p, frequency_penalty, presence_penalty, seed, stop, response_format, n
embeddings          dimensions, encoding_format
image_generation    size, quality, style, n, response_format
audio_speech        voice, speed, response_format
Unknown params are passed through to the provider with a startup warning. Cross-endpoint params (e.g., temperature on an embedding function) are rejected at startup.
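The warn-versus-reject distinction can be sketched as a lookup against the table above: a param known to the function's own endpoint is fine, a param known only to a different endpoint is rejected, and anything else passes through with a warning. The table contents mirror this page; the code itself is illustrative.

```typescript
// Sketch of the startup param check: "ok" for known params,
// "reject" for cross-endpoint params, "warn" for unknown ones.
const knownParams: Record<string, Set<string>> = {
  chat: new Set(["temperature", "max_tokens", "top_p", "frequency_penalty",
                 "presence_penalty", "seed", "stop", "response_format", "n"]),
  embeddings: new Set(["dimensions", "encoding_format"]),
  image_generation: new Set(["size", "quality", "style", "n", "response_format"]),
  audio_speech: new Set(["voice", "speed", "response_format"]),
};

function classifyParam(endpoint: string, param: string): "ok" | "warn" | "reject" {
  if (knownParams[endpoint].has(param)) return "ok";
  // Known to some other endpoint -> cross-endpoint param -> reject
  const knownElsewhere = Object.entries(knownParams)
    .some(([ep, set]) => ep !== endpoint && set.has(param));
  return knownElsewhere ? "reject" : "warn";
}
```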

Next steps

Routing Overview

Managed vs passthrough routing and the full resolution flow.

Experiment Routing

A/B test models and parameters with weighted variants.

Fallback Routing

Automatic failover between providers.

Configuration

Full TOML reference for all gateway settings.