Metrics
The gateway exports histogram metrics and counters via the GenAI telemetry pipeline. All metrics are dimensioned by standard GenAI operation attributes and land in ClickHouse via the OTel Collector.

Histogram metrics
| Metric | Unit | Description |
|---|---|---|
| gen_ai.client.operation.duration | seconds | End-to-end latency for each LLM call |
| gen_ai.client.token.usage | tokens | Token distributions, dimensioned by gen_ai.token.type (input or output) |
| gen_ai.server.time_to_first_token | seconds | Streaming TTFT distribution |
| gen_ai.server.request.duration | seconds | Total upstream request duration |
| gen_ai.server.time_per_output_token | seconds | Per-token generation time |
| gen_ai.server.output_tokens_per_second | tokens/sec | Token generation throughput |
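
How the streaming histograms relate to one another can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gateway's implementation: it assumes time per output token is the request duration minus the time to first token, divided by the number of tokens after the first (the definition the OpenTelemetry GenAI semantic conventions describe), and that throughput is output tokens divided by total request duration. The gateway's exact formulas are not given here, and `StreamTiming` and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamTiming:
    """Timing of one streamed response (hypothetical helper type)."""
    request_duration_s: float     # total upstream request duration, seconds
    time_to_first_token_s: float  # delay before the first output token, seconds
    output_tokens: int            # number of output tokens generated


def time_per_output_token(t: StreamTiming) -> Optional[float]:
    """Average seconds per token after the first (assumed definition)."""
    if t.output_tokens < 2:
        return None  # undefined when no token follows the first one
    decode_time = t.request_duration_s - t.time_to_first_token_s
    return decode_time / (t.output_tokens - 1)


def output_tokens_per_second(t: StreamTiming) -> Optional[float]:
    """Output tokens over total request duration (assumed definition)."""
    if t.request_duration_s <= 0:
        return None
    return t.output_tokens / t.request_duration_s


timing = StreamTiming(request_duration_s=2.5,
                      time_to_first_token_s=0.5,
                      output_tokens=101)
print(time_per_output_token(timing))
print(output_tokens_per_second(timing))
```

With these sample numbers, 2.0 seconds of decoding are spread over 100 tokens after the first, so a long time to first token raises the request duration without raising the per-token time.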
Counter metrics
| Metric | Description |
|---|---|
| gen_ai.client.errors | Error count, dimensioned by error.type |
| gen_ai.client.guardrail_violations | Guardrail hit count, dimensioned by gen_ai.guardrail.detector and gen_ai.guardrail.direction |
| gen_ai.client.finish_reasons | Finish reason distribution, dimensioned by gen_ai.response.finish_reason |
| gateway.retry.attempts | Retry attempt count |
| gateway.retry.target_exclusions | Target exclusion count, dimensioned by gateway.retry.excluded_target |
| gateway.retry.exhaustion_resets | Exclusion reset count |
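
What "dimensioned by" means for a counter can be shown with a small in-memory stand-in. The sketch below is not the gateway's code or an OpenTelemetry API: `add`, `total`, and the `series` store are hypothetical, and the `error.type` values are made-up samples. Each distinct combination of metric name and attribute values forms its own series, and the undimensioned total is the sum across those series.

```python
from collections import Counter

# Hypothetical in-memory store: one running count per (metric, attributes) series.
series = Counter()


def add(metric, attributes=None, amount=1):
    """Increment the series identified by the metric name and its attributes."""
    key = (metric, tuple(sorted((attributes or {}).items())))
    series[key] += amount


def total(metric):
    """Sum a metric across all of its attribute combinations."""
    return sum(count for (name, _), count in series.items() if name == metric)


# Sample events; the error.type values are illustrative only.
add("gen_ai.client.errors", {"error.type": "timeout"})
add("gen_ai.client.errors", {"error.type": "timeout"})
add("gen_ai.client.errors", {"error.type": "rate_limited"})
add("gateway.retry.attempts")  # a counter with no extra dimension listed

print(total("gen_ai.client.errors"))
```

Grouping by `error.type` keeps the two timeouts separate from the single rate-limit error, while `total` collapses them back into one count.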
Metric dimensions
All histogram metrics carry these dimensions:

| Dimension | Source |
|---|---|
| gen_ai.operation.name | Operation type |
| gen_ai.provider.name | Provider identity |
| gen_ai.request.model | Requested model |
| gen_ai.response.model | Response model (when available) |
| server.address | Upstream hostname |
| server.port | Upstream port |

The gen_ai.client.token.usage metric additionally carries gen_ai.token.type (input or output).
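
The dimension set attached to a histogram data point can be pictured as a plain attribute map. The following is a minimal sketch, assuming that gen_ai.response.model is omitted when the upstream does not report it and that gen_ai.token.type is attached only to token usage points. The function name `histogram_attributes` and the sample values in the usage lines are hypothetical.

```python
from typing import Dict, Optional, Union

AttrValue = Union[str, int]


def histogram_attributes(
    operation: str,
    provider: str,
    request_model: str,
    address: str,
    port: int,
    response_model: Optional[str] = None,
    token_type: Optional[str] = None,
) -> Dict[str, AttrValue]:
    """Build the dimension map for one histogram data point (hypothetical helper)."""
    attrs: Dict[str, AttrValue] = {
        "gen_ai.operation.name": operation,
        "gen_ai.provider.name": provider,
        "gen_ai.request.model": request_model,
        "server.address": address,
        "server.port": port,
    }
    # The response model is only present when the upstream reports it.
    if response_model is not None:
        attrs["gen_ai.response.model"] = response_model
    # Assumed: only token usage points carry the token type.
    if token_type is not None:
        if token_type not in ("input", "output"):
            raise ValueError("token_type must be 'input' or 'output'")
        attrs["gen_ai.token.type"] = token_type
    return attrs


# Illustrative values only; none are taken from a real deployment.
latency_attrs = histogram_attributes("chat", "example-provider", "example-model",
                                     "llm.example.com", 443)
usage_attrs = histogram_attributes("chat", "example-provider", "example-model",
                                   "llm.example.com", 443, token_type="input")
print(sorted(usage_attrs))
```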