to11 derives metrics from the GenAI spans it captures for each LLM call: latency, time-to-first-token, token usage, error rate, and finish reasons. These follow the OpenTelemetry GenAI semantic conventions. The names below are the conventional metric names; they also appear if you export your own telemetry alongside what to11 records. You view aggregate metrics in the to11 dashboard; dashboards over these signals are coming soon. There is no database to query directly. See Observe.

Histogram metrics

| Metric | Unit | Description |
| --- | --- | --- |
| gen_ai.client.operation.duration | seconds | End-to-end latency of each LLM call |
| gen_ai.client.token.usage | tokens | Token distribution, dimensioned by gen_ai.token.type (input or output) |
| gen_ai.server.time_to_first_token | seconds | Time to first token. For streaming, the time to the first content delta; for non-streaming, equal to the total request duration |
| gen_ai.server.request.duration | seconds | Upstream request duration |
| gen_ai.server.time_per_output_token | seconds | Generation time per output token |
| gen_ai.client.output_tokens_per_second | tokens/second | Output token throughput |

These are aggregate metric instruments. The same signals are also recorded as per-request span attributes, and a few carry a different name on spans than as a metric. For example, the throughput metric gen_ai.client.output_tokens_per_second is recorded on individual spans as gen_ai.server.output_tokens_per_second. The metric and the span attribute are distinct artifacts.
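
To make the relationship between the latency histograms concrete, here is a minimal sketch of how such values could be derived from one call's timestamps. It is not to11's implementation: the `CallTiming` record and `derive_latency_metrics` function are hypothetical names, and the formulas for time per output token and throughput are assumptions. Only the time-to-first-token rule (equal to the total duration for non-streaming calls) comes from the table above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CallTiming:
    """Hypothetical per-call timing record; field names are illustrative only."""
    start_s: float                  # when the request was sent
    end_s: float                    # when the full response was received
    first_token_s: Optional[float]  # first content delta; None for non-streaming
    output_tokens: int              # number of output tokens in the response


def derive_latency_metrics(t: CallTiming) -> Dict[str, float]:
    duration = t.end_s - t.start_s
    # From the table: for non-streaming calls, TTFT equals the total duration.
    ttft = duration if t.first_token_s is None else t.first_token_s - t.start_s
    metrics = {
        "gen_ai.client.operation.duration": duration,
        "gen_ai.server.time_to_first_token": ttft,
    }
    # Assumption: generation time after the first token, spread over the
    # remaining tokens. Skipped when there is nothing to divide.
    if t.output_tokens > 1 and duration > ttft:
        metrics["gen_ai.server.time_per_output_token"] = (
            (duration - ttft) / (t.output_tokens - 1)
        )
    # Assumption: throughput is measured over the whole call.
    if duration > 0:
        metrics["gen_ai.client.output_tokens_per_second"] = (
            t.output_tokens / duration
        )
    return metrics


if __name__ == "__main__":
    streaming = CallTiming(start_s=0.0, end_s=2.5, first_token_s=0.5, output_tokens=101)
    for name, value in derive_latency_metrics(streaming).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions a non-streaming call produces no time-per-output-token sample, because all of its duration is attributed to the first token.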

Counter metrics

| Metric | Description |
| --- | --- |
| gen_ai.client.errors | Error count, dimensioned by error.type |
| gen_ai.client.finish_reasons | Finish reason distribution, dimensioned by gen_ai.response.finish_reason |
| gen_ai.client.guardrail_violations | Guardrail hit count, dimensioned by gen_ai.guardrail.detector and gen_ai.guardrail.direction |
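
A "dimensioned" counter keeps one running total per distinct combination of dimension values. The sketch below illustrates that idea in plain Python. `CounterRegistry` is a hypothetical helper, not part of to11 or OpenTelemetry, and the dimension values shown (`timeout`, `rate_limit`, `pii`, `input`) are made up for illustration.

```python
from collections import Counter
from typing import Dict, Tuple

# A series is identified by the metric name plus its sorted dimension pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class CounterRegistry:
    """Hypothetical in-memory stand-in for dimensioned counter metrics."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    @staticmethod
    def _key(metric: str, dimensions: Dict[str, str]) -> SeriesKey:
        # Sorting makes the key independent of dict insertion order.
        return (metric, tuple(sorted(dimensions.items())))

    def add(self, metric: str, dimensions: Dict[str, str], amount: int = 1) -> None:
        self._counts[self._key(metric, dimensions)] += amount

    def value(self, metric: str, dimensions: Dict[str, str]) -> int:
        return self._counts[self._key(metric, dimensions)]

    def total(self, metric: str) -> int:
        # Sum across every dimension combination for this metric.
        return sum(n for (name, _), n in self._counts.items() if name == metric)


if __name__ == "__main__":
    registry = CounterRegistry()
    registry.add("gen_ai.client.errors", {"error.type": "timeout"})
    registry.add("gen_ai.client.errors", {"error.type": "timeout"})
    registry.add("gen_ai.client.errors", {"error.type": "rate_limit"})
    registry.add(
        "gen_ai.client.guardrail_violations",
        {"gen_ai.guardrail.detector": "pii", "gen_ai.guardrail.direction": "input"},
    )
    print(registry.value("gen_ai.client.errors", {"error.type": "timeout"}))
    print(registry.total("gen_ai.client.errors"))
```

Keying on the sorted dimension pairs means two increments with the same values always land in the same series, regardless of the order in which the dimensions were supplied.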

Dimensions

Metrics carry these dimensions, letting you break signals down by operation, provider, and model:

| Dimension | Meaning |
| --- | --- |
| gen_ai.operation.name | Operation type |
| gen_ai.provider.name | Provider identity |
| gen_ai.request.model | Requested model |
| gen_ai.response.model | Response model, when available |

The token usage metric additionally carries gen_ai.token.type (input or output).
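
Breaking a signal down by dimension amounts to grouping data points on the chosen dimension keys and aggregating within each group. The following sketch shows this for token usage, assuming you hold exported data points as plain dictionaries. The point layout, the `sum_by` helper, and the model names are hypothetical and do not reflect a to11 export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sum_by(points: Iterable[dict], metric: str, group_keys: List[str]) -> Dict[Tuple, float]:
    """Sum the values of one metric, grouped by the given dimension keys."""
    totals: Dict[Tuple, float] = defaultdict(float)
    for point in points:
        if point["metric"] != metric:
            continue
        # A dimension missing from a point (e.g. gen_ai.response.model)
        # groups under None rather than raising.
        key = tuple(point["dimensions"].get(k) for k in group_keys)
        totals[key] += point["value"]
    return dict(totals)


if __name__ == "__main__":
    # Illustrative data points; model names are placeholders.
    points = [
        {"metric": "gen_ai.client.token.usage", "value": 120,
         "dimensions": {"gen_ai.request.model": "model-a", "gen_ai.token.type": "input"}},
        {"metric": "gen_ai.client.token.usage", "value": 40,
         "dimensions": {"gen_ai.request.model": "model-a", "gen_ai.token.type": "output"}},
        {"metric": "gen_ai.client.token.usage", "value": 80,
         "dimensions": {"gen_ai.request.model": "model-a", "gen_ai.token.type": "input"}},
        {"metric": "gen_ai.client.token.usage", "value": 50,
         "dimensions": {"gen_ai.request.model": "model-b", "gen_ai.token.type": "input"}},
    ]
    by_model_and_type = sum_by(
        points,
        "gen_ai.client.token.usage",
        ["gen_ai.request.model", "gen_ai.token.type"],
    )
    for key, total in sorted(by_model_and_type.items()):
        print(key, total)
```

Passing a shorter key list, such as only `gen_ai.token.type`, collapses the breakdown to input versus output totals across all models.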

Span attributes

The attributes recorded on each span.

Observe

Where you view your data.