Content Capture

Content capture records the full text of prompts and completions as span attributes. This is useful for debugging, evaluation, and quality monitoring: you can search for specific prompt patterns, compare model outputs across variants, and build evaluation datasets from production traffic.

Content capture is enabled by default. Prompt and completion text is recorded verbatim on GenAI spans and stored in your configured telemetry backend.

Disable content capture

If you need to prevent prompt and completion text from being stored (e.g. for PII compliance), disable content capture in the gateway configuration or via an environment variable.

[genai_telemetry]
capture_content = false  # default: true

Or via environment variable:

GATEWAY_GENAI_CAPTURE_CONTENT=false

When enabled, content capture records the full prompt and completion text, including any sensitive data such as PII, API keys, or confidential instructions. Because capture is on by default, leave it enabled only in development or trusted environments where you have appropriate data handling controls in place, and disable it everywhere else.
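A deployment check can confirm that capture is actually off before traffic reaches a sensitive environment. The following is a minimal sketch, not the gateway's own logic: the function name `capture_enabled` is hypothetical, the precedence (environment variable over config file) is an assumption, and so is the set of strings treated as false. The `config` argument stands for the already-parsed configuration file.

```python
import os
from typing import Mapping, Optional

# Assumption: these spellings count as "false"; the gateway's parser may differ.
_FALSE_VALUES = {"false", "0", "no", "off"}


def capture_enabled(
    config: Mapping[str, Mapping[str, object]],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return whether content capture would be on for this deployment.

    Assumption: GATEWAY_GENAI_CAPTURE_CONTENT, when set, overrides the
    [genai_telemetry] capture_content value from the config file.
    """
    env = os.environ if env is None else env
    raw = env.get("GATEWAY_GENAI_CAPTURE_CONTENT")
    if raw is not None:
        return raw.strip().lower() not in _FALSE_VALUES
    section = config.get("genai_telemetry", {})
    # Documented default: capture_content = true
    return bool(section.get("capture_content", True))
```

A pre-deploy script could call `capture_enabled(parsed_config)` and fail the rollout when it returns `True` for a production target.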

What’s captured

When content capture is active, the following attributes are added to each GenAI span:

Attribute	Description
`gen_ai.input.messages`	Serialised prompt messages from the request
`gen_ai.output.messages`	Serialised completion choices or accumulated stream content
`gen_ai.system_instructions`	System/developer prompt text, captured separately for easy filtering

The gen_ai.system_instructions attribute is recorded as a standalone string rather than embedded within gen_ai.input.messages. This separation makes it straightforward to query system prompts independently — for example, to audit which system prompts are in use across your fleet without parsing the full message array.
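The separation described above can be illustrated with a small sketch. It is not the gateway's implementation: the helper name `content_attributes` is hypothetical, and it assumes messages use the common `{"role": ..., "content": ...}` shape, that the `system` and `developer` roles are the ones treated as system instructions, and that multiple instructions are joined with newlines.

```python
import json
from typing import Any

# Assumption: these roles are what the gateway treats as system instructions.
SYSTEM_ROLES = {"system", "developer"}


def content_attributes(messages: list[dict[str, Any]]) -> dict[str, str]:
    """Build content capture attributes from a chat request's messages."""
    system = [m for m in messages if m.get("role") in SYSTEM_ROLES]
    rest = [m for m in messages if m.get("role") not in SYSTEM_ROLES]

    # Non-system messages are serialised together as one JSON array.
    attrs = {"gen_ai.input.messages": json.dumps(rest)}
    if system:
        # System text is stored as a plain string, outside the message array.
        attrs["gen_ai.system_instructions"] = "\n".join(
            str(m.get("content", "")) for m in system
        )
    return attrs
```

With this layout, a query that only needs the system prompt reads one string attribute and never has to parse the JSON in `gen_ai.input.messages`.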

Media endpoint content capture

Content capture behaviour varies by endpoint because some endpoints handle binary data that cannot be meaningfully recorded as span attributes.

Endpoint	Input captured	Output captured
Chat completions	Full messages array	Full choices array (or streamed content)
Embeddings	Input text	— (vector data not captured)
Image generation	`prompt` field	`revised_prompt` from DALL-E 3
Audio speech (TTS)	`input` text	— (binary audio)
Audio transcription	— (binary audio)	Transcription `text` field

For chat completions, both streaming and non-streaming responses are captured. Streaming responses are accumulated from individual deltas into the final content before recording. Embeddings capture only the input text; the output vector data is omitted as it is not human-readable and would be prohibitively large.
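The per-endpoint rules in the table can be read as a simple dispatch. The sketch below is illustrative only: the function name, the endpoint labels, and the assumption that image responses carry `revised_prompt` inside a `data` list (as in OpenAI-style responses) are not taken from the gateway itself.

```python
from typing import Any, Optional


def captured_content(
    endpoint: str, request: dict[str, Any], response: dict[str, Any]
) -> tuple[Optional[Any], Optional[Any]]:
    """Return (input, output) content to record; None means "not captured".

    Endpoint labels are hypothetical names for the rows of the table above.
    """
    if endpoint == "chat_completions":
        return request.get("messages"), response.get("choices")
    if endpoint == "embeddings":
        return request.get("input"), None  # vector output is not captured
    if endpoint == "image_generation":
        # Assumption: revised prompts live under response["data"][i].
        revised = [
            item["revised_prompt"]
            for item in response.get("data", [])
            if item.get("revised_prompt")
        ]
        return request.get("prompt"), revised or None
    if endpoint == "audio_speech":
        return request.get("input"), None  # output is binary audio
    if endpoint == "audio_transcription":
        return None, response.get("text")  # input is binary audio
    return None, None
```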

Size limits

Streaming responses are accumulated in memory before being recorded as span attributes. To prevent unbounded memory growth, the gateway enforces size limits on accumulated content.

Path	Limit
Streaming accumulated content	64 KB (`MAX_ACCUMULATED_CONTENT_BYTES`)
Streaming tool call arguments	64 KB per tool (`MAX_TOOL_CALL_ARGUMENT_BYTES`)
Non-streaming response	Full response body captured (no limit)

When the streaming accumulation limit is reached, content is truncated. The span still records token counts, finish reasons, and all other attributes normally. Only the content capture attributes are affected by this limit.
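A bounded accumulator of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the gateway's code: the class name `ContentAccumulator` is hypothetical, the limit is assumed to be 64 × 1024 UTF-8 bytes, and truncation is assumed to keep the prefix and drop everything after the limit.

```python
# Assumption: "64 KB" means 64 * 1024 bytes of UTF-8 encoded text.
MAX_ACCUMULATED_CONTENT_BYTES = 64 * 1024


class ContentAccumulator:
    """Accumulate streamed text deltas up to a byte limit."""

    def __init__(self, limit: int = MAX_ACCUMULATED_CONTENT_BYTES) -> None:
        self._limit = limit
        self._buf = bytearray()
        self.truncated = False

    def add(self, delta: str) -> None:
        """Append a delta, cutting it off once the limit is reached."""
        if self.truncated:
            return  # later deltas are dropped, so memory stays bounded
        data = delta.encode("utf-8")
        room = self._limit - len(self._buf)
        if len(data) > room:
            data = data[:room]
            self.truncated = True
        self._buf.extend(data)

    def text(self) -> str:
        # errors="ignore" drops a multi-byte character split by truncation.
        return self._buf.decode("utf-8", errors="ignore")
```

Only the accumulated text is affected by the limit in this sketch, which mirrors the behaviour described above: token counts and finish reasons are tracked separately and are recorded regardless.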

Tool call content

When content capture is enabled, tool call child spans include additional attributes that record the function invocation details:

gen_ai.tool.call.arguments — the function arguments as JSON
gen_ai.input.messages — an object with the tool name and call ID

On the parent span, gen_ai.tool.definitions contains the JSON schema of all declared tools sent in the request. Because the definitions apply to the entire request, the attribute is recorded once on the parent span rather than on individual tool call child spans, regardless of how many tool calls appear in the response. When a request produces multiple tool calls, each child span carries its own gen_ai.tool.call.arguments value.
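The split between parent and child attributes can be sketched as follows. The function name `tool_span_attributes` is hypothetical, the flattened tool call shape (`id`, `name`, `arguments` as a JSON string) is an assumption, and so are the key names `name` and `id` inside the child span's `gen_ai.input.messages` object.

```python
import json
from typing import Any


def tool_span_attributes(
    tools: list[dict[str, Any]], tool_calls: list[dict[str, str]]
) -> tuple[dict[str, str], list[dict[str, str]]]:
    """Return (parent span attributes, one attribute dict per child span)."""
    # Tool definitions apply to the whole request: recorded once on the parent.
    parent = {"gen_ai.tool.definitions": json.dumps(tools)}

    children = []
    for call in tool_calls:
        children.append(
            {
                # Assumption: arguments arrive as an already-serialised JSON string.
                "gen_ai.tool.call.arguments": call["arguments"],
                # Assumption: the object uses "name" and "id" as its keys.
                "gen_ai.input.messages": json.dumps(
                    {"name": call["name"], "id": call["id"]}
                ),
            }
        )
    return parent, children
```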

Querying captured content

If your telemetry backend is ClickHouse, you can run SQL queries against the captured content to search for specific patterns, debug unexpected outputs, or build evaluation datasets.

-- Find requests with specific content
SELECT
    Timestamp,
    SpanName,
    SpanAttributes['gen_ai.system_instructions'] AS system_prompt,
    SpanAttributes['gen_ai.input.messages'] AS input,
    SpanAttributes['gen_ai.output.messages'] AS output
FROM otel_traces
WHERE SpanAttributes['gen_ai.input.messages'] LIKE '%search%'
  AND Timestamp > now() - INTERVAL 1 HOUR
ORDER BY Timestamp DESC
LIMIT 20;

Because gen_ai.system_instructions is a separate attribute, you can efficiently filter by system prompt without parsing the full message JSON:

-- Audit system prompts across all requests
SELECT
    SpanAttributes['gen_ai.system_instructions'] AS system_prompt,
    SpanAttributes['gen_ai.request.model'] AS model,
    count() AS request_count
FROM otel_traces
WHERE SpanAttributes['gen_ai.system_instructions'] != ''
  AND Timestamp > now() - INTERVAL 24 HOUR
GROUP BY system_prompt, model
ORDER BY request_count DESC;
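Rows returned by the first query in this section can be converted into an evaluation dataset with a few lines of post-processing. This is a hedged sketch: the function name `to_jsonl` and the row layout `(system_prompt, input_json, output_json)` are assumptions, and fetching the rows from ClickHouse is left to whichever client you already use.

```python
import json
from typing import Iterable, Iterator


def to_jsonl(rows: Iterable[tuple[str, str, str]]) -> Iterator[str]:
    """Yield one JSON line per (system_prompt, input_json, output_json) row."""
    for system_prompt, input_json, output_json in rows:
        try:
            record = {
                "system": system_prompt,
                "input": json.loads(input_json),
                "output": json.loads(output_json),
            }
        except json.JSONDecodeError:
            # Skip rows whose captured content does not parse as JSON.
            continue
        yield json.dumps(record, ensure_ascii=False)
```

Writing the yielded lines to a file gives one record per request, with the system prompt kept as its own field in the same way the span attributes keep it separate.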

Related resources

See Span Attributes for the complete attribute reference, including all content capture attributes and their types. See the configuration reference for all [genai_telemetry] fields.
