Security Pipeline
The gateway runs an inline security pipeline on every request before it reaches the upstream LLM provider. Detectors execute synchronously in the request path with fail-fast ordering: the first detector to return Block short-circuits the pipeline, and the gateway returns 400 Bad Request.
Pipeline stages
The security pipeline runs three detector stages in order of increasing cost:

| Stage | Detector | Latency | Description |
|---|---|---|---|
| 1 | Blocklist | ~nanoseconds | Case-insensitive substring matching against a configurable word list |
| 2 | PII Detection | ~500ns | Regex patterns for SSN, credit card numbers, phone numbers, IP addresses, email addresses |
| 3 | ONNX ML Detectors | ~milliseconds | Prompt injection detection and content moderation (Phase 2) |
How it works
Each detector returns a DetectorResult with an action (Pass, Flag, Redact, or Block), a confidence score, and an optional detail message. When a detector blocks a request, the response identifies which detector triggered and why. The request is never forwarded upstream.
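The fail-fast behaviour described above can be sketched roughly as follows. Only DetectorResult and the four action names come from the text; the field names, the `Detector` trait, the `Verdict` type, and `run_pipeline` are hypothetical names chosen for illustration, and the handling of Flag and Redact is left out because the text does not specify it.

```rust
/// Actions a detector can request, as listed in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Pass,
    Flag,
    Redact,
    Block,
}

/// Result of one detector run. Field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct DetectorResult {
    pub action: Action,
    pub confidence: f32,
    pub detail: Option<String>,
}

/// Hypothetical detector interface: a name plus a synchronous check.
pub trait Detector {
    fn name(&self) -> &str;
    fn check(&self, messages: &[&str]) -> DetectorResult;
}

/// Hypothetical pipeline outcome: forward upstream, or reject with the
/// name of the detector that blocked and its detail message.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Forward,
    Blocked { detector: String, detail: Option<String> },
}

/// Runs detectors in the given order (cheapest first) and stops at the
/// first Block, so later and more expensive detectors never execute.
pub fn run_pipeline(detectors: &[Box<dyn Detector>], messages: &[&str]) -> Verdict {
    for detector in detectors {
        let result = detector.check(messages);
        if result.action == Action::Block {
            return Verdict::Blocked {
                detector: detector.name().to_string(),
                detail: result.detail,
            };
        }
    }
    Verdict::Forward
}
```

In this sketch a caller would map `Verdict::Blocked` to the 400 Bad Request response and only contact the upstream provider on `Verdict::Forward`.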
Configuration
Enable the security pipeline in gateway.toml:
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable the inline security pipeline |
| blocklist | array | [] | Strings for case-insensitive substring matching |

When enabled = true, PII detection is always active. The blocklist is only checked if the blocklist array is non-empty.
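As an illustration of how these two fields could drive the pipeline, here is a minimal sketch. `SecurityConfig`, `Stage`, `active_stages`, and `blocklist_hit` are hypothetical names, not the gateway's actual types. The ONNX stage is omitted because it is not yet active, and skipping empty blocklist entries is an assumption of this sketch (an empty string would otherwise match every message).

```rust
/// Mirrors the two documented fields; `Default` yields the documented
/// defaults (enabled = false, blocklist = []).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SecurityConfig {
    pub enabled: bool,
    pub blocklist: Vec<String>,
}

/// Stages that this sketch can switch on.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stage {
    Blocklist,
    Pii,
}

/// Applies the documented rule: nothing runs when disabled, PII always
/// runs when enabled, and the blocklist runs only if it has entries.
pub fn active_stages(cfg: &SecurityConfig) -> Vec<Stage> {
    let mut stages = Vec::new();
    if !cfg.enabled {
        return stages;
    }
    if !cfg.blocklist.is_empty() {
        stages.push(Stage::Blocklist);
    }
    stages.push(Stage::Pii);
    stages
}

/// Case-insensitive substring match. Returns the first blocklist entry
/// found in `message`, if any. Empty entries are ignored (assumption).
pub fn blocklist_hit<'a>(blocklist: &'a [String], message: &str) -> Option<&'a str> {
    let haystack = message.to_lowercase();
    blocklist
        .iter()
        .map(String::as_str)
        .find(|entry| !entry.is_empty() && haystack.contains(&entry.to_lowercase()))
}
```

Lowercasing both sides is the simplest way to get case-insensitive matching with the standard library; a production implementation might pre-lowercase the blocklist once at load time instead of on every request.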
Endpoint coverage
Not all endpoints have guardrails: binary inputs (audio files) and metadata-only endpoints (models, files) are not scanned.

| Endpoint | Input guardrails | Output guardrails | Notes |
|---|---|---|---|
| Chat completions / messages / responses | Yes | Yes | Full pipeline |
| Embeddings | Yes | No | Input text scanned; output is float vectors |
| Image generation | Yes | No | Prompt text scanned |
| Audio transcription | No | No | Binary audio input |
| Audio speech (TTS) | Yes | No | Input text scanned |
| Files | No | No | Binary/metadata only |
| Count tokens | No | No | No generation, no content risk |
| Models | No | No | Synthetic metadata from config |
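The coverage table can be read as a simple lookup from endpoint to guardrail flags. The following is only a sketch of that mapping; the `Endpoint` and `Coverage` types and the `coverage` function are hypothetical names, not part of the gateway's API.

```rust
/// Endpoint families from the coverage table. `Chat` stands for chat
/// completions, messages, and responses.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Endpoint {
    Chat,
    Embeddings,
    ImageGeneration,
    AudioTranscription,
    AudioSpeech,
    Files,
    CountTokens,
    Models,
}

/// Which guardrail directions apply to an endpoint.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Coverage {
    pub input: bool,
    pub output: bool,
}

/// Encodes the table row by row.
pub fn coverage(endpoint: Endpoint) -> Coverage {
    use Endpoint::*;
    match endpoint {
        // Full pipeline on both request and response.
        Chat => Coverage { input: true, output: true },
        // Text input is scanned; output is not text to inspect.
        Embeddings | ImageGeneration | AudioSpeech => Coverage { input: true, output: false },
        // Binary input, metadata only, or no generation.
        AudioTranscription | Files | CountTokens | Models => {
            Coverage { input: false, output: false }
        }
    }
}
```

An exhaustive `match` has the side benefit that adding a new endpoint variant forces a decision about its guardrail coverage at compile time.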
Output guardrails and streaming
Output guardrail behaviour differs between streaming and non-streaming requests:

| Mode | Behaviour |
|---|---|
| Non-streaming | Full response inspected before delivery. Violations return 400 Bad Request. |
| Streaming | Chunks are sent as they arrive. Response text is accumulated in a bounded buffer (64 KB cap). After the stream completes, accumulated text is checked — violations are logged for monitoring/alerting rather than blocking, since chunks have already been delivered. |
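The bounded accumulation used in streaming mode might look something like the sketch below. `OutputAccumulator` and `MAX_BUFFERED_BYTES` are hypothetical names, and this sketch makes two assumptions: that "64 KB" means 64 * 1024 bytes, and that text beyond the cap is simply dropped from the buffer (the chunks themselves are still delivered to the client).

```rust
/// Assumed interpretation of the documented 64 KB cap.
pub const MAX_BUFFERED_BYTES: usize = 64 * 1024;

/// Collects streamed response text for a post-stream check without
/// letting the buffer grow past the cap.
#[derive(Debug, Default)]
pub struct OutputAccumulator {
    text: String,
    truncated: bool,
}

impl OutputAccumulator {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records a chunk that has already been forwarded to the client.
    /// Once the cap is reached, later chunks are ignored.
    pub fn push(&mut self, chunk: &str) {
        if self.truncated {
            return;
        }
        let remaining = MAX_BUFFERED_BYTES - self.text.len();
        if chunk.len() <= remaining {
            self.text.push_str(chunk);
            return;
        }
        // Back up to a UTF-8 character boundary so the slice is valid.
        let mut cut = remaining;
        while !chunk.is_char_boundary(cut) {
            cut -= 1;
        }
        self.text.push_str(&chunk[..cut]);
        self.truncated = true;
    }

    /// Text to inspect after the stream completes.
    pub fn text(&self) -> &str {
        &self.text
    }

    /// True if some response text was not retained for inspection.
    pub fn truncated(&self) -> bool {
        self.truncated
    }
}
```

After the stream ends, the detectors would run over `text()` and any violation would be logged rather than returned as an error. Exposing `truncated()` lets the log entry note that the check covered only the first part of the response.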
Blocklist detector
The blocklist detector performs case-insensitive substring matching against every message in the request. If any message contains a blocklist entry, the request is blocked immediately.
PII detector
The PII detector uses compiled regex patterns to identify sensitive data in request messages:

| Pattern | Examples |
|---|---|
| Social Security Numbers | 123-45-6789 |
| Credit card numbers | 4532-0151-1283-0366 |
| Phone numbers | 555-867-5309 |
| IP addresses | 192.168.1.100 |
| Email addresses | user@example.com |
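To make the pattern matching concrete, here is a dependency-free sketch for the Social Security Number shape only. It hand-rolls the scan rather than using compiled regex patterns as the gateway does, so that the example needs nothing beyond the standard library. `find_ssn` and `ssn_at` are hypothetical names, and the rule that a match must not be surrounded by further digits is an assumption of this sketch.

```rust
/// Length in bytes of the shape ddd-dd-dddd.
const SSN_LEN: usize = 11;

/// True if `bytes[i..]` starts with three digits, a dash, two digits,
/// a dash, and four digits.
fn ssn_at(bytes: &[u8], i: usize) -> bool {
    const SHAPE: &[u8] = b"ddd-dd-dddd";
    if i + SHAPE.len() > bytes.len() {
        return false;
    }
    SHAPE.iter().zip(&bytes[i..]).all(|(&shape, &byte)| match shape {
        b'd' => byte.is_ascii_digit(),
        _ => byte == shape,
    })
}

/// Returns the first SSN-shaped substring that is not part of a longer
/// run of digits, or None if the text contains no such substring.
pub fn find_ssn(text: &str) -> Option<&str> {
    let bytes = text.as_bytes();
    (0..bytes.len())
        .find(|&i| {
            ssn_at(bytes, i)
                && (i == 0 || !bytes[i - 1].is_ascii_digit())
                && bytes.get(i + SSN_LEN).map_or(true, |b| !b.is_ascii_digit())
        })
        // Every matched byte is ASCII, so both slice ends fall on
        // character boundaries.
        .map(|i| &text[i..i + SSN_LEN])
}
```

The other patterns in the table (credit cards, phone numbers, IP addresses, email addresses) would follow the same idea with their own shapes; a regex engine makes those far more practical to express than hand-written scanners.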
ONNX ML detectors (Phase 2)
The ONNX-based detectors for prompt injection and content moderation are architecturally integrated into the pipeline but currently return Pass on all requests. When activated in a future release, they will run transformer models via the ONNX Runtime with the same fail-fast semantics.
The ort (ONNX Runtime) dependency is isolated in the gateway-security crate so that changes to other gateway crates never trigger ONNX recompilation.