🛡️

Guardrails Design — Stopping Models Before They Go Wrong

Building safe harnesses with input filters, output validation, and action limits

Claude Code asks "should I delete this file?" That's an execution-stage guardrail. Delete without asking and there's no recovery.

Input Guardrails

Validate inputs before they reach the model.

Prompt injection defense — separate user input so it can't overwrite system prompts. Isolating input with XML tags or delimiters is standard.

Token length limits — when input exceeds the context window, you need a truncation strategy. Naive truncation drops critical info.

PII masking — mask personal data (emails, phone numbers, SSN) before sending to the model.

Execution Guardrails

Restrict model actions when using tools.

Allowlist-based tool use — whitelist callable tools. "File reads OK, file deletes need approval."

Timeout and retry limits — prevent infinite loops. If 5 retries fail, escalate.

Sandboxed execution — run code in isolated environments. Docker containers or VMs protect the host.

Output Guardrails

Validate model responses before delivering to users.

Schema validation — verify JSON output matches expected schema. Catch missing required fields and type mismatches.

Harmful content filters — block inappropriate model outputs. Sometimes checked by a separate classifier model.

Grounding checks — verify that information the model referenced actually exists in the provided context. Prevents hallucination.

How It Works

1

Input guardrails — prompt injection defense, token length limits, PII masking

2

Execution guardrails — allowlist-based tool use, timeout, sandbox

3

Output guardrails — schema validation, harmful content filter, grounding checks

4

All three stages needed for complete guardrails — one alone leaves gaps

Use Cases

Claude Code — approval before file changes/deletes, warnings for dangerous commands Customer support chatbot — PII masking + content filter + refund amount cap CI/CD pipeline AI — production deploy blocked, autonomous only to staging