Guardrails

Guardrails keep your agent safe and on-task.

Why Guardrails

Agents can:

- Go off-topic
- Expose sensitive data
- Take risky actions
- Fall for manipulation

Guardrails catch these failures before they cause damage.

Built-In Protection

from pauhu import Agent, Guardrail

agent = Agent(
    name="Support",
    instructions="Help customers with orders.",
    input_guardrails=[
        Guardrail.relevance(),      # Stay on topic
        Guardrail.safety(),         # Block attacks
        Guardrail.pii_filter(),     # Protect data
    ],
)
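
The page does not show how `Guardrail.pii_filter()` works internally. As a rough illustration of the idea only, the sketch below redacts two kinds of sensitive data with regular expressions. The function name `redact_pii` and both patterns are assumptions made for this example, not part of pauhu, and real PII detection needs much broader coverage.

```python
import re

# Hypothetical patterns for illustration; not pauhu's implementation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and card-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = CARD_RE.sub("[CARD]", text)
    return text


if __name__ == "__main__":
    print(redact_pii("mail jane@example.com about card 4111 1111 1111 1111"))
```

Short numbers such as order IDs are left alone because the card pattern requires at least 13 digits.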

Custom Guardrails

from pauhu import Agent, GuardrailOutput, input_guardrail

@input_guardrail
async def block_competitors(ctx, agent, input):
    """Don't discuss competitor products."""
    competitors = ["acme", "globex", "initech"]
    is_blocked = any(c in input.lower() for c in competitors)
    return GuardrailOutput(tripwire_triggered=is_blocked)

agent = Agent(
    input_guardrails=[block_competitors],
)
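
One caveat with the substring test in `block_competitors`: `c in input.lower()` also fires on words that merely contain a competitor name. If whole-word matching is what you want, a variant along these lines may help. The helper `mentions_competitor` is a hypothetical name for this sketch; it uses only the standard library and could be called from inside the guardrail.

```python
import re

COMPETITORS = ["acme", "globex", "initech"]

# One case-insensitive pattern that matches any name only as a whole word.
_COMPETITOR_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in COMPETITORS) + r")\b",
    re.IGNORECASE,
)


def mentions_competitor(text: str) -> bool:
    """Return True if the text names a competitor as a standalone word."""
    return _COMPETITOR_RE.search(text) is not None


if __name__ == "__main__":
    print(mentions_competitor("Is ACME cheaper?"))
    print(mentions_competitor("I installed Acmeware yesterday"))
```

Word boundaries still match possessives such as "Acme's", while unrelated words that happen to embed a name no longer trip the check.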

Guardrail Types

| Type | Purpose |
| --- | --- |
| Relevance | Keep responses on-topic |
| Safety | Block jailbreaks and attacks |
| PII Filter | Prevent data exposure |
| Moderation | Block harmful content |
| Tool Safeguards | Approve risky actions |

Human Escalation

Some situations need a human:

@input_guardrail
async def escalation_check(ctx, agent, input):
    triggers = ["speak to human", "manager", "complaint"]
    needs_human = any(t in input.lower() for t in triggers)
    if needs_human:
        return GuardrailOutput(
            tripwire_triggered=True,
            message="Transferring to human support..."
        )
    return GuardrailOutput(tripwire_triggered=False)
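
The page does not show what happens once a tripwire fires. One plausible flow is to run the checks in order, stop at the first one that trips, and return its message instead of calling the agent. The sketch below models that flow with stand-in types so it runs without pauhu: `StubOutput`, `run_checks`, and `handle` are assumptions made for this example, not pauhu APIs.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional, Sequence


@dataclass
class StubOutput:
    """Stand-in for GuardrailOutput; not pauhu's class."""
    tripwire_triggered: bool
    message: Optional[str] = None


Check = Callable[[str], Awaitable[StubOutput]]


async def escalation_check(text: str) -> StubOutput:
    # Same trigger phrases as the guardrail above.
    triggers = ["speak to human", "manager", "complaint"]
    if any(t in text.lower() for t in triggers):
        return StubOutput(True, "Transferring to human support...")
    return StubOutput(False)


async def run_checks(text: str, checks: Sequence[Check]) -> Optional[StubOutput]:
    """Return the first tripped result, or None if every check passes."""
    for check in checks:
        result = await check(text)
        if result.tripwire_triggered:
            return result
    return None


async def handle(text: str) -> str:
    tripped = await run_checks(text, [escalation_check])
    if tripped is not None:
        return tripped.message or "Request blocked."
    return "agent reply goes here"  # placeholder for the real agent call


if __name__ == "__main__":
    print(asyncio.run(handle("I want to speak to human now")))
```

Stopping at the first tripped check keeps the hand-off message unambiguous when several guardrails are configured.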