Description
I'd like to contribute an "Action Guardrails" example that complements the existing input/output guardrails by enforcing authorization immediately before any effectful tool/API call (refunds, trades, data exports). The check runs after the agent chooses an action but before any side effect occurs. It verifies the agent's identity (passport) against policy, fails closed by default, and emits an auditable, immutable, and verifiable decision ID.
This addresses a distinct concern from input/output guardrails:
- Input/output guardrails: Protect against malicious/unsafe data
- Action guardrails: Enforce business policies and identity on actions
The example demonstrates a generic, framework-agnostic pattern. It uses APort as the implementation example, but developers can adapt it to use their own authorization service or policy enforcement mechanism.
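A minimal sketch of the pattern in Python follows. The `Verifier` interface, the `Decision` shape, and `guarded_call` are hypothetical names chosen for illustration, not APort's actual API. The wrapper calls the injected verifier first, treats any verifier failure as a denial (fail closed), and runs the tool only on an explicit allow.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Decision:
    """Hypothetical decision shape; a real service may return more fields."""
    allow: bool
    decision_id: str
    reasons: tuple[str, ...] = ()


class Verifier(Protocol):
    """Any authorization backend that can answer allow/deny for an action."""

    def verify(self, passport_id: str, policy: str, context: dict[str, Any]) -> Decision: ...


class ActionDenied(Exception):
    """Raised instead of running the tool; carries the decision for logging."""

    def __init__(self, decision: Decision) -> None:
        super().__init__("; ".join(decision.reasons) or "denied")
        self.decision = decision


def guarded_call(
    verifier: Verifier,
    passport_id: str,
    policy: str,
    context: dict[str, Any],
    tool: Callable[[dict[str, Any]], T],
) -> tuple[T, Decision]:
    """Authorize first, then run the effectful tool only on an explicit allow."""
    try:
        decision = verifier.verify(passport_id, policy, context)
    except Exception as exc:
        # Fail closed: an unreachable or broken verifier counts as a denial.
        decision = Decision(False, "unavailable", (f"verifier error: {exc}",))
    if not decision.allow:
        raise ActionDenied(decision)
    return tool(context), decision
```

Because the verifier is injected, the same wrapper works with a remote authorization service, an in-process policy engine, or a test stub.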
How APort Action Guardrails Fit Into the Flow
sequenceDiagram
participant User
participant InputGuardrail as Input Guardrails<br/>(Sanitize Prompt)
participant Agent
participant ActionGuardrail as APort Action Guardrails<br/>(Pre-Action Authorization)
participant Tool as Effectful Tool/API<br/>(Refund, Trade, Export)
participant OutputGuardrail as Output Guardrails<br/>(Validate Response)
User->>InputGuardrail: Raw prompt
InputGuardrail->>InputGuardrail: Sanitize & validate
InputGuardrail->>Agent: Clean prompt
Agent->>Agent: Process & decide action
Note over Agent: Agent chooses to:<br/>execute refund/trade/export
Agent->>ActionGuardrail: Verify(passport, policy, context)
ActionGuardrail->>ActionGuardrail: Check agent identity<br/>Enforce policy limits<br/>Validate context
alt Authorization: ALLOW
ActionGuardrail-->>Agent: Decision(allow=true, decision_id)
Agent->>Tool: Execute action
Tool-->>Agent: Result
Agent->>OutputGuardrail: Response
OutputGuardrail->>OutputGuardrail: Validate & transform
OutputGuardrail-->>User: Safe response
else Authorization: DENY
ActionGuardrail-->>Agent: Decision(allow=false, reasons)
Agent->>OutputGuardrail: Error response
OutputGuardrail-->>User: Safe decline + rationale
end
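The ordering in the sequence above can be sketched as one function that takes each stage as a callable. All names here (`run_turn`, `sanitize`, `choose_action`, `verify`, `execute`, `validate_output`) are placeholders for illustration and do not correspond to any particular SDK.

```python
from typing import Any, Callable, NamedTuple


class Decision(NamedTuple):
    """Hypothetical decision shape returned by the action guardrail."""
    allow: bool
    decision_id: str = ""
    reasons: tuple = ()


def run_turn(
    prompt: str,
    sanitize: Callable[[str], str],
    choose_action: Callable[[str], dict],
    verify: Callable[[dict], Decision],
    execute: Callable[[dict], Any],
    validate_output: Callable[[Any], Any],
) -> Any:
    """Run one user turn through input, action, and output guardrails."""
    clean = sanitize(prompt)          # input guardrail
    action = choose_action(clean)     # agent decides what to do
    decision = verify(action)         # action guardrail, before any side effect
    if decision.allow:
        response = execute(action)    # the effectful call happens only here
    else:
        response = "Request declined: " + "; ".join(decision.reasons)
    return validate_output(response)  # output guardrail runs on both branches
```

Note that the denial path still passes through the output guardrail, matching the diagram's "Safe decline + rationale" step.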
Three Types of Guardrails Comparison
flowchart TB
subgraph "Input Guardrails"
A1[User Prompt] --> A2[Sanitize & Validate]
A2 --> A3[Agent Receives<br/>Clean Prompt]
style A1 fill:#0277bd,stroke:#01579b,stroke-width:2px,color:#fff
style A2 fill:#0288d1,stroke:#01579b,stroke-width:2px,color:#fff
style A3 fill:#039be5,stroke:#01579b,stroke-width:2px,color:#fff
end
subgraph "Action Guardrails (APort)"
B1[Agent Decides Action] --> B2[APort Verify<br/>Pre-Action Auth]
B2 --> B3{Allow?}
B3 -->|Yes| B4[Execute Tool]
B3 -->|No| B5[Deny + Reasons]
style B1 fill:#e65100,stroke:#bf360c,stroke-width:2px,color:#fff
style B2 fill:#ff6f00,stroke:#bf360c,stroke-width:2px,color:#fff
style B4 fill:#ff8f00,stroke:#bf360c,stroke-width:2px,color:#fff
style B5 fill:#e65100,stroke:#bf360c,stroke-width:2px,color:#fff
end
subgraph "Output Guardrails"
C1[Tool Result] --> C2[Validate & Transform]
C2 --> C3[Safe Response<br/>to User]
style C1 fill:#6a1b9a,stroke:#4a148c,stroke-width:2px,color:#fff
style C2 fill:#7b1fa2,stroke:#4a148c,stroke-width:2px,color:#fff
style C3 fill:#8e24aa,stroke:#4a148c,stroke-width:2px,color:#fff
end
A3 --> B1
B4 --> C1
B5 --> C1
APort Authorization Flow Detail
flowchart LR
subgraph "Agent Runtime"
A[Agent] -->|1. Action Selected| B[Build Context]
B -->|2. Context| C[APort Client]
end
subgraph "APort Service"
C -->|3. Verify Request| D[Policy Evaluator]
D --> E{Check Passport}
E -->|Valid| F{Check Limits}
E -->|Invalid| G[Deny]
F -->|Within Limits| H{Check Policy Rules}
F -->|Exceeds Limits| G
H -->|Pass| I[Allow + Decision ID]
H -->|Fail| G
end
subgraph "Tool Execution"
I -->|4. Decision| J{Authorized?}
G -->|4. Decision| J
J -->|Yes| K[Execute Tool]
J -->|No| L[Return Error]
K --> M[Tool Result]
L --> M
end
style D fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#fff
style I fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style G fill:#c62828,stroke:#b71c1c,stroke-width:2px,color:#fff
style K fill:#1565c0,stroke:#0d47a1,stroke-width:2px,color:#fff
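The evaluation order in the flowchart (passport, then limits, then policy rules) could look roughly like the in-memory sketch below. The `Passport` fields, the `Rule` signature, and the way the decision ID is derived are assumptions made for illustration. In particular, the ID here is just a truncated SHA-256 digest of the request and outcome so that it can be recomputed; that is not a description of how APort or the OAP specification derive decision IDs.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Decision:
    allow: bool
    decision_id: str
    reasons: tuple[str, ...] = ()


@dataclass(frozen=True)
class Passport:
    """Hypothetical agent identity record."""
    agent_id: str
    active: bool
    max_amount: float  # assumed per-action limit


# A rule returns a denial reason, or None when the context passes.
Rule = Callable[[dict], Optional[str]]


def _decision_id(agent_id: str, context: dict, allow: bool) -> str:
    # Illustrative only: a recomputable digest of the request and outcome.
    payload = json.dumps({"agent": agent_id, "context": context, "allow": allow}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def evaluate(passports: dict[str, Passport], rules: list[Rule], agent_id: str, context: dict) -> Decision:
    """Check passport, then limits, then policy rules; the first failure denies."""

    def deny(reason: str) -> Decision:
        return Decision(False, _decision_id(agent_id, context, False), (reason,))

    passport = passports.get(agent_id)
    if passport is None or not passport.active:
        return deny("invalid passport")
    if context.get("amount", 0) > passport.max_amount:
        return deny("amount exceeds limit")
    for rule in rules:
        reason = rule(context)
        if reason is not None:
            return deny(reason)
    return Decision(True, _decision_id(agent_id, context, True))
```

Checking identity before limits and rules keeps the cheapest and most fundamental check first, and every exit path other than the final line is a denial.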
Proposed contribution:
- Add `examples/agent_patterns/guardrails_action.py` showing a small wrapper that verifies via an injected client before executing a tool; denies with reasons when `allow=false`; attaches decision metadata to logs/trace.
- Keep it SDK-agnostic; show a minimal context shape (operation, resource, amount/currency, user/tenant, region).
- The example uses policies based on the Open Agent Passport (OAP) v1.0 specification for standardized policy evaluation.
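As a rough illustration of the "minimal context shape" and the decision metadata attached to logs, the sketch below uses a frozen dataclass and one JSON log line per decision. The field names, the `ActionContext` and `audit_record` names, and the logger name are assumptions for illustration; the actual example may choose different ones.

```python
import json
import logging
from dataclasses import asdict, dataclass
from typing import Optional

logger = logging.getLogger("action_guardrail")


@dataclass(frozen=True)
class ActionContext:
    """Hypothetical minimal context passed to the verifier."""
    operation: str                  # e.g. "refund"
    resource: str                   # e.g. "order/123"
    user_id: str
    tenant_id: str
    region: str
    amount: Optional[int] = None    # minor units (e.g. cents) to avoid float rounding
    currency: Optional[str] = None  # e.g. "USD"


def audit_record(context: ActionContext, allow: bool, decision_id: str, reasons: tuple = ()) -> str:
    """Serialize the decision and its context as one JSON line and log it."""
    record = {
        "decision_id": decision_id,
        "allow": allow,
        "reasons": list(reasons),
        "context": asdict(context),
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

Emitting one structured line per decision keeps denials and approvals searchable by `decision_id` in whatever log or trace backend is in use.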
References:
- Agents Python guardrail examples for analogy: `input_guardrails.py`, `output_guardrails.py`
- Prior art PR: https://github.com/openai/openai-guardrails-python/pull/34
- Microsoft Agent Framework discussion: https://github.com/microsoft/agent-framework/discussions/1701
- Standards reference: Open Agent Passport (OAP) v1.0 Specification (runtime trust specification for AI agent authorization)
- Policy examples: OAP Policy Packs (standardized policy implementations for financial transactions, data access, and more)
If maintainers agree, I'll open a tiny PR with one runnable example + short README.