feat(guardrail): add regex#105

Merged: bzp2010 merged 3 commits into main from bzp/feat-regex-guardrail on May 14, 2026

bzp2010 (Collaborator) commented May 14, 2026

Summary by CodeRabbit

  • New Features

    • Added regex-based guardrails for pattern matching on inbound requests and outbound responses, with configurable block reasons and validation of provided patterns.
  • Configuration

    • JSON schema and runtime config support updated to accept and validate regex guardrail entries.
  • Tests

    • Comprehensive tests and fixtures added to cover request/response blocking, pass-through, and invalid-pattern handling.
  • Chores

    • Test utilities and CI updated to support running guardrail integration tests (etcd tooling included).

coderabbitai Bot commented May 14, 2026

Walkthrough

This pull request adds regex-based pattern matching guardrails alongside existing bedrock guardrails. The implementation includes a new regex guardrail config and runtime, JSON schema updates for validation, proxy wiring, and comprehensive integration tests across three API endpoints.

Changes

Regex Guardrails Feature

| Layer | File(s) | Summary |
|---|---|---|
| Dependencies and module exposure | Cargo.toml, crates/aisix-guardrail/Cargo.toml, crates/aisix-guardrail/src/guardrails/mod.rs | Adds regex crate to workspace dependencies, enables usage in both top-level and guardrail crates, and exposes regex guardrail types alongside bedrock in module exports, identifier constants, and config re-exports. |
| Regex guardrail core implementation | crates/aisix-guardrail/src/guardrails/regex.rs | Defines RegexGuardrailConfig with pattern string, optional block reason, and precompiled regex; implements custom deserialization to reject invalid patterns; adds meta and runtime types with async check method that blocks when regex matches text content in message payloads while ignoring non-text parts; includes unit tests for blocking, allowing, and error cases. |
| Configuration schema and entity support | src/config/entities/guardrails-schema.json, src/config/entities/guardrails.rs | Extends JSON schema to accept type: "regex" with conditional config validation via new $defs.regex definition; adds GuardrailConfig::Regex variant and updates guardrail_type() method; refactors validation into a helper; includes schema test cases for valid/invalid regex configs and deserialization tests for error handling and type preservation. |
| Proxy runtime integration | src/proxy/guardrails.rs | Imports RegexGuardrailRuntime and adds a match arm in configured_guardrail_runtime_from_configs to construct a guardrail handle for regex configs; updates test imports and adds a unit test verifying correct regex runtime name and stage support. |
| Test infrastructure and utilities | tests/utils/admin.ts, tests/utils/etcd.ts, tests/proxy/guardrail/shared.ts | Adds optional etcdPrefix parameter to startIsolatedAdminApp; introduces etcdPutJson utility for writing JSON values to etcd via etcdctl; defines RegexGuardrailFixture interface and setupOpenAiRegexGuardrailFixture function that orchestrates isolated admin startup, upstream registration, input/output guardrail creation, guarded model setup, and cleanup. |
| Integration tests across endpoints | tests/proxy/guardrail/chat-completions.test.ts, tests/proxy/guardrail/messages.test.ts, tests/proxy/guardrail/responses.test.ts | Adds test suites for /v1/chat/completions, /v1/messages, and /v1/responses endpoints; each suite verifies three scenarios: input guardrail blocks before upstream is called, safe input passes through with correct content, and output guardrail blocks matched responses while still recording upstream requests. |
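The regex guardrail behavior summarized in the table (reject invalid patterns at configuration time, block when the pattern matches text content, ignore non-text parts) could be sketched roughly as below. This is an illustration in TypeScript only: the actual implementation is the Rust code in crates/aisix-guardrail/src/guardrails/regex.rs, which uses the regex crate, and its syntax differs from JavaScript's RegExp. The names RegexGuardrail, Message, ContentPart, Verdict, and the default block reason are assumptions and do not come from the PR.

```typescript
// Hypothetical shapes for illustration; not the types used in the PR.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; url: string }; // stands in for any non-text part

interface Message {
  role: string;
  content: string | ContentPart[];
}

type Verdict = { blocked: false } | { blocked: true; reason: string };

// Compile once and fail loudly, so a bad pattern is rejected when the
// guardrail is configured rather than on the first request.
function compilePattern(pattern: string): RegExp {
  try {
    return new RegExp(pattern); // no "g" flag, so test() keeps no state
  } catch (err) {
    throw new Error(`invalid regex pattern ${JSON.stringify(pattern)}: ${String(err)}`);
  }
}

class RegexGuardrail {
  private readonly regex: RegExp;
  private readonly blockReason: string;

  constructor(pattern: string, blockReason?: string) {
    this.regex = compilePattern(pattern);
    // Assumed default; the PR only says the block reason is optional.
    this.blockReason = blockReason ?? "content matched regex guardrail";
  }

  // Blocks as soon as any text content matches; non-text parts are skipped.
  check(messages: Message[]): Verdict {
    for (const message of messages) {
      const texts: string[] = [];
      if (typeof message.content === "string") {
        texts.push(message.content);
      } else {
        for (const part of message.content) {
          if (part.type === "text") texts.push(part.text);
        }
      }
      for (const text of texts) {
        if (this.regex.test(text)) {
          return { blocked: true, reason: this.blockReason };
        }
      }
    }
    return { blocked: false };
  }
}
```

The same check would apply to inbound requests and outbound responses; only the stage at which the messages are collected differs.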

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • api7/aisix#103: Both PRs modify the guardrail configuration pipeline—extending guardrail schema/config handling to support additional guardrail types.
  • api7/aisix#104: Both PRs update src/proxy/guardrails.rs to wire configured guardrail runtimes based on GuardrailConfig variants.

Suggested reviewers

  • membphis
  • LiteSun

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Security Check | ❌ Error | Regex guardrail feature has sound internal security. However, .github/workflows/build.yaml etcdctl installation lacks SHA256 checksum verification, creating supply-chain risk. | Add SHA256SUMS verification before installing etcdctl. Download SHA256SUMS and verify tarball checksum before extraction/installation in build.yaml lines 35-42. |
| E2e Test Quality Review | ⚠️ Warning | E2E tests omit boundary cases. etcdPutJson lacks error handling. Inconsistent error validation across test suites. CI security risk: etcdctl unverified download. | Add error handling to etcdPutJson; Expand E2E scenarios with complex regex/boundary cases; Harmonize error assertions across all tests; Verify etcdctl checksum in CI. |
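For the etcdPutJson warning, one possible shape of the error handling is sketched below. The real signature of etcdPutJson in tests/utils/etcd.ts is not shown in this conversation, so the parameters, the CommandResult and CommandRunner types, and the injected runner are all assumptions made to keep the example self-contained. The only external fact relied on is the `etcdctl put <key> <value>` command form.

```typescript
// The subset of a process result this sketch needs. It matches what Node's
// child_process.spawnSync returns when called with { encoding: "utf8" }.
interface CommandResult {
  status: number | null;
  stderr: string;
  error?: Error;
}

// Hypothetical injection point so the helper can be exercised without etcd.
type CommandRunner = (command: string, args: string[]) => CommandResult;

function etcdPutJson(key: string, value: unknown, run: CommandRunner): void {
  // JSON.stringify returns undefined for values such as undefined or functions.
  const payload: string | undefined = JSON.stringify(value);
  if (payload === undefined) {
    throw new Error(`value for ${key} is not JSON-serializable`);
  }

  const result = run("etcdctl", ["put", key, payload]);

  // spawnSync sets `error` when the process could not be started at all.
  if (result.error) {
    throw new Error(`failed to start etcdctl for key ${key}: ${result.error.message}`);
  }
  // A non-zero (or null, i.e. signal-terminated) status means the write failed.
  if (result.status !== 0) {
    throw new Error(
      `etcdctl put ${key} exited with ${result.status}: ${result.stderr.trim()}`,
    );
  }
}
```

In a real helper the runner could be `(cmd, args) => spawnSync(cmd, args, { encoding: "utf8" })` from node:child_process, so a failed write surfaces as a test error instead of a silent fixture gap.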
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat(guardrail): add regex' directly and concisely describes the main change: adding regex functionality to the guardrail system. It clearly summarizes the primary objective. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/proxy/guardrails.rs (1)

650-659: ⚡ Quick win

Use project-standard assertion macros in the new Rust test.

Please switch the new assertions to pretty_assertions::assert_eq! and assert_matches::assert_matches! for consistency with repository test standards.

Suggested fix
     #[test]
     fn configured_guardrail_runtime_from_configs_builds_regex_runtime() {
         let runtime = configured_guardrail_runtime_from_configs(&GuardrailConfig::Regex(
             RegexGuardrailConfig::new("secret", Some("matched blocked content".into())).unwrap(),
         ))
         .unwrap();
 
-        assert_eq!(runtime.name(), "regex");
-        assert!(runtime.supports_stage(GuardrailStage::Output));
+        pretty_assertions::assert_eq!(runtime.name(), "regex");
+        assert_matches::assert_matches!(runtime.supports_stage(GuardrailStage::Output), true);
     }
As per coding guidelines "`{tests,src}/**/*.rs`: Use pretty_assertions::assert_eq and assert_matches::assert_matches for better test output."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/proxy/guardrails.rs` around lines 650 - 659, The test
configured_guardrail_runtime_from_configs_builds_regex_runtime should use the
project-standard test macros: replace assert_eq!(runtime.name(), "regex") with
pretty_assertions::assert_eq!(runtime.name(), "regex") and replace the boolean
assert!(runtime.supports_stage(GuardrailStage::Output)) with
assert_matches::assert_matches!(runtime.supports_stage(GuardrailStage::Output),
true); ensure you add/import pretty_assertions::assert_eq and
assert_matches::assert_matches if not already in scope and keep the rest of the
test (configured_guardrail_runtime_from_configs, GuardrailConfig::Regex,
RegexGuardrailConfig::new) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/proxy/guardrail/shared.ts`:
- Around line 157-165: The cleanup logic for closing resources can leak the
server process if upstream.close() throws; update both the returned close()
implementation and the catch-block cleanup to use try/finally so server.exit()
always runs even when upstream.close() fails. Specifically, in the close: async
() => { ... } and in the catch { ... } replace the sequential awaits with a try
{ await upstream?.close(); } finally { await server?.exit(); } pattern (keeping
the throw in the catch handler) so both upstream.close() and server.exit() are
guaranteed to run; reference the close function and the catch block surrounding
upstream and server variables to locate the changes.
- Around line 23-31: The ensureStatus helper currently includes the full
response body via JSON.stringify(response.data) which may leak sensitive admin
fields; modify ensureStatus to avoid serializing the whole payload — instead log
a minimal/redacted summary (e.g., response.status plus a truncated or redacted
placeholder like "<redacted response>" or the first N chars of a safely
stringified value) and remove JSON.stringify(response.data) from the thrown
Error; update the error message construction in ensureStatus so it only includes
non-sensitive identifiers and the redacted/minimal payload indicator.
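The two inline comments above could be addressed along the following lines. This is a sketch under assumptions: the Closable, Exitable, and AdminResponse interfaces, the closeFixture name, and the ensureStatus parameter list are hypothetical stand-ins, since the actual code in tests/proxy/guardrail/shared.ts is not reproduced in this conversation.

```typescript
// Hypothetical minimal shapes; the real fixture types live in shared.ts.
interface Closable {
  close(): Promise<void>;
}
interface Exitable {
  exit(): Promise<void>;
}

// try/finally guarantees server.exit() runs even when upstream.close()
// rejects. The rejection from close() still reaches the caller afterwards.
async function closeFixture(upstream?: Closable, server?: Exitable): Promise<void> {
  try {
    await upstream?.close();
  } finally {
    await server?.exit();
  }
}

interface AdminResponse {
  status: number;
  data: unknown;
}

// Reports only the label and status codes. The response body is never
// serialized into the error, so sensitive admin fields cannot leak into logs.
function ensureStatus(response: AdminResponse, expected: number, label: string): void {
  if (response.status === expected) {
    return;
  }
  throw new Error(
    `${label}: expected status ${expected}, got ${response.status} (response body redacted)`,
  );
}
```

One caveat of the try/finally form: if server.exit() also rejects, its error replaces the one from upstream.close(). That trade-off seems acceptable for test cleanup, where the priority is not leaking the server process.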

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bb01b40c-6489-41b4-9e03-c3d909b50e36

📥 Commits

Reviewing files that changed from the base of the PR and between 1ad76e4 and e2931fd.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • Cargo.toml
  • crates/aisix-guardrail/Cargo.toml
  • crates/aisix-guardrail/src/guardrails/mod.rs
  • crates/aisix-guardrail/src/guardrails/regex.rs
  • src/config/entities/guardrails-schema.json
  • src/config/entities/guardrails.rs
  • src/proxy/guardrails.rs
  • tests/package.json
  • tests/proxy/guardrail/chat-completions.test.ts
  • tests/proxy/guardrail/messages.test.ts
  • tests/proxy/guardrail/responses.test.ts
  • tests/proxy/guardrail/shared.ts
  • tests/utils/admin.ts
  • tests/utils/etcd.ts

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/build.yaml:
- Around line 35-43: The workflow step that downloads and installs etcdctl (env
ETCD_VER, the curl/tar/sudo install sequence and final etcdctl version check)
lacks checksum verification; update the step to also download the release SHA256
sums (e.g. the repository's SHA256SUMS or per-asset .sha256), verify the
downloaded tarball with sha256sum (or sha256sum -c) before running sudo install,
and fail the job when the checksum does not match so the invalid artifact is
never installed.
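In the workflow itself this check would most likely be a `sha256sum -c` call in the shell step. The logic it performs is sketched below in TypeScript, the language of the test utilities, purely to make the verification explicit. The function names verifySha256 and expectedDigest are hypothetical, and the sketch assumes the common SHA256SUMS layout of one `<hex digest>  <file name>` pair per line.

```typescript
import { createHash } from "node:crypto";

// Finds the digest recorded for fileName in SHA256SUMS-style text.
// An optional "*" before the name (binary-mode marker) is tolerated.
function expectedDigest(sumsText: string, fileName: string): string {
  for (const line of sumsText.split("\n")) {
    const match = /^([0-9a-fA-F]{64})\s+\*?(.+)$/.exec(line.trim());
    if (match && match[2] === fileName) {
      return match[1].toLowerCase();
    }
  }
  throw new Error(`no SHA-256 entry for ${fileName}`);
}

// Throws unless the SHA-256 of `data` equals the recorded digest, so a
// caller cannot proceed to extraction or installation on a mismatch.
function verifySha256(data: Uint8Array, sumsText: string, fileName: string): void {
  const actual = createHash("sha256").update(data).digest("hex");
  const expected = expectedDigest(sumsText, fileName);
  if (actual !== expected) {
    throw new Error(
      `checksum mismatch for ${fileName}: expected ${expected}, got ${actual}`,
    );
  }
}
```

The important property is ordering: verification happens on the downloaded bytes before anything is unpacked or installed, and a missing entry is treated as a failure rather than skipped.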
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9a0cfe0-f4ab-4ab4-9fc5-c6533f7429e5

📥 Commits

Reviewing files that changed from the base of the PR and between e2931fd and 14ba36a.

📒 Files selected for processing (1)
  • .github/workflows/build.yaml

@bzp2010 bzp2010 merged commit b6e0b15 into main May 14, 2026
3 checks passed
@bzp2010 bzp2010 deleted the bzp/feat-regex-guardrail branch May 14, 2026 16:46