
Feat: Add support for local Hugging Face text classifiers #1612

Open
RobGeada wants to merge 3 commits into NVIDIA-NeMo:develop from RobGeada:HuggingfaceBuiltIn

Conversation

@RobGeada (Contributor) commented Feb 2, 2026

Description

Adds support for running local Hugging Face text classifiers as rails. The models run inside the NeMo Guardrails process, so this feature is best suited for smaller predictive text models in the sub-100M parameter range.

Features:

  1. Models are loaded from the local HF cache or downloaded when first called
  2. Configuration of which predicted classes constitute a guardrail violation
  3. Access to the wide range of open-access guardrail models on the Hugging Face Hub

Example config:

rails:
  config:
    huggingface_detector:
      models:
        # Example 1: Harmful content detection on GPU
        - model_repo: "ibm-granite/granite-guardian-hap-38m"
          descriptor: "Harmful and abusive language detector"
          blocked_classes: [0]
          device: "cuda"  # Load on GPU for faster inference

        # Example 2: Prompt injection detection on CPU
        - model_repo: "protectai/deberta-v3-base-prompt-injection-v2"
          descriptor: "Prompt injection detector"
          blocked_classes: ["INJECTION"]
          device: "cpu"  # Load on CPU
  input:
    flows:
      # Check user input for prompt injection attempts
      - huggingface detector check input $hf_model="protectai/deberta-v3-base-prompt-injection-v2"
  output:
    flows:
      # Check bot output for harmful content before sending to user
      - huggingface detector check output $hf_model="ibm-granite/granite-guardian-hap-38m"
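
To make the blocked_classes semantics concrete, here is a minimal sketch of the classification check using plain transformers calls. It is illustrative only, under the assumption that blocked classes may be given by index or by label as in the config above; the function and variable names are not the PR's actual actions.py code.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def classify_text(text, model_repo, blocked_classes, device="cpu"):
    # Load the classifier and tokenizer from the local HF cache
    # (downloaded from the Hub on first use).
    tokenizer = AutoTokenizer.from_pretrained(model_repo)
    model = AutoModelForSequenceClassification.from_pretrained(model_repo).to(device)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]

    predicted_idx = int(torch.argmax(probs))
    predicted_label = model.config.id2label[predicted_idx]

    # blocked_classes may mix integer indices (e.g. [0]) and string labels
    # (e.g. ["INJECTION"]), matching the example config above.
    blocked = predicted_idx in blocked_classes or predicted_label in blocked_classes
    return {
        "allowed": not blocked,
        "detected_class": predicted_label,
        "score": float(probs[predicted_idx]),
    }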

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

greptile-apps bot commented Feb 2, 2026

Greptile Overview

Greptile Summary

Adds integration for local HuggingFace text classification models as guardrails. The implementation allows users to run any HuggingFace text classifier locally to detect harmful content, prompt injections, or other policy violations in user inputs, bot outputs, and tool messages.

Key Features

  • Local model execution: Models run entirely locally, no external API calls required
  • Flexible configuration: Support for multiple models with different purposes, configurable blocked classes (by label or index), and device placement (CPU/GPU)
  • Model caching: Models are cached after first load to optimize performance (a minimal caching sketch follows this list)
  • Comprehensive testing: 867 lines of unit tests covering configuration, classification logic, device management, and error handling
  • Documentation: Detailed READMEs with examples, troubleshooting, and configuration guides
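
The caching behavior described above can be sketched as a module-level cache keyed by repo and device, so each classifier is only materialized once per process. This is a hedged illustration, not the PR's _load_model_and_tokenizer implementation; _MODEL_CACHE and load_cached_model are hypothetical names.

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical module-level cache: one (model, tokenizer) pair per repo/device.
_MODEL_CACHE = {}

def load_cached_model(model_repo, device="cpu"):
    key = (model_repo, device)
    if key not in _MODEL_CACHE:
        tokenizer = AutoTokenizer.from_pretrained(model_repo)
        model = AutoModelForSequenceClassification.from_pretrained(model_repo)
        model.to(device)
        model.eval()
        _MODEL_CACHE[key] = (model, tokenizer)
    return _MODEL_CACHE[key]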

Critical Issues

  • Wrong dependency in pyproject.toml: The code imports transformers, but the dependency file specifies sentence-transformers. The feature will still work when sentence-transformers is installed (it pulls in transformers transitively), but this creates version-mismatch risks and adds unnecessary dependencies. The extras name should also be huggingface rather than sentence-transformers for clarity.

Minor Issues

Architecture

The implementation follows NeMo Guardrails patterns with configuration schemas in config.py, action handlers in actions.py, and both Colang v1 and v2 flow definitions. The detector integrates into the rails pipeline and can block content by raising exceptions or refusing to respond based on configuration.
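
As an illustration of that pattern, a custom action registered through NeMo Guardrails' @action decorator could look roughly like the sketch below. The decorator and the $user_message context variable are the standard custom-action mechanism, but the function body, the _PIPELINES cache, and the blocked_labels default are assumptions for illustration, not the PR's actual actions.py.

from transformers import pipeline
from nemoguardrails.actions import action

# Hypothetical module-level cache: one text-classification pipeline per repo.
_PIPELINES = {}

@action()
async def huggingface_detector_check(context: dict = None, model_repo: str = "", blocked_labels=("INJECTION",)):
    # Illustrative only: read the message from the flow context, classify it,
    # and report whether the top predicted label is one of the blocked classes.
    text = (context or {}).get("user_message", "")
    if model_repo not in _PIPELINES:
        _PIPELINES[model_repo] = pipeline("text-classification", model=model_repo)
    prediction = _PIPELINES[model_repo](text)[0]  # e.g. {"label": "INJECTION", "score": 0.98}
    return {
        "allowed": prediction["label"] not in blocked_labels,
        "detected_class": prediction["label"],
        "score": prediction["score"],
    }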

Confidence Score: 3/5

  • Safe to merge after fixing the dependency issue in pyproject.toml
  • The implementation is well-designed with comprehensive testing and documentation. However, there's a critical dependency mismatch in pyproject.toml that specifies sentence-transformers instead of transformers, which could cause installation issues and version conflicts. Once this is corrected, the PR adds valuable functionality with proper error handling and device management.
  • pyproject.toml must be corrected before merge - wrong package specified (sentence-transformers vs transformers)

Important Files Changed

Filename | Overview
pyproject.toml | Added dependency for HuggingFace detector, but wrong package specified (sentence-transformers instead of transformers)
nemoguardrails/library/huggingface_detector/actions.py | Core implementation for HuggingFace text classifier integration with proper error handling, caching, and device management
examples/configs/huggingface_detector/README.md | Usage documentation with examples and configuration guide (minor typo on line 62)

Sequence Diagram

sequenceDiagram
    participant User
    participant NeMoGuardrails
    participant HFDetectorFlow as HuggingFace Detector Flow
    participant HFAction as huggingface_detector_check
    participant ModelLoader as _load_model_and_tokenizer
    participant HFHub as HuggingFace Hub
    participant Model as Classification Model
    
    User->>NeMoGuardrails: Send message
    NeMoGuardrails->>HFDetectorFlow: Execute input flow with $hf_model param
    HFDetectorFlow->>HFAction: Call HuggingfaceDetectorCheckAction(context_key, model_repo)
    
    HFAction->>HFAction: Extract text from context
    HFAction->>HFAction: Find model config by model_repo
    HFAction->>ModelLoader: Load model and tokenizer
    
    alt Model not in cache
        ModelLoader->>HFHub: Download model and tokenizer
        HFHub-->>ModelLoader: Return model files
        ModelLoader->>ModelLoader: Load AutoModelForSequenceClassification
        ModelLoader->>ModelLoader: Move to device (if specified)
        ModelLoader->>ModelLoader: Cache model
    else Model in cache
        ModelLoader->>ModelLoader: Return cached model
    end
    
    ModelLoader-->>HFAction: Return (model, tokenizer)
    
    HFAction->>HFAction: Convert blocked_classes to indices
    HFAction->>Model: Tokenize and classify text
    Model-->>HFAction: Return logits and predictions
    HFAction->>HFAction: Calculate probabilities (softmax)
    HFAction->>HFAction: Check if predicted class in blocked_classes
    
    HFAction-->>HFDetectorFlow: Return {allowed, detected_class, score, all_scores}
    
    alt Content is blocked (not allowed)
        HFDetectorFlow->>NeMoGuardrails: Send exception or refuse to respond
        NeMoGuardrails->>User: Block message
    else Content is allowed
        HFDetectorFlow-->>NeMoGuardrails: Continue processing
        NeMoGuardrails->>User: Process message normally
    end

@greptile-apps bot left a comment

3 files reviewed, 4 comments

github-actions bot commented Feb 2, 2026

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1612
