Deterministic LLM testing in your CI/CD pipeline. This action evaluates recorded LLM outputs against defined contracts and fails PRs when violations are detected.
- Zero network calls - Tests run on recorded fixtures
- Rich reporting - HTML, JUnit, JSON output formats
- PR comments - Automatic violation summaries
- Budget tracking - Cost and latency monitoring
- Flexible checks - JSON schema, regex, numeric bounds, string contains/equals, list/set equality, file diff, custom functions
Regression fail PR · Cost gate PR · Assertion fail PR [Links to live PRs and GIFs to be inserted after publishing] 
Example screenshots of the HTML report generated by this action: [Before and After images not included]

name: PromptProof
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: promptproof.yaml
          baseline-ref: origin/main
          runs: 3
          seed: 1337
          max-run-cost: 2.50
          report-artifact: promptproof-report
          mode: gate

| Input | Description | Default |
|---|---|---|
| config | Path to promptproof.yaml | promptproof.yaml | 
| baseline-ref | Git ref to load baseline snapshot from (e.g., origin/main) | |
| runs | Number of runs for flake control | |
| seed | Seed for flake control determinism | |
| max-run-cost | Maximum total cost for this run (USD) | |
| report-artifact | Name of uploaded report artifact | promptproof-report | 
| mode | gate (fail) or report-only (warn). Defaults to the mode set in the config file. | |
| format | Output format (html, junit, json, or sarif) | |
| regress | Also compare to local baseline | false | 
| node-version | Node.js version | 20 | 
| snapshot-on-success | Create snapshot after successful run | false | 
| snapshot-promote-on-main | Promote snapshot to baseline on main | false | 
| snapshot-tag | Optional snapshot tag | |

| Output | Description |
|---|---|
| violations | Number of violations found | 
| passed | Number of fixtures that passed | 
| failed | Number of fixtures that failed | 
| failed-tests | Alias for failed | 
| total-cost | Total cost (USD) of this evaluation | 
| regressions | New failures vs baseline (when regression comparison is enabled) | 
| report-path | Path to generated report | 
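
The outputs above can be read by later steps once the action step has an `id`. The following is a minimal sketch rather than an official recipe: the step id `promptproof` is a name chosen here for illustration, and it assumes the outputs are still set when the action step fails, which is why the summary step runs with `if: always()`.

```yaml
steps:
  - uses: actions/checkout@v4
  - id: promptproof                 # hypothetical step id, used only to reference outputs
    uses: geminimir/promptproof-action@v0
    with:
      config: promptproof.yaml
  - name: Summarize PromptProof results
    if: always()                    # run even if the PromptProof step failed
    env:
      VIOLATIONS: ${{ steps.promptproof.outputs.violations }}
      FAILED: ${{ steps.promptproof.outputs.failed }}
      TOTAL_COST: ${{ steps.promptproof.outputs.total-cost }}
      REPORT_PATH: ${{ steps.promptproof.outputs.report-path }}
    run: |
      # Append a short summary to the job summary page
      {
        echo "Violations: $VIOLATIONS"
        echo "Failed fixtures: $FAILED"
        echo "Total cost (USD): $TOTAL_COST"
        echo "Report: $REPORT_PATH"
      } >> "$GITHUB_STEP_SUMMARY"
```

Passing the outputs through `env` instead of interpolating them directly into the script keeps the shell code free of expression syntax.
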
Create a promptproof.yaml file in your repository:
schema_version: pp.v1
fixtures: fixtures/outputs.jsonl
checks:
  - id: no_pii
    type: regex_forbidden
    target: text
    patterns:
      - "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}"
budgets:
  cost_usd_per_run_max: 0.50
  latency_ms_p95_max: 2000
mode: fail

When using format: sarif, ensure your workflow grants Code Scanning upload permissions:
permissions:
  contents: read
  security-events: write

- uses: geminimir/promptproof-action@v0
  with:
    config: promptproof.yaml
    baseline-ref: origin/main   # pull last green snapshot from main
    regress: true               # also compare with any local baseline
    runs: 5                     # flake control
    seed: 42                    # fixed seed so repeated runs are reproducible
    max-run-cost: 1.75          # cost gate for the entire suite
    format: junit               # emit JUnit XML for test tab
    mode: gate                  # fail on violations

- uses: geminimir/promptproof-action@v0
  with:
    config: promptproof.yaml
    format: sarif

- uses: geminimir/promptproof-action@v0
  with:
    config: promptproof.yaml
    max-run-cost: 1.00
    mode: report-only           # never fail directly

Then, in Branch protection, require the "PromptProof" check so that the PR is blocked when the budget is exceeded.
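
This README does not say whether report-only mode marks the check as failed when the budget is exceeded. If it does not, one option is to enforce the budget in a follow-up step. The sketch below assumes a step id of `promptproof` (hypothetical) and assumes the `total-cost` output is a plain decimal number such as `0.42`.

```yaml
steps:
  - uses: actions/checkout@v4
  - id: promptproof                 # hypothetical step id
    uses: geminimir/promptproof-action@v0
    with:
      config: promptproof.yaml
      max-run-cost: 1.00
      mode: report-only
  - name: Enforce cost budget
    env:
      TOTAL_COST: ${{ steps.promptproof.outputs.total-cost }}
      MAX_COST: "1.00"              # keep in sync with max-run-cost above
    run: |
      # awk does the decimal comparison; it exits 0 only when the budget is exceeded
      if awk -v cost="$TOTAL_COST" -v max="$MAX_COST" 'BEGIN { exit !(cost > max) }'; then
        echo "::error::PromptProof cost $TOTAL_COST USD exceeds budget of $MAX_COST USD"
        exit 1
      fi
```

With this in place the job fails explicitly, so a required status check blocks the PR regardless of how report-only mode reports its result.
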
- uses: geminimir/promptproof-action@v0
  with:
    config: promptproof.yaml
    format: junit
    report-artifact: promptproof-report
    snapshot-on-success: true
    snapshot-promote-on-main: true
    snapshot-tag: nightly

No API keys required. Use sample fixtures to see a green run:
name: PromptProof
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: example/promptproof.yaml
          format: html
          mode: report-only

This uses recorded fixtures under example/fixtures/ so CI makes no network calls.
- Settings → Branches → Branch protection rules → Add rule
- Branch name pattern = main
- Enable "Require status checks to pass" → select "PromptProof"
- Save
strategy:
  matrix:
    suite: [support, sales, docs]
steps:
  - uses: geminimir/promptproof-action@v0
    with:
      config: promptproof-${{ matrix.suite }}.yaml

The action automatically comments on PRs with:
- Violation summary grouped by check type
- Key metrics (cost, latency, pass/fail counts)
- Expandable details for each violation type
- A permalink to the run artifacts
Reports are uploaded as artifacts and retained for 30 days:
- HTML report for human review
- JSON report for programmatic access
- JUnit XML for test result visualization
- SARIF report for Code Scanning
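
For the SARIF report, the permissions block and the action step can be combined into one workflow, sketched below. Once a `permissions` block is present, GitHub sets every unlisted permission to none, so the sketch also lists `pull-requests: write`. That line is an assumption: it matters only if the action posts its PR comment with the workflow token, which this README does not state.

```yaml
name: PromptProof
on: [pull_request]
permissions:
  contents: read            # needed by actions/checkout
  security-events: write    # needed to upload SARIF to Code Scanning
  pull-requests: write      # assumption: lets the workflow token post the PR comment
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: promptproof.yaml
          format: sarif
```
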
MIT