feature: lightweight probe defaults #1116

Open
wants to merge 30 commits into
base: main
from


leondz
Collaborator

@leondz leondz commented Feb 26, 2025

Resolves #1032

  • Cap probes at a certain number of requests for standard version, full version can be present but inactive
  • Make lightweight probes the default, moving larger probes out to -Full versions
  • Add a config value to set a suggested cap on number of prompts per default probe
  • Migrate many probes to use random shuffling + this cap when reducing probe count (shuffling preferred because a. we're not a benchmark, we're about discovery; b. getting variance is good, we don't want to overfit to subsets of the test cases)
  • Add fixers for renames
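
The shuffle-and-cap idea in the list above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `cap_prompts` and its `seed` parameter are hypothetical names, and the default of 256 only mirrors the `config.run.soft_probe_prompt_cap` value quoted under Verification.

```python
import random
from typing import List, Optional


def cap_prompts(prompts: List[str], cap: int = 256, seed: Optional[int] = None) -> List[str]:
    """Return at most `cap` prompts, picked at random instead of truncating.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if cap < 0:
        raise ValueError("cap must be non-negative")
    if len(prompts) <= cap:
        return list(prompts)  # already within the cap, keep everything
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    return rng.sample(prompts, cap)  # random subset without replacement


all_prompts = [f"prompt {i}" for i in range(1000)]
subset = cap_prompts(all_prompts, cap=256, seed=0)
```

Leaving `seed` unset gives a different subset on each run, which matches the stated preference for variance over a fixed benchmark subset; passing a seed is only useful when a run needs to be reproduced.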

(NB Getting some black churn)

Verification

  • garak --list_probes reveals no -Mini or -80 (etc) probe names
  • garak --list_probes shows all -Full probes are inactive
  • garak -m test -g 1 shows no probe having over 256 prompts (config.run.soft_probe_prompt_cap) - calculations for 256 available on request
  • altering config.run.soft_probe_prompt_cap changes #prompts for affected probes

@leondz leondz added the probes Content & activity of LLM probes label Feb 26, 2025
@leondz leondz added this to the 25.02 Efficiency milestone Feb 26, 2025
@leondz leondz changed the title lightweight probe defaults feature: lightweight probe defaults Feb 26, 2025
@leondz leondz marked this pull request as draft February 26, 2025 21:59
Collaborator

@erickgalinkin erickgalinkin left a comment

Largely looks good. I think this is good to merge but would like to resolve the discussion @jmartin-tech brought up and my minor quibble with the DanInTheWildMini docstring.

@@ -7,7 +7,7 @@ run:
   generations: 5

 plugins:
-  probe_spec: continuation,dan,encoding.InjectBase64,encoding.InjectHex,goodside,av_spam_scanning,leakreplay,lmrc,malwaregen.SubFunctions,malwaregen.TopLevel,packagehallucination,realtoxicityprompts.RTPIdentity_Attack,realtoxicityprompts.RTPProfanity,realtoxicityprompts.RTPSexually_Explicit,realtoxicityprompts.RTPThreat,snowball,xss
+  probe_spec: ansiescape.AnsiRaw,continuation,dan,encoding.InjectBase64,encoding.InjectHex,goodside,av_spam_scanning,leakreplay,lmrc,malwaregen.SubFunctions,malwaregen.TopLevel,packagehallucination,realtoxicityprompts.RTPIdentity_Attack,realtoxicityprompts.RTPProfanity,realtoxicityprompts.RTPSexually_Explicit,realtoxicityprompts.RTPThreat,snowball,xss
Collaborator

Do we still want av_spam_scanning in the default fast config? It's largely useless for model-only evaluation.

Collaborator Author

Good question. garak's mostly been used for scanning models so far, but this isn't an intended constraint on use. I appreciate the opportunity to have a "we should be scanning systems" talk whenever this module comes up. On the other hand, it takes time to have that talk.

Collaborator

@jmartin-tech jmartin-tech left a comment

This PR assumes config_root is always the global module instance _config. An enhancement to enable _config.run items to be distributed in a consistent way for implementers of Configurable is planned and will be needed for this PR.

jmartin-tech and others added 5 commits March 3, 2025 08:46
Signed-off-by: Jeffrey Martin <[email protected]>
* plugins define what run and system params to support
* parallel requests is configurable for generators instantiated by plugins

Signed-off-by: Jeffrey Martin <[email protected]>
Collaborator Author

@leondz leondz left a comment

lgtm

@leondz leondz marked this pull request as ready for review March 4, 2025 10:14
Collaborator

@jmartin-tech jmartin-tech left a comment

One minor improvement might be to consolidate _prune_data(), as the code in that method appears to be duplicated in a few classes.
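
One way to picture the consolidation is to move the pruning logic into a single shared place that each probe class inherits. The sketch below is only an assumption about the shape of such a refactor: `PromptCapMixin`, `LargeProbe`, and `SmallProbe` are hypothetical, and the `_prune_data(cap)` signature is invented for illustration, not taken from the PR.

```python
import random
from typing import List


class PromptCapMixin:
    """Hypothetical mixin: one shared copy of the pruning logic."""

    prompts: List[str]

    def _prune_data(self, cap: int) -> None:
        """Cut self.prompts down to at most `cap` randomly sampled entries."""
        if len(self.prompts) > cap:
            self.prompts = random.sample(self.prompts, cap)


class LargeProbe(PromptCapMixin):
    def __init__(self, cap: int = 256) -> None:
        self.prompts = [f"large-{i}" for i in range(1000)]
        self._prune_data(cap)  # 1000 prompts reduced to `cap`


class SmallProbe(PromptCapMixin):
    def __init__(self, cap: int = 256) -> None:
        self.prompts = [f"small-{i}" for i in range(10)]
        self._prune_data(cap)  # no-op: already under the cap
```

With the method defined once, a single test of `_prune_data` covers every class that uses it.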

@jmartin-tech
Collaborator

A separate note, this PR suggests that the idea of tracking renames with aliases in the plugin may be something we could remove for now. Hence I don't think we need to add them here.

Collaborator

@erickgalinkin erickgalinkin left a comment

LGTM. Only thought: Do we want to add a warning that these probes have been renamed? If we detect one of the old names in the config (lots of folks like DanInTheWildMini in particular), should we inform them that we've done the rename? I suppose this is something that can be captured in the release notes, but I'm just trying to minimize support requests.
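
For discussion, a rename warning of the kind suggested here could look something like the sketch below. Everything in it is hypothetical: `RENAMED_PROBES` and `find_renamed` are made-up names, and the single mapping entry is an assumed example, not the actual list of renames in this PR.

```python
import logging
from typing import Dict, List

# Illustrative mapping only: the target name is an assumption, and the real
# old -> new pairs would have to come from the PR's fixers.
RENAMED_PROBES: Dict[str, str] = {
    "dan.DanInTheWildMini": "dan.DanInTheWild",
}


def find_renamed(probe_spec: str, renames: Dict[str, str] = RENAMED_PROBES) -> List[str]:
    """Log and return one warning message per old probe name found in the spec."""
    messages = []
    for name in probe_spec.split(","):
        name = name.strip()
        if name in renames:
            msg = f"probe '{name}' has been renamed to '{renames[name]}'"
            logging.warning(msg)
            messages.append(msg)
    return messages


warnings_found = find_renamed("continuation,dan.DanInTheWildMini,xss")
```

The mapping table is the part that would need maintaining for as long as the warning is kept.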

@leondz
Collaborator Author

leondz commented Mar 7, 2025

> Do we want to add a warning that these probes have been renamed?

Kinda. We don't want to say nothing! I'm leaning towards only putting this in the release notes, seeing as we're 0.x. The other question, if we do more than that, is how long we keep the message around, which is more maintenance.

* defensive coding for _plugin rename util
* use `raw` string for regex
* add tests for minimal and regex paths

Signed-off-by: Jeffrey Martin <[email protected]>
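
As a rough illustration of why a raw string matters for a regex-based rename, the sketch below rewrites one entry in a comma-separated probe spec. It is not the utility from this commit: `rename_in_spec` and the `example.*` probe names are hypothetical.

```python
import re


def rename_in_spec(probe_spec: str, old: str, new: str) -> str:
    """Replace `old` with `new` only where it is a whole comma-separated entry.

    Hypothetical helper for illustration; entries are assumed to contain no spaces.
    """
    # Raw strings pass the pattern to the regex engine exactly as written.
    # The lookarounds require a comma or the string boundary on each side,
    # so "example.Probe" does not match inside "example.ProbeMini".
    pattern = r"(?<![^,])" + re.escape(old) + r"(?![^,])"
    # A function replacement avoids any escape processing of `new`.
    return re.sub(pattern, lambda _match: new, probe_spec)


spec = "example.ProbeMini,example.Probe,other.Thing"
renamed = rename_in_spec(spec, "example.Probe", "example.ProbeFull")
```

`re.escape` keeps the dot in the probe name from matching arbitrary characters, which is the kind of edge case the added tests for the regex path would want to cover.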
@jmartin-tech
Collaborator

A note from testing, likely worth a new issue: the fixer migrations output the YAML with sorted keys, making comparison less than ideal.

@leondz
Collaborator Author

leondz commented Mar 10, 2025

> A note from testing, likely worth a new issue: the fixer migrations output the YAML with sorted keys, making comparison less than ideal.

Oh, that's.. cute. Yeah, I can see the advantage to preserving input YAML order on some levels. On the other hand, if we're dealing with items under plugins, run, system etc - maybe this is a fine convention to adopt?

@leondz
Collaborator Author

leondz commented Mar 11, 2025

prune refactor sent, + test added. I think that may be three LGTMs..

Labels
probes Content & activity of LLM probes
Development

Successfully merging this pull request may close these issues.

align prompt count per probe
3 participants