
Extracting prompts based on Risk Category #4

Open
reinbugnot opened this issue Oct 29, 2024 · 1 comment
@reinbugnot
Hi, I'm from NUS-NCS Cybersecurity Laboratory in SG.

I am interested in using the S-Eval dataset in our LLM risk evaluations. From the README.md file, there's a breakdown of how many prompts are available per Risk Category (i.e. Access Control, Hacker Attack, Malicious Code, etc. under Cybersecurity).

But the risk category information is currently not included in the .jsonl files inside s_eval/.

Can I ask if there's a way to group the prompts according to their risk categories? This will greatly help us in our use case. Thanks!
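Once per-prompt risk labels are available in the `.jsonl` files, grouping is straightforward. Below is a minimal sketch; the field names `"prompt"` and `"risk_type"` (and the sample records) are assumptions for illustration, not the actual S-Eval schema:

```python
import json
from collections import defaultdict

# Hypothetical records in JSONL form; the real S-Eval field names may differ.
jsonl_lines = [
    '{"prompt": "Example prompt A", "risk_type": "Cybersecurity"}',
    '{"prompt": "Example prompt B", "risk_type": "Cybersecurity"}',
    '{"prompt": "Example prompt C", "risk_type": "Data Privacy"}',
]

def group_by_risk(lines, key="risk_type"):
    """Group parsed JSONL records by a risk-label field."""
    groups = defaultdict(list)
    for line in lines:
        record = json.loads(line)
        # Fall back to "unknown" for records missing the label field.
        groups[record.get(key, "unknown")].append(record["prompt"])
    return dict(groups)

groups = group_by_risk(jsonl_lines)
print(sorted(groups))  # risk categories present in the data
```

Reading an actual file would just replace `jsonl_lines` with the lines of `s_eval/<file>.jsonl`.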

@zggg1p
Collaborator

zggg1p commented Oct 29, 2024

Thanks for your support of our work. S-Eval keeps a detailed record of the risk types each prompt belongs to (102 fine-grained risk subcategories organized in four levels). Currently, we have only released the labels for the first-level risk dimensions. We will release more fine-grained risk labels in the near future. Please stay tuned; we will notify you as soon as they are available.

If our work is useful for your research, please star ⭐ our project.
