Explore the new tool released by Microsoft for evaluating LLMs.
Brief description:
It covers a wide range of LLMs and evaluation datasets, spanning diverse tasks, evaluation protocols, adversarial prompt attacks, and prompt engineering techniques. As a holistic library, it also provides several analysis tools for interpreting results. It is designed in a modular fashion, allowing users to build evaluation pipelines for custom projects.
So, I think we should check which techniques it uses to evaluate models, which datasets and tasks it supports, and which analysis tools it offers for interpreting results.
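To make the "modular evaluation pipeline" idea concrete, here is a minimal sketch of how such a pipeline is typically structured: a model callable, a dataset, and a pluggable metric. All names here are illustrative assumptions, not PromptBench's actual API.

```python
# Illustrative sketch of a modular LLM evaluation pipeline.
# These classes/functions are hypothetical, not PromptBench's API.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    prompt: str
    label: str


def exact_match(prediction: str, label: str) -> bool:
    """A simple pluggable metric: case-insensitive exact match."""
    return prediction.strip().lower() == label.strip().lower()


def evaluate(model: Callable[[str], str],
             dataset: List[Example],
             metric: Callable[[str, str], bool] = exact_match) -> float:
    """Run the model over each example and return accuracy under the metric."""
    correct = sum(metric(model(ex.prompt), ex.label) for ex in dataset)
    return correct / len(dataset)


# Toy model and dataset to exercise the pipeline end to end.
toy_model = lambda prompt: "positive" if "great" in prompt else "negative"
data = [Example("This movie is great!", "positive"),
        Example("Terrible plot.", "negative")]
print(evaluate(toy_model, data))
```

Swapping the metric, dataset, or model is just passing a different object, which is the kind of composability the description attributes to the library.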
GitHub link: promptbench