openai/preparedness

Preparedness Evals

This repository contains the code for multiple Preparedness evals that use nanoeval and alcatraz.

System requirements

  1. Python 3.11 (3.12 is untested; 3.13 will break chz)
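
As a minimal sketch, the version requirement above can be checked before installing anything. The helper name check_python_version and its labels are hypothetical and not part of this repository; it only restates the requirement listed above, and it assumes the interpreter is available as python3 on PATH.

```shell
# Hypothetical helper (not part of the repository): classify a "major.minor"
# Python version against the requirement stated above.
check_python_version() {
    case "$1" in
        3.11) echo "supported" ;;
        3.12) echo "untested" ;;
        3.13) echo "unsupported" ;;   # stated to break chz
        *)    echo "unknown" ;;       # not covered by the stated requirement
    esac
}

# Ask the interpreter on PATH for its major.minor version; fall back to
# "none" if python3 is not installed.
current="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || echo none)"
echo "Python $current: $(check_python_version "$current")"
```

Taking the version as an argument keeps the classification separate from interpreter discovery, so the same function could be reused against a different binary such as python3.11.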

Install prerequisites

for proj in nanoeval alcatraz nanoeval_alcatraz; do
    pip install -e project/"$proj"
done

Evals

  • PaperBench
  • SWELancer (Forthcoming)
  • MLE-bench (Forthcoming)

About

Releases from OpenAI Preparedness
