v1.0.0rc0

Pre-release

github-actions released this 19 Dec 23:48

What's new

Added 🎉

Support for OPT-175B (AI2 only).
New detailed metrics for ranked classification in RankedClassificationMetrics.
New task for perplexity scoring over a set of jsonl files.
New model type "lm:" for general types of tasks handled by decoder-only language models.
New run_lm_eval.py script.
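
For readers unfamiliar with perplexity scoring over jsonl data, the following is a minimal, self-contained sketch of the general idea. It is not this library's implementation or API. The function names, the "text" field name, and the toy uniform scorer are all assumptions made for illustration; a real setup would plug in per-token log-probabilities from a language model.

```python
import json
import math
from typing import Callable, Iterable, List


def read_jsonl_texts(lines: Iterable[str], field: str = "text") -> List[str]:
    """Parse JSON Lines content and return one field per record.

    The field name "text" is an assumption, not taken from the release notes.
    """
    return [json.loads(line)[field] for line in lines if line.strip()]


def perplexity(
    texts: Iterable[str],
    token_logprobs: Callable[[str], List[float]],
) -> float:
    """Return exp of the negative mean natural-log probability per token."""
    total_logprob = 0.0
    total_tokens = 0
    for text in texts:
        logprobs = token_logprobs(text)
        total_logprob += sum(logprobs)
        total_tokens += len(logprobs)
    if total_tokens == 0:
        raise ValueError("no tokens to score")
    return math.exp(-total_logprob / total_tokens)


def uniform_logprobs(text: str, vocab_size: int = 4) -> List[float]:
    """Hypothetical stand-in for a model: whitespace tokens, uniform probability."""
    return [math.log(1.0 / vocab_size)] * len(text.split())


# Two in-memory jsonl records; an open file handle can be passed the same way.
sample_lines = ['{"text": "a b c"}', '{"text": "d e"}']
ppl = perplexity(read_jsonl_texts(sample_lines), uniform_logprobs)
print(ppl)
```

Because the toy scorer assigns every token probability 1/4, the resulting perplexity is approximately 4, which matches the usual reading of perplexity as an effective branching factor. To score several files, their lines can be chained together before being passed to read_jsonl_texts.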

Fixed ✅

Fixed the way we compute SQuAD metrics.
Fixed wikitext on GPT2.
Fixed lambada on GPT2.
Fixed the implementation of MultiRC.

Commits

b9cc7df Merge pull request #160 from allenai/olmo-eval
ea5c47d Merge pull request #128 from allenai/FixMultiRC
bd5ccfa Merge pull request #125 from allenai/OPT175B
753f60a Merge pull request #115 from allenai/LambadaFix
9d02712 Merge pull request #109 from allenai/dependabot/pip/sphinx-6.0.0
e8b671e Merge pull request #120 from allenai/fix-ci
e122c63 Merge pull request #114 from allenai/dependabot/pip/torchmetrics-0.11.1
58d18a5 Merge pull request #110 from allenai/BigMatrix
3d50b9a Merge branch 'main' of https://github.com/allenai/lm-robustness
84cbcbf Simplify requirements
