Releases: allenai/catwalk

v1.0.1rc0

19 Apr 19:28
c3eb82e
What's Changed since v1.0.0rc0

Fixed

  • Added torch.no_grad() around model calls in language_model.py
  • Prevented crashes by making the stop-token handling for greedy_until in language_model.py more robust
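
As a rough illustration of the kind of stop handling the second fix concerns, the sketch below cuts generated text at the earliest stop sequence. It is not the code in language_model.py: the function name truncate_at_stop and its parameters are hypothetical, and it assumes stop sequences are plain strings matched against already-decoded text.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Return `text` cut just before the earliest stop sequence, if any.

    Hypothetical helper for illustration only; not part of catwalk.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            # An empty stop string would match at index 0 and erase
            # the whole output, so it is ignored here.
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Example: stop at a newline or at the start of the next question.
answer = truncate_at_stop("Paris.\nQ: What is next?", ["\n", "Q:"])
```

Ignoring empty stop strings and falling back to the full text when no stop sequence occurs are the two edge cases this sketch guards against; whether these match the crashes fixed in this release is an assumption.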

v1.0.0rc0

19 Dec 23:48
Pre-release

What's new

Added 🎉

  • Support for OPT-175B (AI2 only)
  • New detailed metrics for ranked classification in RankedClassificationMetrics.
  • New task for perplexity scoring over a set of jsonl files.
  • New model type "lm:" for general types of tasks handled by decoder-only language models.
  • run_lm_eval.py script.
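
For readers unfamiliar with perplexity scoring over jsonl files, here is a minimal sketch of the general idea. It is not catwalk's implementation. The function corpus_perplexity, the "text" field name, and the token_logprobs callback are all assumptions made for illustration, and the sketch normalizes per token, whereas a real task might normalize per word or per byte.

```python
import json
import math
from typing import Callable, Iterable, List


def corpus_perplexity(
    jsonl_lines: Iterable[str],
    token_logprobs: Callable[[str], List[float]],
    text_field: str = "text",  # assumed field name, not taken from catwalk
) -> float:
    """Token-level perplexity over JSON Lines records.

    `token_logprobs` is a hypothetical callback that returns the natural-log
    probability of each token in a document, standing in for a real model.
    """
    total_logprob = 0.0
    total_tokens = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        logprobs = token_logprobs(record[text_field])
        total_logprob += sum(logprobs)
        total_tokens += len(logprobs)
    if total_tokens == 0:
        raise ValueError("no tokens to score")
    # Perplexity is exp of the average negative log-likelihood per token.
    return math.exp(-total_logprob / total_tokens)


def uniform_logprobs(text: str) -> List[float]:
    """Toy stand-in model: every whitespace token has probability 1/4."""
    return [math.log(0.25)] * len(text.split())


lines = ['{"text": "a b c"}', "", '{"text": "d e"}']
ppl = corpus_perplexity(lines, uniform_logprobs)
```

With the toy uniform model, every token has probability 1/4, so the perplexity comes out at 4 up to floating-point rounding. In practice the callback would wrap a language model's scoring call.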

Fixed ✅

  • Fixed the way we compute SQuAD metrics.
  • Fixed wikitext on GPT2.
  • Fixed lambada on GPT2.
  • Fixed the implementation of MultiRC.

Commits

b9cc7df Merge pull request #160 from allenai/olmo-eval
ea5c47d Merge pull request #128 from allenai/FixMultiRC
bd5ccfa Merge pull request #125 from allenai/OPT175B
753f60a Merge pull request #115 from allenai/LambadaFix
9d02712 Merge pull request #109 from allenai/dependabot/pip/sphinx-6.0.0
e8b671e Merge pull request #120 from allenai/fix-ci
e122c63 Merge pull request #114 from allenai/dependabot/pip/torchmetrics-0.11.1
58d18a5 Merge pull request #110 from allenai/BigMatrix
3d50b9a Merge branch 'main' of https://github.com/allenai/lm-robustness
84cbcbf Simplify requirements

v0.2.2

27 Jan 22:08
What's new

Changed ⚠️

  • Changed the package name to ai2-catwalk to avoid a name conflict on PyPI.

Commits

8b95a39 Bump version number
e6fa0a5 Changelog
fbffe17 This has to be called ai2-catwalk to avoid a name conflict on Pypi.

v0.1.0

10 Jun 20:18
Pre-release

This is the first release of Catwalk.