0.2.0: Faster collection, MARL compatibility and RLHF prototype

@vmoens vmoens released this 05 Oct 16:45

TorchRL 0.2.0

This release provides many new features and bug fixes.

TorchRL now publishes Apple Silicon compatible wheels.
We drop support for Python 3.7 and add support for Python 3.11.

New and updated algorithms

Most algorithms have been cleaned up and designed to reach (at least) SOTA results.


Compatibility with MARL settings has been drastically improved, and the library now ships with a good number of MARL examples.

A prototype RLHF training script is also provided (#1597).

A whole new category of offline RL algorithms has been integrated: Decision Transformers.

New features

One of the major new features of the library is the introduction of the terminated / truncated / done distinction, at no extra cost, throughout the library. All third-party and primary environments are now compatible with it, as are the losses and data collection primitives (collectors, etc.). This feature is also compatible with complex data structures, such as those found in MARL training pipelines.
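To illustrate why the distinction matters, here is a minimal, library-independent sketch in plain Python. It assumes the usual convention that `done` is true when an episode ends for either reason, and that only `terminated` should stop value bootstrapping. The helper names `td_targets` and `done_flags` are hypothetical and are not part of TorchRL.

```python
from typing import List


def td_targets(
    rewards: List[float],
    next_values: List[float],
    terminated: List[bool],
    gamma: float = 0.99,
) -> List[float]:
    """One-step TD targets that bootstrap unless the episode truly terminated."""
    targets = []
    for reward, next_value, term in zip(rewards, next_values, terminated):
        # A truncated step (e.g. a time limit) still has a meaningful next
        # state, so only `terminated` removes the bootstrap term.
        bootstrap = 0.0 if term else gamma * next_value
        targets.append(reward + bootstrap)
    return targets


def done_flags(terminated: List[bool], truncated: List[bool]) -> List[bool]:
    """`done` marks the end of a trajectory, whatever the reason."""
    return [term or trunc for term, trunc in zip(terminated, truncated)]
```

In this sketch a truncated step keeps its bootstrapped value while still being flagged as `done`, so a collector can reset the environment without biasing the value estimate.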

All losses are now compatible with tensordict-free inputs, for a more generic deployment.

New transforms

Atari games can now benefit from an EndOfLifeTransform, which allows the end-of-life signal to be used as a done state in the loss (#1605).
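The underlying idea can be sketched without the library: a life is lost whenever the emulator's lives counter decreases between two consecutive steps. The function below is a hypothetical illustration of that idea in plain Python, not the transform's actual implementation, and it assumes the lives counter has already been read at every step.

```python
from typing import List


def end_of_life_flags(lives: List[int]) -> List[bool]:
    """Flag each step where the lives counter dropped since the previous step."""
    if not lives:
        return []
    # The first step has no predecessor, so it can never be an end-of-life.
    flags = [False]
    for previous, current in zip(lives, lives[1:]):
        flags.append(current < previous)
    return flags
```

A loss could then treat a step as terminal when either the environment's own done signal or this end-of-life flag is set, while data collection keeps running until the game itself ends.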

We provide a KL transform to add a KL factor to the reward in RLHF settings.
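As a rough illustration, a common RLHF formulation subtracts a scaled per-token log-probability ratio between the trained policy and a frozen reference policy from the reward. The sketch below assumes that formulation. The function name, the `coef` parameter and its default are hypothetical, and the exact convention used by the library's transform is not stated in these notes.

```python
from typing import List


def kl_penalized_rewards(
    rewards: List[float],
    logp_policy: List[float],
    logp_ref: List[float],
    coef: float = 0.1,
) -> List[float]:
    """Subtract a KL-style penalty, coef * (log pi - log pi_ref), from each reward."""
    return [
        # The penalty grows as the policy assigns more probability than the
        # reference model to the sampled token.
        reward - coef * (lp - lr)
        for reward, lp, lr in zip(rewards, logp_policy, logp_ref)
    ]
```

The penalty discourages the fine-tuned policy from drifting far from the reference model while it optimizes the learned reward.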

Action masking is now possible through the ActionMask transform (#1421).
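Conceptually, masking an action means giving it zero probability before sampling. A minimal plain-Python sketch of a masked softmax is shown below. The `masked_softmax` helper is hypothetical and only illustrates the principle; it does not show how the ActionMask transform is implemented or called.

```python
import math
from typing import List


def masked_softmax(logits: List[float], mask: List[bool]) -> List[float]:
    """Softmax over the logits, with zero probability for masked-out actions."""
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Invalid actions get a logit of -inf, which exponentiates to exactly 0.
    masked = [logit if ok else -math.inf for logit, ok in zip(logits, mask)]
    top = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(logit - top) for logit in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, with logits `[0.0, 0.0, 5.0]` and the last action masked out, the two remaining actions share the probability mass equally.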

VC1 is also integrated for better image embedding.

  • [Feature] Allow sequential transforms to work offline by @vmoens in #1136
  • [Feature] ClipTransform + rename min/maximum -> low/high by @vmoens in #1500
  • [Feature] End-of-life transform by @vmoens in #1605
  • [Feature] KL Transform for RLHF by @vmoens in #1196
  • [Features] Conv3dNet and PermuteTransform by @xmaples in #1398
  • [Feature, Refactor] Scale in ToTensorImage based on the dtype and new from_int parameter by @hyerra in #1208
  • [Feature] CatFrames used as inverse by @BY571 in #1321
  • [Feature] Masking actions by @vmoens in #1421
  • [Feature] VC1 integration by @vmoens in #1211

New models

We provide GRU alongside LSTM for POMDP training.

MARL model coverage is now richer, with the addition of MultiAgentMLP and MultiAgentCNN! Other improvements for MARL include coverage for nested keys in most places of the library (losses, data collection, environments...).
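Nested keys are tuples of strings that address an entry inside a nested container, for instance a per-agent observation stored under a group entry. The plain-dict sketch below mimics that lookup. The `get_nested` helper and the key names are illustrative assumptions, not library code.

```python
from typing import Any, Mapping, Tuple, Union

NestedKey = Union[str, Tuple[str, ...]]


def get_nested(data: Mapping[str, Any], key: NestedKey) -> Any:
    """Look up a plain string key or a tuple-of-strings nested key."""
    if isinstance(key, str):
        key = (key,)
    # Walk down one level of nesting per element of the key.
    for part in key:
        data = data[part]
    return data
```

With data such as `{"agents": {"observation": [1, 2]}, "done": False}`, the key `("agents", "observation")` reaches the per-agent entry while `"done"` reaches the top-level one.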

Other features (misc)

New environments and third-party improvements

We now cover SMAC-v2, PettingZoo, IsaacGymEnvs (prototype) and RoboHive. The D4RL dataset can now be used without the eponymous library, which permits training with more recent or older versions of gym.

Performance improvements

We provide several speed improvements, in particular for data collection.


Bug fixes

Miscellaneous

New Contributors

Many thanks to our contributors, in particular (in no particular order) @skandermoalla, @matteobettini, @BY571 and @albertbou92 for their tremendous dedication.

Full Changelog: v0.1.1...v0.2.0