This repository extends Isaac Lab with environments and training pipelines for Robotic World Model (RWM) and related model-based reinforcement learning methods.
It enables:
- joint training of policies and neural dynamics models in Isaac Lab (online),
- training of policies with learned neural network dynamics without any simulator (offline),
- evaluation of model-based vs. model-free policies,
- visualization of autoregressive imagination rollouts from learned dynamics,
- visualization of trained policies in Isaac Lab.
Authors: Chenhao Li, Andreas Krause, Marco Hutter
Affiliation: ETH AI Center, Learning & Adaptive Systems Group and Robotic Systems Lab, ETH Zurich
- Install Isaac Lab (not needed for offline policy training)
Follow the official installation guide. We recommend using the Conda installation as it simplifies calling Python scripts from the terminal.
- Install model-based RSL RL
Follow the official installation guide of model-based RSL RL to replace the rsl_rl_lib that ships with Isaac Lab.
- Clone this repository (outside your Isaac Lab directory)
```bash
git clone git@github.com:leggedrobotics/robotic_world_model.git
```
- Install the extension using the Python environment where Isaac Lab is installed
```bash
python -m pip install -e source/mbrl
```
- Verify the installation (not needed for offline policy training)
```bash
python scripts/reinforcement_learning/rsl_rl/train.py --task Template-Isaac-Velocity-Flat-Anymal-D-Init-v0 --headless
```
Robotic World Model is a model-based reinforcement learning algorithm that learns a dynamics model and a policy concurrently.
You can configure the model inputs and outputs under ObservationsCfg_PRETRAIN in AnymalDFlatEnvCfg_PRETRAIN.
Available components:
- SystemStateCfg: state input and output head
- SystemActionCfg: action input
- SystemExtensionCfg: continuous privileged output head (e.g. rewards)
- SystemContactCfg: binary privileged output head (e.g. contacts)
- SystemTerminationCfg: binary privileged output head (e.g. terminations)
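For orientation, the sketch below shows how these components might be composed into a configuration like ObservationsCfg_PRETRAIN. It is a minimal, hypothetical illustration assuming dataclass-style configs; the field and term names are placeholders, not the actual API of this extension.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the component configs listed above; the real
# classes live in the mbrl extension and their exact fields may differ.
@dataclass
class SystemStateCfg:        # state input and output head
    terms: tuple = ("base_lin_vel", "base_ang_vel", "joint_pos", "joint_vel")

@dataclass
class SystemActionCfg:       # action input
    terms: tuple = ("joint_pos_target",)

@dataclass
class SystemContactCfg:      # binary privileged output head (e.g. foot contacts)
    terms: tuple = ("feet_contact",)

@dataclass
class ObservationsCfgSketch:
    # The dynamics model maps the stacked state/action history to the next
    # state plus any configured privileged output heads.
    state: SystemStateCfg = field(default_factory=SystemStateCfg)
    action: SystemActionCfg = field(default_factory=SystemActionCfg)
    contact: SystemContactCfg = field(default_factory=SystemContactCfg)
```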
You can configure the model architecture and training hyperparameters under RslRlSystemDynamicsCfg and RslRlMbrlPpoAlgorithmCfg in AnymalDFlatPPOPretrainRunnerCfg.
Available options:
- ensemble_size: ensemble size for uncertainty estimation
- history_horizon: stacked history horizon
- architecture_config: architecture configuration
- system_dynamics_forecast_horizon: autoregressive prediction steps
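A minimal sketch of what such a dynamics configuration could look like is shown below; the values are illustrative only, not the defaults shipped in RslRlSystemDynamicsCfg.

```python
from dataclasses import dataclass, field

# Illustrative values only; consult RslRlSystemDynamicsCfg and
# AnymalDFlatPPOPretrainRunnerCfg in this repository for the real defaults.
@dataclass
class SystemDynamicsCfgSketch:
    ensemble_size: int = 5                      # ensemble members used for uncertainty estimation
    history_horizon: int = 8                    # number of stacked past state/action steps
    architecture_config: dict = field(default_factory=lambda: {"type": "rnn", "hidden_dims": [512, 512]})
    system_dynamics_forecast_horizon: int = 16  # autoregressive prediction steps during model training
```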
```bash
python scripts/reinforcement_learning/rsl_rl/train.py \
    --task Template-Isaac-Velocity-Flat-Anymal-D-Pretrain-v0 \
    --headless
```
This trains a PPO policy from scratch, while the experience collected during training is used to train the dynamics model.
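Conceptually, each training iteration interleaves on-policy PPO updates with supervised updates of the dynamics model on the same experience. The sketch below illustrates this loop under assumed interfaces; env, policy, ppo, dynamics_model, and replay_buffer are hypothetical stand-ins, not the repository's classes.

```python
def joint_training_iteration(env, policy, ppo, dynamics_model, replay_buffer, rollout_length=24):
    """Sketch of one outer iteration: collect experience with the current
    policy, update PPO on it, and reuse it to fit the dynamics model."""
    obs = env.reset()
    for _ in range(rollout_length):
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        ppo.store_transition(obs, action, reward, done)         # on-policy batch for PPO
        replay_buffer.add(obs, action, next_obs, reward, done)  # same data reused for the model
        obs = next_obs

    ppo.update(policy)                          # model-free PPO update
    dynamics_model.fit(replay_buffer.sample())  # supervised dynamics-model update
```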
```bash
python scripts/reinforcement_learning/rsl_rl/visualize.py \
    --task Template-Isaac-Velocity-Flat-Anymal-D-Visualize-v0 \
    --checkpoint <checkpoint_path> \
    --system_dynamics_load_path <dynamics_model_path>
```
This visualizes the learned dynamics model by rolling out the model autoregressively in imagination, conditioned on the actions from the learned policy.
The dynamics_model_path should point to the pretrained dynamics model checkpoint (e.g. model_<iteration>.pt) inside the saved run directory.
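In essence, the rollout feeds the model's own predictions back as inputs while querying the policy for actions. Below is a minimal PyTorch-style sketch under assumed single-step interfaces, not the repository's visualize implementation.

```python
import torch

@torch.no_grad()
def imagine_rollout(dynamics_model, policy, init_state, horizon=100):
    """Autoregressive rollout in imagination: the learned dynamics model
    replaces the simulator, conditioned on actions from the learned policy."""
    state = init_state
    states, actions = [state], []
    for _ in range(horizon):
        action = policy(state)                 # action from the learned policy
        state = dynamics_model(state, action)  # predicted next state is fed back in
        states.append(state)
        actions.append(action)
    return torch.stack(states), torch.stack(actions)
```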
Once a dynamics model is pretrained, you can train a model-based policy purely from imagined rollouts generated by the learned dynamics.
There are two options:
- Option 1: Train policy in imagination online, where additional environment interactions are continually collected using the latest policy to update the dynamics model (as implemented with RWM and MBPO-PPO in Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics).
- Option 2: Train policy in imagination offline, where no additional environment interactions are collected and the policy has to rely on the static dynamics model (as implemented with RWM-U and MOPO-PPO in Uncertainty-Aware Robotic World Model Makes Offline Model-Based Reinforcement Learning Work on Real Robots).
The online data collection relies on interactions with the environment and thus brings up the simulator.
```bash
python scripts/reinforcement_learning/rsl_rl/train.py --task Template-Isaac-Velocity-Flat-Anymal-D-Finetune-v0 --headless --checkpoint <checkpoint_path> --system_dynamics_load_path <dynamics_model_path>
```
You can start the policy either from a pretrained checkpoint or from scratch by simply omitting the --checkpoint argument.
The offline policy training does not collect any new data and thus relies solely on the static dynamics model.
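Because an offline policy can exploit model errors that are never corrected by new data, the MOPO-style approach cited above penalizes imagined rewards by the model's uncertainty, typically estimated from ensemble disagreement. A minimal sketch of that idea is shown below; it is illustrative and not necessarily how this repository computes the penalty.

```python
import torch

def uncertainty_penalized_reward(ensemble_next_states, reward, penalty_coef=1.0):
    """MOPO-style penalty: subtract a measure of ensemble disagreement so the
    policy avoids regions where the static dynamics model is unreliable.

    ensemble_next_states: (ensemble_size, batch, state_dim) predictions.
    reward:               (batch,) imagined reward.
    """
    disagreement = ensemble_next_states.std(dim=0).norm(dim=-1)  # (batch,)
    return reward - penalty_coef * disagreement
```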
Align the model architecture and specify the model load path under ModelArchitectureConfig in AnymalDFlatConfig.
Additionally, the offline imagination needs to branch off from some initial states. Specify the data path under DataConfig in AnymalDFlatConfig.
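A rough sketch of how these two configuration pieces might fit together follows; apart from the class names mentioned above, the field names, values, and paths are placeholders rather than the actual API.

```python
from dataclasses import dataclass, field

# Placeholder sketch; see AnymalDFlatConfig in this repository for the real fields.
@dataclass
class ModelArchitectureConfig:
    load_path: str = "<run_dir>/pretrain_rnn_ens.pt"   # pretrained RWM-U dynamics checkpoint
    ensemble_size: int = 5                             # must match the pretrained model
    history_horizon: int = 8

@dataclass
class DataConfig:
    data_path: str = "<data_dir>/state_action_data_0.csv"  # initial states to branch imagination from

@dataclass
class AnymalDFlatConfig:
    model: ModelArchitectureConfig = field(default_factory=ModelArchitectureConfig)
    data: DataConfig = field(default_factory=DataConfig)
```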
```bash
python scripts/reinforcement_learning/model_based/train.py --task anymal_d_flat
```
You can play the learned policies with the original Isaac Lab task registry.
```bash
python scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Velocity-Flat-Anymal-D-Play-v0 --checkpoint <checkpoint_path>
```
We provide a reference pipeline that enables RWM and RWM-U on ANYmal D.
Key files:
Online
- Environment configurations + dynamics model setup: flat_env_cfg.py
- Algorithm configuration + training parameters: rsl_rl_ppo_cfg.py
- Imagination rollout logic (constructs policy observations & rewards from model outputs): anymal_d_manager_based_mbrl_env
- Visualization environment + rollout reset: anymal_d_manager_based_visualize_env.py
Offline
- Environment configurations + imagination rollout logic (constructs policy observations & rewards from model outputs): anymal_d_flat.py
- Algorithm configuration + training parameters: anymal_d_flat_cfg.py
- Pretrained RWM-U checkpoint: pretrain_rnn_ens.pt
- Initial states for imagination rollout: state_action_data_0.csv
If you find this repository useful for your research, please consider citing:
```bibtex
@article{li2025robotic,
  title={Robotic world model: A neural network simulator for robust policy optimization in robotics},
  author={Li, Chenhao and Krause, Andreas and Hutter, Marco},
  journal={arXiv preprint arXiv:2501.10100},
  year={2025}
}

@article{li2025offline,
  title={Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator},
  author={Li, Chenhao and Krause, Andreas and Hutter, Marco},
  journal={arXiv preprint arXiv:2504.16680},
  year={2025}
}
```

