
Robotic World Model Extension for Isaac Lab


Overview

This repository extends Isaac Lab with environments and training pipelines for Robotic World Model (RWM), Uncertainty-Aware Robotic World Model (RWM-U), and related model-based reinforcement learning methods.

It enables:

  • joint training of policies and neural dynamics models in Isaac Lab (online),
  • training of policies with learned neural network dynamics without any simulator (offline),
  • evaluation of model-based vs. model-free policies,
  • visualization of autoregressive imagination rollouts from learned dynamics,
  • visualization of trained policies in Isaac Lab.

Robotic World Model

Paper: Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics
Project Page: https://sites.google.com/view/roboticworldmodel

Uncertainty-Aware Robotic World Model

Paper: Uncertainty-Aware Robotic World Model Makes Offline Model-Based Reinforcement Learning Work on Real Robots
Project Page: https://sites.google.com/view/uncertainty-aware-rwm

Authors: Chenhao Li, Andreas Krause, Marco Hutter
Affiliation: ETH AI Center, Learning & Adaptive Systems Group and Robotic Systems Lab, ETH Zurich


Installation

  1. Install Isaac Lab (not needed for offline policy training)

Follow the official installation guide. We recommend the Conda installation, as it simplifies calling Python scripts from the terminal.

  2. Install model-based RSL RL

Follow the official installation guide of model-based RSL RL to replace the rsl_rl_lib that ships with Isaac Lab.

  3. Clone this repository (outside your Isaac Lab directory)

git clone git@github.com:leggedrobotics/robotic_world_model.git

  4. Install the extension using the Python environment where Isaac Lab is installed

python -m pip install -e source/mbrl

  5. Verify the installation (not needed for offline policy training)

python scripts/reinforcement_learning/rsl_rl/train.py --task Template-Isaac-Velocity-Flat-Anymal-D-Init-v0 --headless

World Model Pretraining & Evaluation

Robotic World Model is a model-based reinforcement learning algorithm that learns a dynamics model and a policy concurrently.

Configure model inputs/outputs

You can configure the model inputs and outputs under ObservationsCfg_PRETRAIN in AnymalDFlatEnvCfg_PRETRAIN.

Available components:

  • SystemStateCfg: state input and output head
  • SystemActionCfg: action input
  • SystemExtensionCfg: continuous privileged output head (e.g. rewards etc.)
  • SystemContactCfg: binary privileged output head (e.g. contacts)
  • SystemTerminationCfg: binary privileged output head (e.g. terminations)
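
Putting the components above together, the pretraining observation config composes one entry per component. The sketch below uses plain Python dataclasses instead of the repository's config machinery, and the observation terms are illustrative assumptions; see ObservationsCfg_PRETRAIN in AnymalDFlatEnvCfg_PRETRAIN for the actual definitions.

from dataclasses import dataclass, field

# Illustrative stand-ins for the component configs listed above.
@dataclass
class SystemStateCfg:        # state input and output head
    terms: tuple = ("base_lin_vel", "base_ang_vel", "joint_pos", "joint_vel")

@dataclass
class SystemActionCfg:       # action input
    terms: tuple = ("joint_position_targets",)

@dataclass
class SystemExtensionCfg:    # continuous privileged output head (e.g. rewards)
    terms: tuple = ("reward",)

@dataclass
class SystemContactCfg:      # binary privileged output head (e.g. contacts)
    terms: tuple = ("feet_contact",)

@dataclass
class SystemTerminationCfg:  # binary privileged output head (e.g. terminations)
    terms: tuple = ("termination",)

@dataclass
class ObservationsCfg_PRETRAIN:
    state: SystemStateCfg = field(default_factory=SystemStateCfg)
    action: SystemActionCfg = field(default_factory=SystemActionCfg)
    extension: SystemExtensionCfg = field(default_factory=SystemExtensionCfg)
    contact: SystemContactCfg = field(default_factory=SystemContactCfg)
    termination: SystemTerminationCfg = field(default_factory=SystemTerminationCfg)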

You can configure the model architecture and training hyperparameters under RslRlSystemDynamicsCfg and RslRlMbrlPpoAlgorithmCfg in AnymalDFlatPPOPretrainRunnerCfg.

Available options:

  • ensemble_size: ensemble size for uncertainty estimation
  • history_horizon: stacked history horizon
  • architecture_config: architecture configuration
  • system_dynamics_forecast_horizon: autoregressive prediction steps
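
As a rough illustration of how these options fit together (the values below are placeholders, and the actual class definition lives in the runner config), the dynamics model config carries fields along these lines:

from dataclasses import dataclass, field

@dataclass
class ArchitectureConfig:
    # hypothetical architecture fields; see architecture_config in the repository
    hidden_dims: tuple = (512, 512)
    activation: str = "elu"

@dataclass
class RslRlSystemDynamicsCfg:
    ensemble_size: int = 5                      # ensemble members used for uncertainty estimation
    history_horizon: int = 8                    # number of stacked past steps fed to the model
    architecture_config: ArchitectureConfig = field(default_factory=ArchitectureConfig)
    system_dynamics_forecast_horizon: int = 16  # autoregressive prediction steps during training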

Run dynamics model pretraining:

python scripts/reinforcement_learning/rsl_rl/train.py \
  --task Template-Isaac-Velocity-Flat-Anymal-D-Pretrain-v0 \
  --headless

This trains a PPO policy from scratch, while the experience collected along the way is used to train the dynamics model.
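
Schematically, each training iteration interleaves on-policy data collection, a PPO update, and a supervised dynamics-model update. The sketch below is illustrative only; the callables it takes are stand-ins, not the repository's API.

def joint_training_iteration(policy, dynamics_model, replay_buffer,
                             collect_rollouts, ppo_update, dynamics_update):
    """One schematic iteration of joint policy/model training (illustrative)."""
    # 1. Roll out the current PPO policy in the simulator and record transitions.
    transitions = collect_rollouts(policy)

    # 2. PPO update of the policy on the freshly collected experience.
    ppo_update(policy, transitions)

    # 3. Supervised update of the dynamics model on all experience gathered so far.
    replay_buffer.extend(transitions)
    dynamics_update(dynamics_model, replay_buffer)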

Visualize autoregressive predictions

python scripts/reinforcement_learning/rsl_rl/visualize.py \
  --task Template-Isaac-Velocity-Flat-Anymal-D-Visualize-v0 \
  --checkpoint <checkpoint_path> \
  --system_dynamics_load_path <dynamics_model_path>

This visualizes the learned dynamics model by rolling it out autoregressively in imagination, conditioned on actions from the learned policy. The dynamics model path (--system_dynamics_load_path) should point to the pretrained dynamics model checkpoint (e.g. model_<iteration>.pt) inside the saved run directory.
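
Conceptually, the rollout feeds each predicted state back into the model as the input for the next step. The sketch below is schematic rather than the repository's rollout code; policy and dynamics_model are stand-ins for the loaded checkpoints.

import torch

def imagine_rollout(dynamics_model, policy, init_state, horizon=100):
    """Autoregressive imagination rollout (schematic)."""
    state = init_state
    states = [state]
    for _ in range(horizon):
        with torch.no_grad():
            action = policy(state)                 # actions from the learned policy
            state = dynamics_model(state, action)  # predicted next state, fed back in the next step
        states.append(state)
    return torch.stack(states)

Because errors compound over the autoregressive horizon, these rollouts serve as a qualitative check of how long the learned dynamics remain physically plausible.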


Model-Based Policy Training & Evaluation

Once a dynamics model is pretrained, you can train a model-based policy purely from imagined rollouts generated by the learned dynamics.

There are two options:

Option 1: Train policy in imagination online

Online data collection interacts with the environment and therefore launches the simulator.

python scripts/reinforcement_learning/rsl_rl/train.py \
  --task Template-Isaac-Velocity-Flat-Anymal-D-Finetune-v0 \
  --headless \
  --checkpoint <checkpoint_path> \
  --system_dynamics_load_path <dynamics_model_path>

You can either start the policy from a pretrained checkpoint or train it from scratch by simply omitting the --checkpoint argument.

Option 2: Train policy in imagination offline

Offline policy training does not collect any new data and thus relies solely on the static, pretrained dynamics model. Align the model architecture and specify the model load path under ModelArchitectureConfig in AnymalDFlatConfig.

Additionally, the offline imagination needs to branch off from some initial states. Specify the data path under DataConfig in AnymalDFlatConfig.
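
In other words, the offline entry point needs a model specification that matches the pretrained checkpoint and a dataset of initial states to branch from. The field names and paths below are hypothetical; consult ModelArchitectureConfig and DataConfig in AnymalDFlatConfig for the actual ones.

from dataclasses import dataclass, field

@dataclass
class ModelArchitectureConfig:
    # must mirror the architecture used when the dynamics model was pretrained
    ensemble_size: int = 5
    history_horizon: int = 8
    load_path: str = "logs/rsl_rl/anymal_d_flat_pretrain/model_1500.pt"  # hypothetical

@dataclass
class DataConfig:
    # recorded states that offline imagination rollouts branch off from
    data_path: str = "data/anymal_d_flat/initial_states.pt"  # hypothetical

@dataclass
class AnymalDFlatConfig:
    model: ModelArchitectureConfig = field(default_factory=ModelArchitectureConfig)
    data: DataConfig = field(default_factory=DataConfig)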

python scripts/reinforcement_learning/model_based/train.py --task anymal_d_flat

Play the learned model-based policy

You can play the learned policies with the original Isaac Lab task registry.

python scripts/reinforcement_learning/rsl_rl/play.py \
  --task Isaac-Velocity-Flat-Anymal-D-Play-v0 \
  --checkpoint <checkpoint_path>

Code Structure

We provide a reference pipeline that enables RWM and RWM-U on ANYmal D.

Key files:

Online

Offline


Citation

If you find this repository useful for your research, please consider citing:

@article{li2025robotic,
  title={Robotic world model: A neural network simulator for robust policy optimization in robotics},
  author={Li, Chenhao and Krause, Andreas and Hutter, Marco},
  journal={arXiv preprint arXiv:2501.10100},
  year={2025}
}
@article{li2025offline,
  title={Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator},
  author={Li, Chenhao and Krause, Andreas and Hutter, Marco},
  journal={arXiv preprint arXiv:2504.16680},
  year={2025}
}
