Learning to drive smoothly in minutes, using a reinforcement learning algorithm -- Soft Actor-Critic (SAC) -- and a Variational AutoEncoder (VAE) in the Donkey Car simulator.
Blog post on Medium: link
Level-0 | Level-1 |
---|---|
Download VAE | Download VAE |
Download pretrained agent | Download pretrained agent |
Note: the pretrained agents must be saved in the logs/sac/ folder (you need to pass --exp-id 6, the index of the folder, to use the pretrained agent).
- Download the simulator here or build it from source
- Install dependencies (see requirements.txt)
- (optional but recommended) Download a pre-trained VAE: VAE Level 0, VAE Level 1
- Train a control policy for 5000 steps using Soft Actor-Critic (SAC)
python train.py --algo sac -vae path-to-vae.pkl -n 5000
- Enjoy trained agent for 2000 steps
python enjoy.py --algo sac -vae path-to-vae.pkl --exp-id 0 -n 2000
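The two commands above can also be chained from a script. The following is a minimal sketch, assuming it is run from the repository root; `build_commands` and `run_all` are hypothetical helpers (not part of the repository), and only flags that appear in the commands above are used.

```python
import subprocess
import sys


def build_commands(vae_path, train_steps=5000, enjoy_steps=2000, exp_id=0):
    """Return the train and enjoy commands as argument lists.

    Hypothetical helper: it only reuses the flags shown in the README commands.
    """
    train = [sys.executable, "train.py", "--algo", "sac",
             "-vae", vae_path, "-n", str(train_steps)]
    enjoy = [sys.executable, "enjoy.py", "--algo", "sac",
             "-vae", vae_path, "--exp-id", str(exp_id), "-n", str(enjoy_steps)]
    return train, enjoy


def run_all(vae_path):
    """Run training, then the enjoy script; stop at the first failing step."""
    for cmd in build_commands(vae_path):
        subprocess.run(cmd, check=True)
```

Calling `run_all("path-to-vae.pkl")` would then train for 5000 steps and replay the agent for 2000 steps, provided the simulator is already running.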
To train on a different level, change LEVEL = 0 to LEVEL = 1 in config.py.
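Editing the file by hand is enough, but the switch can also be scripted. The sketch below assumes config.py contains exactly one top-level line of the form `LEVEL = <integer>`; `set_level` is a hypothetical helper, not something the repository provides.

```python
import re


def set_level(config_text, level):
    """Return config_text with its single `LEVEL = <int>` line replaced.

    Assumption: the assignment starts at the beginning of a line and
    appears exactly once in the file.
    """
    new_text, count = re.subn(
        r"^LEVEL\s*=\s*\d+", f"LEVEL = {int(level)}", config_text, flags=re.MULTILINE
    )
    if count != 1:
        raise ValueError(f"expected exactly one LEVEL assignment, found {count}")
    return new_text
```

To apply it, read config.py with `pathlib.Path("config.py").read_text()`, pass the text through `set_level(text, 1)`, and write the result back with `write_text`.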
- Collect images using the teleoperation mode:
python -m teleop.teleop_client --record-folder path-to-record/folder/
- Train a VAE:
python -m vae.train --n-epochs 50 --verbose 0 --z-size 64 -f path-to-record/folder/
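For readers unfamiliar with VAEs, the sketch below illustrates what the latent size means, assuming `--z-size` sets the dimension of the latent vector that each camera image is encoded into. It shows the standard reparameterization step used when sampling from a VAE encoder, in plain Python; `sample_latent` is a hypothetical name and this is not the repository's implementation.

```python
import math
import random


def sample_latent(mu, logvar, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).

    mu and logvar are the per-dimension mean and log-variance produced by
    the encoder; both have length z_size.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


z_size = 64  # matches --z-size 64 in the command above
mu = [0.0] * z_size
logvar = [0.0] * z_size  # log-variance 0 means unit variance
z = sample_latent(mu, logvar, random.Random(0))
print(len(z))
```

The control policy then works on this compact vector instead of raw pixels, which is one reason training can finish in minutes.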
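- Train a control policy in teleoperation mode:
python train.py --algo sac -vae logs/vae.pkl -n 5000 --teleop
- Run the teleoperation client with a trained agent:
python -m teleop.teleop_client --algo sac -vae logs/vae.pkl --exp-id 0
- Explore the VAE latent space:
python -m vae.enjoy_latent -vae logs/level-0/vae-8.pkl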
To reproduce the results shown in the video, check the values listed below in config.py.
config.py:
MAX_STEERING_DIFF = 0.15 # 0.1 for very smooth control, but it requires more steps
MAX_THROTTLE = 0.6 # MAX_THROTTLE = 0.5 is fine, but we can go faster
MAX_CTE_ERROR = 2.0 # only used in normal mode, set it to 10.0 when using teleoperation mode
LEVEL = 0
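To illustrate why a smaller MAX_STEERING_DIFF gives smoother control, here is a minimal sketch. It assumes the parameter caps how far the steering command may move between two consecutive steps, which the comment above suggests; the actual implementation in the repository may differ, and `smooth_steering` is a hypothetical name.

```python
MAX_STEERING_DIFF = 0.15  # value from config.py above


def smooth_steering(prev_steering, requested_steering, max_diff=MAX_STEERING_DIFF):
    """Limit the change between consecutive steering commands to max_diff."""
    diff = requested_steering - prev_steering
    # Clamp the requested change to the interval [-max_diff, max_diff]
    diff = max(-max_diff, min(max_diff, diff))
    return prev_steering + diff
```

With a cap of 0.1 instead of 0.15, the agent needs more steps to reach a given steering angle, which is consistent with the longer training time noted in the comment.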
Train in normal mode (smooth control); this takes ~5-10 minutes:
python train.py --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl
Train in normal mode (very smooth control with MAX_STEERING_DIFF = 0.1); this takes ~20 minutes:
python train.py --algo sac -n 20000 -vae logs/vae-level-0-dim-32.pkl
Train in teleoperation mode (MAX_CTE_ERROR = 10.0); this takes ~5-10 minutes:
python train.py --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl --teleop
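The effect of the two MAX_CTE_ERROR settings can be pictured with a small sketch. It assumes that CTE stands for cross-track error (the car's lateral distance from the track centre) and that an episode ends once its absolute value exceeds MAX_CTE_ERROR; `is_episode_over` is a hypothetical function, not the repository's code.

```python
def is_episode_over(cte, max_cte_error):
    """Return True when the absolute cross-track error exceeds the threshold."""
    return abs(cte) > max_cte_error


# The same deviation ends the episode in normal mode (threshold 2.0)
# but not with the looser teleoperation setting (threshold 10.0).
print(is_episode_over(2.5, 2.0))
print(is_episode_over(2.5, 10.0))
```

Under these assumptions, the looser threshold in teleoperation mode leaves the decision to end an episode largely to the human operator.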
Note: only teleoperation mode is available for level 1
config.py:
MAX_STEERING_DIFF = 0.15
MAX_THROTTLE = 0.5 # MAX_THROTTLE = 0.6 can work but it's harder to train due to the sharpest turn
LEVEL = 1
Train in teleoperation mode; this takes ~10 minutes:
python train.py --algo sac -n 15000 -vae logs/vae-level-1-dim-64.pkl --teleop
Note: although the size of the VAE is different between level 0 and 1, this is not an important factor.
Related Paper: "Learning to Drive in a Day".
- r7vme, author of the original implementation.
- Wayve.ai for idea and inspiration.
- Tawn Kramer for Donkey simulator and Donkey Gym.
- Stable-Baselines for DDPG/SAC and PPO implementations.
- RL Baselines Zoo for training/enjoy scripts.
- S-RL Toolbox for the data loader.
- Racing robot for the teleoperation.
- World Models Experiments for VAE implementation.