MLC @ UVA DDPG Baseline


A PyTorch implementation of Deep Deterministic Policy Gradient (DDPG) for simple continuous control tasks.

More info: "Continuous control with deep reinforcement learning" (Lillicrap et al.), the paper that introduced DDPG.

Pretrained Agents

Watch a pretrained agent on Pendulum-v0:

python run.py --env Pendulum-v0 --agent saves/pretrained_pendulum --episodes 10

or on MountainCarContinuous-v0:

python run.py --env MountainCarContinuous-v0 --agent saves/pretrained_mountaincar --episodes 10

Train Agents

python ddpg.py

There are many command-line flags. See the bottom of ddpg.py for the full list; the most important ones are:

  • --env is the gym environment id. Options are MountainCarContinuous-v0 and Pendulum-v0.
  • --num_episodes is how many episodes of experience to collect during training. Defaults to 500.
  • --batch_size is how many sampled transitions are passed through the networks at once during training. Defaults to 128. This may need to be reduced when training on a CPU.
  • --render is either 0 or 1. Setting it to 1 renders the environment so you can watch the agent as it learns, which slows training down.
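
For orientation, here is a minimal sketch of how flags like the ones above could be declared with argparse. It is not copied from ddpg.py, and the real definitions there may differ. The flag names and the 500 and 128 defaults come from this README; the function name build_parser, the argument types, and the defaults for --env and --render are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags described in this README."""
    parser = argparse.ArgumentParser(description="Train a DDPG agent")
    # Assumed default: the README lists two options but names no default.
    parser.add_argument(
        "--env",
        type=str,
        default="Pendulum-v0",
        choices=["Pendulum-v0", "MountainCarContinuous-v0"],
        help="gym environment id",
    )
    # Default of 500 is stated in the README.
    parser.add_argument(
        "--num_episodes",
        type=int,
        default=500,
        help="episodes of experience to collect during training",
    )
    # Default of 128 is stated in the README.
    parser.add_argument(
        "--batch_size",
        type=int,
        default=128,
        help="transitions per training batch",
    )
    # Assumed default of 0 (no rendering); the README only says 0 or 1.
    parser.add_argument(
        "--render",
        type=int,
        default=0,
        choices=[0, 1],
        help="1 to render the environment while training",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# runs the same way in any context.
args = build_parser().parse_args(
    ["--env", "MountainCarContinuous-v0", "--batch_size", "64"]
)
print(vars(args))
```

With the flags this README documents, the corresponding training command would be python ddpg.py --env MountainCarContinuous-v0 --batch_size 64, leaving --num_episodes at its default of 500.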