A PyTorch implementation of Deep Deterministic Policy Gradient (DDPG).

Dependencies

Python 3.6
PyTorch 0.4+
tensorboard
gym

Train

main.py --train --env MountainCarContinuous-v0 --cuda

Parameters:

Parameter	Description
--train	train the model
--test	test the model
--retrain	retrain an existing model
--retrain_model	path of the model to retrain
--env	gym environment name
--episodes	number of training episodes
--eps_decay	noise epsilon decay
--cuda	use CUDA
--model_path	path of the model to load in test mode
--record	record video
--record_ep_interval	interval (in episodes) between recordings
--checkpoint	use model checkpoints
--checkpoint_interval	checkpoint interval

(For more parameters, see the source file.)
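
For readers who want to see how flags like these are typically wired up, here is a minimal sketch using Python's standard argparse module. It is not the parser from this repository: the function name build_parser, the defaults, and the argument types are assumptions chosen for illustration, and only a subset of the flags is shown.

```python
import argparse


def build_parser():
    """Hypothetical parser covering a subset of the flags listed above."""
    parser = argparse.ArgumentParser(description="DDPG (illustrative flag parsing)")
    # Boolean switches: present on the command line means True.
    parser.add_argument("--train", action="store_true", help="train the model")
    parser.add_argument("--test", action="store_true", help="test the model")
    parser.add_argument("--cuda", action="store_true", help="use CUDA")
    # Valued options; the defaults here are assumptions, not the repository's.
    parser.add_argument("--env", type=str, default="Pendulum-v0",
                        help="gym environment name")
    parser.add_argument("--episodes", type=int, default=200,
                        help="number of training episodes")
    parser.add_argument("--eps_decay", type=float, default=0.001,
                        help="noise epsilon decay")
    parser.add_argument("--model_path", type=str, default=None,
                        help="path of the model to load in test mode")
    return parser


# Parse the same arguments as the training command shown earlier.
args = build_parser().parse_args(
    ["--train", "--env", "MountainCarContinuous-v0", "--cuda"]
)
print(args.train, args.test, args.env, args.cuda)
```

Passing an explicit list to parse_args, as done here, makes the sketch runnable without a command line; a real entry point would usually call parse_args() with no arguments so that sys.argv is used.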

You can use TensorBoard to monitor training:

tensorboard --logdir=out/MountainCarContinuous-v0

Test

You can test your model with --test like this:

main.py --test --env MountainCarContinuous-v0 --model_path out/MountainCarContinuous-v0-run0

This renders the environment in a graphical window.

Result

Parameter tuning turns out to be very important, especially eps_decay. I use a simple linear noise decay: epsilon -= eps_decay after every episode.
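
The linear decay rule can be sketched as follows. This is an illustration rather than the code from this repository: the helper name linear_decay, the starting value of 1.0, and the lower bound of 0.0 are assumptions; only the update epsilon -= eps_decay comes from the description above.

```python
def linear_decay(epsilon, eps_decay, eps_min=0.0):
    """Subtract eps_decay from epsilon, never going below eps_min (assumed floor)."""
    return max(eps_min, epsilon - eps_decay)


epsilon = 1.0      # assumed initial noise scale
eps_decay = 0.01   # value used for Pendulum-v0 in the command below
history = []

for episode in range(150):
    history.append(epsilon)  # epsilon in effect during this episode
    # ... run one episode here, scaling the exploration noise by epsilon ...
    epsilon = linear_decay(epsilon, eps_decay)

print(history[0], round(history[50], 2), epsilon)
```

Under these assumptions, eps_decay = 0.01 removes the exploration noise after roughly 100 episodes, while eps_decay = 0.001 takes roughly 1000, which is one way to see why the setting matters so much per environment.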

Pendulum-v0

main.py --train --env Pendulum-v0 --cuda --eps_decay 0.01

MountainCarContinuous-v0

main.py --train --env MountainCarContinuous-v0 --cuda --eps_decay 0.001

Reference

Paper: Continuous control with deep reinforcement learning
