Hybrid deep reinforcement learning: combining the best of gradient-based and gradient-free methods (NYU Shanghai DURF 2018)
This repository contains my sophomore-year research project on deep reinforcement learning at NYU Shanghai, advised by Prof. Keith Ross and supported by the NYU Shanghai Dean's Undergraduate Research Fund. In this project, I experimented with combining policy gradient methods, including vanilla Policy Gradient (a.k.a. REINFORCE), Actor-Critic, and Proximal Policy Optimization (PPO), with Evolution Strategies (ES) to develop a hybrid algorithm with improved sample efficiency. The performance of the proposed algorithms was evaluated on MuJoCo benchmarks.
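To illustrate the core idea, below is a minimal, self-contained sketch of one possible hybrid update: a Salimans-style ES gradient estimate is blended with a REINFORCE estimate through a mixing coefficient. The toy one-step environment, the linear Gaussian policy, the blend coefficient `alpha`, and all hyperparameters are illustrative assumptions for this sketch, not the exact algorithm evaluated in this project.

```python
import numpy as np

# Toy one-step environment: state s ~ N(0, 1), reward = -(a - 2s)^2,
# so the optimal linear policy is a = 2s. An illustrative stand-in for a MuJoCo task.
def rollout(theta, rng, sigma_a=0.1):
    s = rng.standard_normal()
    noise = sigma_a * rng.standard_normal()  # Gaussian action noise for REINFORCE
    a = theta * s + noise
    reward = -(a - 2.0 * s) ** 2
    return s, noise, reward

def es_gradient(theta, rng, n=50, sigma=0.1):
    """Salimans-style ES estimate: g ~ (1/(n*sigma)) * sum_i R(theta + sigma*eps_i) * eps_i."""
    eps = rng.standard_normal(n)
    returns = np.array([rollout(theta + sigma * e, rng)[2] for e in eps])
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # fitness shaping
    return float(returns @ eps) / (n * sigma)

def pg_gradient(theta, rng, n=50, sigma_a=0.1):
    """REINFORCE estimate for a ~ N(theta*s, sigma_a^2): grad log pi = noise*s/sigma_a^2."""
    grads, returns = [], []
    for _ in range(n):
        s, noise, reward = rollout(theta, rng, sigma_a)
        grads.append(noise * s / sigma_a ** 2)
        returns.append(reward)
    baseline = np.mean(returns)  # constant baseline for variance reduction
    return float(np.mean([(r - baseline) * g for r, g in zip(returns, grads)]))

rng = np.random.default_rng(0)
theta, lr, alpha = 0.0, 0.05, 0.5  # alpha blends the ES and PG estimates
for _ in range(200):
    g = alpha * es_gradient(theta, rng) + (1.0 - alpha) * pg_gradient(theta, rng)
    theta += lr * g  # gradient ascent on expected reward
print(f"learned theta = {theta:.3f} (optimum is 2.0)")
```

The motivation for blending: the ES term uses only scalar returns of perturbed parameters (gradient-free), while the REINFORCE term exploits the log-likelihood gradient of sampled actions (gradient-based), so weighting the two trades off their different bias and variance profiles.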
References:
- REINFORCE: Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992.
- Actor-Critic: Richard S Sutton, David A McAllester, Satinder P Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems, pages 1057–1063, 2000.
- PPO: John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
- Evolution Strategy: Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.
- MuJoCo: Emanuel Todorov, Tom Erez, and Yuval Tassa. MuJoCo: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5026–5033, 2012. https://ieeexplore.ieee.org/document/6386109