Combining Reinforcement Learning with Deep Learning has produced a series of important algorithms. This project follows the relevant papers and implements as many of these algorithms as possible.
This repo aims to implement Deep Reinforcement Learning algorithms using Pytorch and Tensorflow 2.
- Implementing all of these algorithms from scratch really helps with parameter tuning;
- The coding process allows you to better understand the principles of the algorithm.
Value-based algorithms include the DQN family.
[1]. DQN Pytorch / Tensorflow, Paper: Playing Atari with Deep Reinforcement Learning
[2]. Double DQN Pytorch / Tensorflow, Paper: Deep Reinforcement Learning with Double Q-learning
[3]. Dueling DQN Pytorch / Tensorflow, Paper: Dueling Network Architectures for Deep Reinforcement Learning
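The three variants above differ mainly in how the bootstrap target and the Q-values are formed. The following is a minimal pure-Python sketch of those differences, not the code in this repo: the function names and toy numbers are hypothetical, and the DQN target assumes the common variant that uses a separate target network.

```python
from typing import List, Sequence


def dqn_target(reward: float, done: bool, gamma: float,
               q_target_next: Sequence[float]) -> float:
    """DQN target (target-network variant, an assumption of this sketch):
    the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def dueling_q_values(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


if __name__ == "__main__":
    # Toy next-state Q-values for three actions (made-up numbers).
    q_online_next = [1.0, 2.0, 0.5]
    q_target_next = [0.8, 1.5, 3.0]
    print(dqn_target(1.0, False, 0.9, q_target_next))
    print(double_dqn_target(1.0, False, 0.9, q_online_next, q_target_next))
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
```

With these toy numbers the DQN target is about 3.7 while the Double DQN target is about 2.35, because the online network picks action 1 and the target network scores that action lower than its own maximum.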
Policy-based algorithms currently tend to perform better; they include Policy Gradient Methods.
[1]. REINFORCE Pytorch / Tensorflow, Paper: Policy Gradient Methods for Reinforcement Learning with Function Approximation
[2]. VPG (Vanilla Policy Gradient) Pytorch / Tensorflow, Paper: High Dimensional Continuous Control Using Generalized Advantage Estimation
[3]. A2C Pytorch, Paper: Asynchronous Methods for Deep Reinforcement Learning (A2C is the synchronous version of A3C)
[4]. DDPG Pytorch, Paper: Continuous Control With Deep Reinforcement Learning
[5]. TRPO Pytorch / Tensorflow, Paper: Trust Region Policy Optimization
[6]. PPO Pytorch / Tensorflow, Paper: Proximal Policy Optimization Algorithms
[7]. SAC Pytorch, Paper: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[8]. SAC with Automatically Adjusted Temperature Pytorch, Paper: Soft Actor-Critic Algorithms and Applications
[9]. TD3 (Twin Delayed DDPG) Pytorch, Paper: Addressing Function Approximation Error in Actor-Critic Methods
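Two building blocks recur across the policy gradient methods listed above: discounted returns-to-go (used by REINFORCE and VPG) and the clipped surrogate objective from the PPO paper. The sketch below illustrates both in plain Python for a single trajectory and a single sample. It is not taken from this repo; the function names and default values are hypothetical.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def ppo_clip_objective(ratio: float, advantage: float,
                       clip_eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate:
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
    # A large ratio with a positive advantage gets clipped ...
    print(ppo_clip_objective(ratio=1.5, advantage=2.0))
    # ... but with a negative advantage the unclipped (more pessimistic) term wins.
    print(ppo_clip_objective(ratio=1.5, advantage=-1.0))
```

In a real implementation the objective would be averaged over a batch and maximized with an autograd framework; the scalar version here only shows the clipping logic.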
Imitation learning learns from expert data.
[1]. GAIL Pytorch, Paper: Generative Adversarial Imitation Learning
- Python >= 3.6
- Tensorflow >= 2.4.0
- Pytorch >= 1.5.0
- Seaborn >= 0.10.0
- Click >= 7.0
The full dependencies are listed in the requirements.txt file; install them with pip:
pip install -r requirements.txt
You can install the project by typing the following command:
pip install -e .
Each algorithm is implemented in its own folder containing 4 files:
1. main.py # A minimal executable example for the algorithm
2. [algorithm].py # Main body of the algorithm implementation
3. [algorithm]_step.py # Core update step of the algorithm
4. test.py # Loads a pretrained model and tests the performance of the algorithm
The default main.py is an executable example; its parameters are parsed by click.
You can run an algorithm from main.py or from the bash scripts.
- You can simply type python main.py --help in the algorithm package to view all configurable parameters.
- The directory Scripts contains some bash scripts, which you can modify at will.
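For readers unfamiliar with click, the sketch below shows roughly how a main.py can expose its parameters so that --help lists them. It is an illustration only: the option names (--env_id, --lr, --max_iter, --seed) and their defaults are hypothetical and may not match the options in this repo. It requires click to be installed.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--env_id", type=str, default="CartPole-v1",
              help="Environment id (hypothetical option)")
@click.option("--lr", type=float, default=3e-4,
              help="Learning rate (hypothetical option)")
@click.option("--max_iter", type=int, default=100,
              help="Number of training iterations (hypothetical option)")
@click.option("--seed", type=int, default=1,
              help="Random seed (hypothetical option)")
def main(env_id, lr, max_iter, seed):
    """Train an agent (placeholder: only echoes the parsed parameters)."""
    click.echo(f"env_id={env_id} lr={lr} max_iter={max_iter} seed={seed}")


if __name__ == "__main__":
    # A real main.py would simply call main() here and read sys.argv.
    # CliRunner is used instead so the sketch can be run in-process.
    runner = CliRunner()
    print(runner.invoke(main, ["--help"]).output)
    print(runner.invoke(main, ["--lr", "0.001", "--max_iter", "5"]).output)
```

Each click.option becomes a keyword argument of the decorated function, and its help string is what --help displays.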
Utils/plot_util.py provides a simple plotting tool based on Seaborn and Matplotlib.
All the plots in this project are drawn with this tool.
Currently only VPG, PPO and TRPO are available: