A general framework for reinforcement learning algorithms, implemented for experiments at AILab, Peking University.
- Universal
- Easy to use / edit
- Documented
- DDPG (Deep Deterministic Policy Gradient), ICLR 2016 (https://arxiv.org/abs/1509.02971)
- DQN (Deep Q-Network), Nature 2015 (https://www.nature.com/articles/nature14236)
- TRPO (Trust Region Policy Optimization), ICML 2015 (https://arxiv.org/abs/1502.05477)
- PPO (Proximal Policy Optimization), arXiv 2017 (https://arxiv.org/abs/1707.06347)
- some other algorithms...
- Design an easy-to-use mode for the lab's experimental needs
- Evaluate the PyTorch and TensorFlow implementations of DDPG, and write a short, accessible document for DDPG
- Implement the basic DQN published in Nature
- Read the TRPO and PPO papers before implementing them
- Consider which other algorithms should be added to this repo
- some other todos...