
Materials

Slides
Video lecture by D. Silver - video
Our lecture, seminar (PyTorch)
Alternative lecture by J. Schulman part 1 - video
Alternative lecture by J. Schulman part 2 - video
Andrej Karpathy's post on policy gradients

More materials

Actually proving the policy gradient for discounted rewards - article
On variance of policy gradient and optimal baselines: article, another article
Learn Advantage Actor Critic with a comic
Generalizing log-derivative trick - url
Combining policy gradient and q-learning - arxiv
Variational perspective on reinforcement learning (from DeepBayes) - pdf
Adversarial review of policy gradient - blog

Run seminar notebook in Colab:

Run optional homework notebook in Colab: