- Slides - here
- [main] David Silver lecture on exploration and exploitation - video
- Alternative lecture by J. Schulman - video
- Alternative lecture by N. de Freitas (with Bayesian optimization) - video
- Our lectures (in Russian)
- Gittins Index - a less heuristic approach to bandit exploration - article
- "Deep" version: variational information maximizing exploration - video
- Same topics in Russian - video
- Lecture covering intrinsically motivated reinforcement learning - video
- Very interesting blog post by Lilian Weng that summarises this week's materials: The Multi-Armed Bandit Problem and Its Solutions
In this seminar, you'll solve basic and contextual bandits using uncertainty-based exploration strategies such as Bayesian UCB and Thompson Sampling. You will also get acquainted with Bayesian neural networks.
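To give a feel for what the homework asks, below is a minimal sketch of these two strategies on a Bernoulli multi-armed bandit. It is not the notebook's implementation: the `BernoulliBandit` class, the `run` loop, the uniform Beta(1, 1) prior, the 0.95 quantile, and the arm probabilities are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta as beta_dist


class BernoulliBandit:
    """K arms; pulling arm k pays 1 with hidden probability probs[k], else 0."""

    def __init__(self, probs):
        self.probs = np.asarray(probs)

    def pull(self, k):
        return float(np.random.rand() < self.probs[k])


def run(bandit, choose_arm, n_steps=10_000):
    """Shared loop: keep a Beta(a, b) posterior per arm, act via choose_arm(a, b)."""
    a = np.ones(len(bandit.probs))  # successes + 1 (uniform Beta(1, 1) prior)
    b = np.ones(len(bandit.probs))  # failures + 1
    total_reward = 0.0
    for _ in range(n_steps):
        arm = int(np.argmax(choose_arm(a, b)))  # greedy w.r.t. the per-arm score
        r = bandit.pull(arm)
        a[arm] += r            # Bernoulli likelihood keeps the posterior Beta
        b[arm] += 1.0 - r
        total_reward += r
    return total_reward


def thompson(a, b):
    # Thompson Sampling: score each arm with one draw from its Beta posterior.
    return np.random.beta(a, b)


def bayes_ucb(a, b, quantile=0.95):
    # Bayesian UCB: score each arm with an upper posterior quantile.
    return beta_dist.ppf(quantile, a, b)


bandit = BernoulliBandit([0.3, 0.5, 0.7])  # made-up payout probabilities
print("Thompson Sampling total reward:", run(bandit, thompson))
print("Bayesian UCB total reward:", run(bandit, bayes_ucb))
```

Note that the two strategies differ only in the per-arm statistic: a posterior sample for Thompson Sampling versus an upper posterior quantile for Bayesian UCB; both explore arms whose posteriors are still wide and settle on the best arm as uncertainty shrinks.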
Everything else is in the notebook :)