Model-based:
Markov Decision Process Model, Policy Iteration, Policy Improvement, Value Iteration Algorithm, and Maze MDP Example
Model-free:
Monte Carlo method, epsilon-greedy policy exploration, on-policy and off-policy learning
Model-free:
temporal-difference policy evaluation, greedy policy exploration, SARSA, Q-learning, and SARSA(λ)
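
As a rough illustration of how the two temporal-difference control methods listed above differ, the sketch below writes the SARSA and Q-learning updates side by side. It assumes a tabular Q-function stored as a list of lists; the function names, the `make_q` helper, the toy table, and the default `alpha` and `gamma` values are hypothetical and chosen only for this example.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy (max-value) action in s_next."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def make_q():
    # Hypothetical table: 2 states x 2 actions, indexed as q[state][action].
    return [[0.0, 0.0], [1.0, 2.0]]


if __name__ == "__main__":
    q_sarsa = make_q()
    # Transition: state 0, action 0, reward 1.0, next state 1, next action 0.
    sarsa_update(q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0)

    q_ql = make_q()
    # Same transition; Q-learning ignores which action is taken next.
    q_learning_update(q_ql, s=0, a=0, r=1.0, s_next=1)

    print(q_sarsa[0][0], q_ql[0][0])
```

The only difference between the two functions is the bootstrap target: SARSA uses the value of the next action chosen by the behavior policy, while Q-learning uses the maximum over next actions regardless of what is actually executed.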