rl-mdatos

This repository contains my final project for the Data Mining course (Minería de Datos in Spanish, hence mdatos), taught in the Master's Degree in Systems and Control Engineering at UNED (Universidad Nacional de Educación a Distancia) and UCM (Universidad Complutense de Madrid), Spain.

It implements several tabular Reinforcement Learning algorithms and applies them to OpenAI Gym environments. The following table shows which algorithms are implemented for each environment:

Environment	Sarsa	Q-Learning	n-step Sarsa	Dyna-Q
NChain-v0	✔️	✔️	✔️	✔️
FrozenLake-v0	✔️	✔️	✔️	✔️
CartPole-v0	✔️	✔️	✔️	✖️
MountainCar-v0	✔️	✔️	✔️	✖️
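To make the table more concrete, here is a minimal sketch of the one-step update rules behind two of the listed algorithms, Sarsa and Q-Learning. It is not the code in this repo: the function names, the dictionary-based Q-table and the default values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def make_q_table():
    # Hypothetical Q-table: maps (state, action) pairs to estimated returns.
    # Unseen pairs default to 0.0. The repo may store its values differently.
    return defaultdict(float)


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Off-policy update: bootstrap from the best action in next_state."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99, done=False):
    """On-policy update: bootstrap from the action actually taken next."""
    next_value = 0.0 if done else q[(next_state, next_action)]
    td_target = reward + gamma * next_value
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# Usage: one Q-Learning step on a toy table with two actions.
q = make_q_table()
q[(1, 0)] = 1.0
q[(1, 1)] = 2.0
q_learning_update(q, state=0, action=0, reward=1.0, next_state=1, n_actions=2)
print(q[(0, 0)])
```

The only difference between the two functions is the value used to bootstrap: Q-Learning takes the maximum over the next state's actions, while Sarsa uses the action the agent actually selects next.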

The goal of this repo is purely educational:

For more elaborate and complex RL algorithms, see cleanrl.
For an intuitive, easy-to-use library widely used in research, see stable-baselines3 and rl-baselines3-zoo.

A Jupyter Notebook written in Spanish, which uses this repo to give basic explanations of RL concepts, can be found here.

The bibliography I used, listed at the end, is probably the most common entry point for learning Reinforcement Learning.

How to use this repo

In order to train and evaluate the agents in this repo, follow these steps:

Create and activate a virtual environment:

$ cd rl-mdatos
$ virtualenv .venv
$ source .venv/bin/activate

Install the required packages:

$ (.venv) pip install -r requirements.txt

Install this repo in editable mode:

$ (.venv) pip install -e .

Go to the directory of the desired environment. Each environment directory contains scripts to train, run and/or record a specific algorithm:

$ (.venv) cd rl_mdatos/envs/desired_env

To train a Q-Learning agent in CartPole-v0:

$ (.venv) python cp_q_learning.py --train

To execute the trained agent:

$ (.venv) python cp_q_learning.py --run

To record the execution (this only works for CartPole-v0 and MountainCar-v0):

$ (.venv) python cp_q_learning.py --run --record
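The three flags used in these commands could be wired up along the following lines. This is a sketch, not the repo's actual parser: only the flag names `--train`, `--run` and `--record` come from the commands above, while `build_parser`, `parse_flags` and the rule that `--record` requires `--run` are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the three flag names come from the README.
    parser = argparse.ArgumentParser(description="Train or run a tabular RL agent.")
    parser.add_argument("--train", action="store_true", help="train the agent")
    parser.add_argument("--run", action="store_true", help="run a trained agent")
    parser.add_argument("--record", action="store_true", help="record the run as video")
    return parser


def parse_flags(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: recording only makes sense together with --run.
    if args.record and not args.run:
        parser.error("--record requires --run")
    return args


# Usage: equivalent to passing "--run --record" on the command line.
args = parse_flags(["--run", "--record"])
print(args.train, args.run, args.record)
```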

Three types of files are stored in rl-mdatos/data:

logs: data generated during training, which can be visualized with TensorBoard (tensorboard --logdir data/...).
trained_agents: files with final parameters of the trained agents, which are loaded at execution time.
videos: videos of the recorded episodes.
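As an illustration of what a file in trained_agents might hold, the sketch below saves a Q-table to disk and loads it back. The use of `pickle`, the helper names `save_agent` and `load_agent`, and the file name `q_table.pkl` are all assumptions; the repo's actual file format is not described here.

```python
import pickle
import tempfile
from pathlib import Path


def save_agent(q_table, path):
    # Hypothetical helper: serialize the learned values to a file.
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(q_table, f)


def load_agent(path):
    # Hypothetical helper: restore the learned values at execution time.
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Use a temporary directory so the example does not touch the real data folder.
with tempfile.TemporaryDirectory() as tmp:
    agent_file = Path(tmp) / "trained_agents" / "q_table.pkl"
    save_agent({(0, 1): 0.5, (1, 0): 1.25}, agent_file)
    restored = load_agent(agent_file)

print(restored)
```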

Output

After the agents are trained successfully, running them should produce results like the following.

NChain-v0

INFO:root:Running Q-Learning agent
INFO:root:Episode 1
INFO:root:Total reward: 9960
INFO:root:Mean reward: 9.96
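The two figures in this log are related by the episode length: 9960 / 9.96 = 1000, which is consistent with an episode of 1000 steps. The sketch below reproduces the log format with Python's `logging` module under that assumption. The helper `report_episode` and the step count passed to it are illustrative, not taken from the repo.

```python
import logging

# The default basicConfig format is LEVEL:logger:message, e.g. "INFO:root:...".
logging.basicConfig(level=logging.INFO)


def report_episode(episode, total_reward, n_steps):
    # Hypothetical helper: log the total reward and the per-step mean reward.
    mean_reward = total_reward / n_steps
    logging.info("Episode %d", episode)
    logging.info("Total reward: %d", total_reward)
    logging.info("Mean reward: %.2f", mean_reward)
    return mean_reward


# Assumption: the episode lasted 1000 steps.
mean = report_episode(episode=1, total_reward=9960, n_steps=1000)
```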

FrozenLake-v0

CartPole-v0

MountainCar-v0

Bibliography

[1] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

[2] David Silver. Lectures on Reinforcement Learning. 2015. URL: https://www.davidsilver.uk/teaching/.

[3] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, Third International Edition. Pearson Education, London, 2010.
