Skip to content

gianmtedeschi/learnRL

Repository files navigation

Set up the environment.

You need an anaconda3 environment with python 3.9

conda create --name name python=3.9
conda activate name

Install the packages.

pip3 install -r requirements.txt

Run experiments.

All you need is in run.py, which requires several parameters:

  • "--dir": specifies the directory in which will be saved the results;
  • "--ite": how many iterations the algorithm must do;
  • "--alg": the algorithm to run, you can select "pg" or "split";
  • "--estimator": specifies which estimator to use;
  • "--std": the exploration amount, it is $\sigma^2$;
  • "--pol": the policy to use, you can select "gaussian" or "split_gaussian";
  • "--env": the environment on which the learning has to be done, you can select "swimmer", "half_cheetah", "ant", "lq", "minigolf";
  • "--horizon": set the horizon of the problem;
  • "--gamma": set the discount factor of the problem;
  • "--lr": set the step size;
  • "--lr_strategy": set the learning rate schedule, you can select "constant" or "adam";
  • "--batch": specifies how many trajectories are evaluated in each iteration;
  • "--clip": specifies whether to apply action clipping, you can select "0" or "1";
  • "--n_trials": specifies how many run of the same experiments has to be done.
  • "--verbose": print debug information.
  • "--baseline": specifies which baselinse adopt

Only for the GAPS algorithm:

  • "--alpha": specifies the alpha parameter for the split check criteria;
  • "--max_splits": specifies the maximum number of split that can be performed;

Only for the LQR environment:

  • "--lq_state_dim": specifies the state dimension for the LQR environment;
  • "--lq_action_dim": specifies the action dimension for the LQR environment;

Here is an example running PG on Swimmer:

python3 run.py --dir /your/path --alg pg --ite 100 --std 1 --pol linear --env swimmer --horizon 100 --gamma 1 --lr 0.1 --lr_strategy adam --clip 1 --batch 30 --n_trials 1

About

A library to use and learn reinforcement learning algorithms (and more) from scratch.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors