Skip to content
/ RL Public

Reinforcement Learning Algorithms implemented by Erlang. This repository is mainly for the demonstration of the Reinforcement Learning methods introduced in the book: "Reinforcement Learning: An Introduction" by Sutton and Barto. Most of the codes are chiefly for instructive purpose but not guaranteed to be good for practical applications.

Notifications You must be signed in to change notification settings

barcoyou/RL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Reinforcement Learning
--------------------------------------------------------------------------------
This repository mainly demonstrates the Reinforcement Learning algorithms
introduced in the book: "Reinforcement Learning: An Introduction" by
Sutton and Barto. For more information about this invaluable book, please
visit http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=7548.

All the codes here are implemented in Erlang, and mainly for instructive
purposes, but not guaranteed to be good for practical applications.
---------------------------------------------------------------------------------
Usage:
cd RL
erl -make
erl -pa ebin

>greedy_bandit:run(10,2000,0.1,"egreedy_0.1.dat").
%this will generate a data file "egreedy_0.1.dat" recording the play number,
%average reward got by the agent and the percentage of the optimal action selected.
%To generate a figure like Fig. 2.1 in the book, run the gnuplot command:
%set xlabel "Plays", set ylabel "Average Rewards",
%plot "egreedy_0.1.dat" u 1:2 t "Epsilon Greedy Method with Epsilon=0.1" w l
%set xlabel "Plays", set ylabel "Percent of Optimal Action",
%plot "egreedy_0.1.dat" u 1:3 t "Epsilon Greedy Method with Epsilon=0.1" w l
%
%For the meaning of the arguments please see the source file "greedy_bandit.erl"

>softmax_bandit:run(10, 2000, 2, "softmax_2.dat").
%This will simulate the agent which uses softmax to solve the n-Armed Bandit problem

....

------------------------------------------------------------------------------------
moduel gridworld.erl is an example in the chapter 3 of the book, illuminating the
Bellman equations for state-value function and optimal state-value function.
To use it:
>gridworld:run().  %To get the state-values
>gridworld:run_optimal() % To get the optimal state-values

------------------------------------------------------------------------------------
chapter 4

module policy_eva.erl is an example of iterative policy evaluation.
>policy_eva:start().
>policy_eva:run(equi_prob).  %To evaluate the policy of equiprobably selecting actions
>policy_eva:pause(). %To pause the ongoing evaluation
>policy_eva:run(optimal). %To evaluate the optimal policy
>policy_eva:stop(). %To stop the process.

module car_rental.erl is an example of the Policy Iteration
>car_rental:start().
>car_rental:run()  %To start the policy iteration process
>car_rental:print()  % To see the final policy and value fuctions
>car_rental:stop(). %To stop the state machine

module gambler.erl is an example of Value Iteration
>gambler:start() % to start the Value Iteration process
>gambler:print(). % to print out the final policy and value function
>gambler:stop(). %to stop the state machine


--------------------------------------------------------------------------------------
Chapter 5

moduel mc_blackjack.erl simulates the Monte Carlo method solving the balckjack problem
>mc_blackjack:run(5000). %Sample 5000 episodes

moduel es_blackjak.erl simulates the ES Monte Carlo method solving the blackjack problem
>es_blackjack:run(5000). %Sample 5000 episodes with Exploring Starting

About

Reinforcement Learning Algorithms implemented by Erlang. This repository is mainly for the demonstration of the Reinforcement Learning methods introduced in the book: "Reinforcement Learning: An Introduction" by Sutton and Barto. Most of the codes are chiefly for instructive purpose but not guaranteed to be good for practical applications.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages