Commit b10f4c1: "Update README.md" (1 parent: 9e4c7da)

1 file changed: README.md (+9, −2 lines)
````diff
@@ -1,4 +1,9 @@
 # rlbook
+
+![ci-cd](https://github.com/joseph-jnl/rlbook/actions/workflows/ci-cd.yml/badge.svg)
+[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
+
+
 Code for my walkthrough of: *Reinforcement Learning An Introduction by Richard Sutton and Andrew Barto* (http://incompleteideas.net/book/the-book.html)
 
 ## Setup
@@ -25,7 +30,7 @@ Run commands using the rlbook environment via uv:
 ```bash
 uv run run.py
 ```
-or by first activating the rlbbok venv:
+or by first activating the rlbook venv (this is my preferred workflow):
 ```bash
 source ./venv/bin/activate
 ```
@@ -43,7 +48,7 @@ Login to wandb via:
 ## Quickstart
 Algorithm implementations are located in the `/src` directory while the scaffolding code/notebooks for recreating/exploring Sutton & Barto are segmented into the `experiments/` directory.
 
-e.g. for recreating Figure 2.3, navigate to /experiments/ch2_bandits/ and run:
+e.g. for recreating Figure 2.3, navigate to `/experiments/ch2_bandits/` and run:
 ```bash
 python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +bandit.random_argmax=true experiment.tag=fig2.2 experiment.upload=true
 ```
@@ -52,6 +57,8 @@ python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +band
 Figure 2.3 (rlbook): The `+bandit.random_argmax=true` flag was used to switch over to an argmax implementation that randomizes between tiebreakers rather than first occurence used in the default numpy implementation to better align with the original example.
 [Link to wandb artifact](https://api.wandb.ai/links/josephjnl/53gxgbcc)
 
+Further details on experimental setup and results can be found at corresponding chapter README's.
+
 ## Chapter Links
 
 - [Chapter 2: Multi-armed Bandits](https://github.com/joseph-jnl/rlbook/tree/dev/experiments/ch2_bandits)
````
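The `+bandit.random_argmax=true` behavior referenced in the diff — breaking ties between equal-valued arms at random, rather than taking NumPy's first-occurrence maximum — can be sketched as follows. This is a hypothetical illustration of the idea, not necessarily the repo's actual implementation; the function name `random_argmax` is assumed here.

```python
import numpy as np

def random_argmax(values, rng=None):
    """Index of a maximum value, breaking ties uniformly at random.

    np.argmax([1.0, 3.0, 3.0]) always returns 1 (the first maximum);
    this variant returns 1 or 2 with equal probability, matching the
    random tie-breaking used in Sutton & Barto's bandit examples.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    ties = np.flatnonzero(values == values.max())  # indices of all maxima
    return int(rng.choice(ties))
```

With greedy action selection (epsilon = 0), first-occurrence tie-breaking biases early play toward low-index arms, which is why randomized tie-breaking better reproduces the book's Figure 2.3 curves.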
