Commit b10f4c1: "Update README.md" (1 parent: 9e4c7da)

1 file changed: README.md (+9, −2 lines)
````diff
@@ -1,4 +1,9 @@
 # rlbook
+
+![ci-cd](https://github.com/joseph-jnl/rlbook/actions/workflows/ci-cd.yml/badge.svg)
+[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
+
+
 Code for my walkthrough of: *Reinforcement Learning An Introduction by Richard Sutton and Andrew Barto* (http://incompleteideas.net/book/the-book.html)
 
 ## Setup
@@ -25,7 +30,7 @@ Run commands using the rlbook environment via uv:
 ```bash
 uv run run.py
 ```
-or by first activating the rlbbok venv:
+or by first activating the rlbook venv (this is my preferred workflow):
 ```bash
 source ./venv/bin/activate
 ```
@@ -43,7 +48,7 @@ Login to wandb via:
 ## Quickstart
 Algorithm implementations are located in the `/src` directory while the scaffolding code/notebooks for recreating/exploring Sutton & Barto are segmented into the `experiments/` directory.
 
-e.g. for recreating Figure 2.3, navigate to /experiments/ch2_bandits/ and run:
+e.g. for recreating Figure 2.3, navigate to `/experiments/ch2_bandits/` and run:
 ```bash
 python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +bandit.random_argmax=true experiment.tag=fig2.2 experiment.upload=true
 ```
@@ -52,6 +57,8 @@ python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +band
 Figure 2.3 (rlbook): The `+bandit.random_argmax=true` flag was used to switch over to an argmax implementation that randomizes between tiebreakers rather than first occurence used in the default numpy implementation to better align with the original example.
 [Link to wandb artifact](https://api.wandb.ai/links/josephjnl/53gxgbcc)
 
+Further details on experimental setup and results can be found at corresponding chapter README's.
+
 ## Chapter Links
 
 - [Chapter 2: Multi-armed Bandits](https://github.com/joseph-jnl/rlbook/tree/dev/experiments/ch2_bandits)
````
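The `+bandit.random_argmax=true` behavior referenced in the diff — breaking ties between equal-valued arms at random, rather than taking NumPy's first-occurrence maximum — can be sketched as follows. This is a hypothetical illustration of the idea, not necessarily the repo's actual implementation; the function name `random_argmax` is assumed here.

```python
import numpy as np

def random_argmax(values, rng=None):
    """Index of a maximum value, breaking ties uniformly at random.

    np.argmax([1.0, 3.0, 3.0]) always returns 1 (the first maximum);
    this variant returns 1 or 2 with equal probability, matching the
    random tie-breaking used in Sutton & Barto's bandit examples.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    ties = np.flatnonzero(values == values.max())  # indices of all maxima
    return int(rng.choice(ties))
```

With greedy action selection (epsilon = 0), first-occurrence tie-breaking biases early play toward low-index arms, which is why randomized tie-breaking better reproduces the book's Figure 2.3 curves.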
