Commit cec84cc

vwxyzjn and dosssman authored
Refactor value based methods (vwxyzjn#102)
* Refactor value based methods
* fix test cases
* fix test cases
* refactor ddpg
* quick fix
* refactor td3
* Update lock files
* Quick fix
* Fix learning rate and target-network-frequency
* fix learning rate
* Reproduce past results: epsilon and log clipping
* Fix learning rate
* Quick fix
* quick update
* Quick update
* Refactor value tweaks for TD3 and SAC (vwxyzjn#106)
* Fixed 'optimize the midel' typo in all files
* Fixed 'optimize the midel' typo in offline scripts too
* TD3: removed DDPG's update code from the training loop
* Refactored sac_continuous, with preliminary tests working

Co-authored-by: Rousslan F.J. Dossa <[email protected]>
1 parent ed92446 commit cec84cc

13 files changed, with 3493 additions and 3980 deletions

cleanrl/apex_dqn_atari.py

Lines changed: 1 addition & 1 deletion
@@ -853,7 +853,7 @@ def learn(args, rb, global_step, data_process_queue, data_process_back_queues, s

 stats_queue.put(("losses/td_loss", loss.item(), update_step+args.learning_starts))

-# optimize the midel
+# optimize the model
 optimizer.zero_grad()
 loss.backward()
 nn.utils.clip_grad_norm_(list(learn_q_network.parameters()), args.max_grad_norm)

cleanrl/c51.py

Lines changed: 202 additions & 223 deletions
Large diffs are not rendered by default.
