Clipped PPO

Each experiment is run with 3 seeds and trained for 10M environment steps. The Clipped PPO hyperparameters are the same as those described in the original paper.

Inverted Pendulum Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl inverted_pendulum

Inverted Pendulum Clipped PPO

Inverted Double Pendulum Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl inverted_double_pendulum

Inverted Double Pendulum Clipped PPO

Reacher Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl reacher

Reacher Clipped PPO

Hopper Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl hopper

Hopper Clipped PPO

Half Cheetah Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl half_cheetah

Half Cheetah Clipped PPO

Walker 2D Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl walker2d

Walker 2D Clipped PPO

Ant Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl ant

Ant Clipped PPO

Swimmer Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl swimmer

Swimmer Clipped PPO

Humanoid Clipped PPO - single worker

coach -p Mujoco_ClippedPPO -lvl humanoid

Humanoid Clipped PPO
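
The nine commands above differ only in the level name, so they can be generated from a single list. The following is a minimal dry-run sketch, not part of Coach itself: `LEVELS` and `benchmark_commands` are hypothetical names introduced here for illustration, and the script only prints the commands, using the same `-p` and `-lvl` flags shown above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one benchmark command per level listed in this README.
# LEVELS and benchmark_commands are hypothetical helper names, not part of Coach.
LEVELS="inverted_pendulum inverted_double_pendulum reacher hopper half_cheetah walker2d ant swimmer humanoid"

benchmark_commands() {
  for lvl in $LEVELS; do
    # Only the -p and -lvl flags that appear in this README are used.
    echo "coach -p Mujoco_ClippedPPO -lvl $lvl"
  done
}

# Print the commands without executing them.
benchmark_commands
```

To launch the runs one after another instead of printing them, the output can be piped to a shell, for example `benchmark_commands | sh`. Each run trains for 10M environment steps, so running all nine levels sequentially will take a long time.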