Enabling periodic evaluation #202
Unanswered
elle-miller asked this question in Q&A
Replies: 1 comment 1 reply
-
@Toni-SM just FYI, I resolved my initial issues from when I opened this discussion 2 weeks ago, and have edited the post to reflect the current working state.
-
Hi there,
I am proposing that SKRL should have a way to periodically evaluate the agent during training, for example by using `SequentialTrainer` in an alternating `train()` -> `eval()` -> `train()` fashion. The provided examples only show evaluation post-training: https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html

Here is an example of an agent in the Isaac Lab Cartpole environment. You can see that the evaluation returns communicate the true learning state of the agent, without the stochasticity of the sampled actions. I was always confused by how the performance would degrade/oscillate in `Rewards/Total reward (mean)` throughout training, but periodic evaluation would address that.

The code needed for this change and to reproduce the plots below is here: https://github.com/elle-miller/skrl_testing
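As a rough illustration of the alternating train/evaluate pattern proposed above, here is a minimal, library-agnostic sketch. It does not use skrl's API: `alternate_train_eval`, `train_fn`, and `eval_fn` are hypothetical names, and the callbacks stand in for whatever actually trains and evaluates the agent. The values 1000 and 10 are only illustrative.

```python
from typing import Callable, List


def alternate_train_eval(
    total_timesteps: int,
    num_evals: int,
    train_fn: Callable[[int], None],
    eval_fn: Callable[[], float],
) -> List[float]:
    """Split training into equal chunks and evaluate after each chunk."""
    if num_evals <= 0 or total_timesteps % num_evals != 0:
        raise ValueError("total_timesteps must be a positive multiple of num_evals")
    chunk = total_timesteps // num_evals
    eval_returns: List[float] = []
    for _ in range(num_evals):
        train_fn(chunk)                 # hypothetical: train for `chunk` timesteps
        eval_returns.append(eval_fn())  # hypothetical: run one evaluation pass
    return eval_returns


# Stub callbacks: record each training chunk, and report the cumulative
# number of trained timesteps as a stand-in for an evaluation return.
trained: List[int] = []
returns = alternate_train_eval(
    total_timesteps=1000,
    num_evals=10,
    train_fn=trained.append,
    eval_fn=lambda: float(sum(trained)),
)
```

With these stubs, `trained` ends up holding ten chunks of 100 timesteps and `returns` holds one entry per evaluation round. In a real setup the callbacks would wrap the trainer's own train and evaluate calls.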
In this example, I train `num_envs` environments for 1000 timesteps each and evaluate 10 times throughout the process. This means training for 100 timesteps, evaluating, and repeating 10 times.

Code modifications
- `train()` function to reset the memory and rollout counter
- `act()` function in PPO to only return the mean action under evaluation instead of sampling
- `eval()` method

Let me know what you think. I can open a PR if you want to integrate this.
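To make the `act()` change listed above more concrete, here is a toy sketch of the idea: sample from the policy distribution during training, but return its mean during evaluation. This is not skrl's PPO code. `GaussianPolicy` and the `deterministic` flag are hypothetical, and the mean is fixed here, whereas a real policy would compute it from the observation.

```python
import random
from typing import List, Sequence


class GaussianPolicy:
    """Toy diagonal-Gaussian policy (illustrative only, not skrl's model class)."""

    def __init__(self, mean: Sequence[float], std: Sequence[float], seed: int = 0):
        self.mean = list(mean)
        self.std = list(std)
        self._rng = random.Random(seed)

    def act(self, deterministic: bool = False) -> List[float]:
        if deterministic:
            # Evaluation: return the distribution mean, with no sampling noise.
            return list(self.mean)
        # Training: sample each action dimension from N(mean, std^2).
        return [self._rng.gauss(m, s) for m, s in zip(self.mean, self.std)]


policy = GaussianPolicy(mean=[0.5, -1.0], std=[0.2, 0.2])
eval_actions = [policy.act(deterministic=True) for _ in range(3)]
train_actions = [policy.act() for _ in range(3)]
```

Here every entry of `eval_actions` equals the mean, while `train_actions` varies from call to call, which is the source of the noise in the training reward curve.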