[Question] PBT config objective definition #3671

SkoveBC · 2025-10-06T21:42:27Z


Question

I want to use the newly added HPO method, Population Based Training (PBT), for my task. As the objective I want to use the episode reward that rl_games logs under the metric rewards/time, which is also the default metric in this example:

IsaacLab/scripts/reinforcement_learning/ray/tuner.py

Line 364 in b7004f4

    parser.add_argument("--metric", type=str, default="rewards/time", help="What metric to tune for.")

I changed the objective parameter, which is located in the following part of the config:

IsaacLab/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/dexsuite/config/kuka_allegro/agents/rl_games_ppo_cfg.yaml

Lines 88 to 94 in b7004f4

    pbt:
      enabled: False
      policy_idx: 0  # policy index in a population
      num_policies: 8  # total number of policies in the population
      directory: .
      workspace: "pbt_workspace"  # suffix of the workspace dir name inside train_dir
      objective: episode.Curriculum/adr

However, when I start training with objective: rewards/time, I get the following error:

Error executing job with overrides: ['agent.pbt.enabled=True', 'agent.pbt.num_policies=4', 'agent.pbt.policy_idx=0']
Traceback (most recent call last):
  File "/workspace/isaaclab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
    func(env_cfg, agent_cfg, *args, **kwargs)
  File "/workspace/isaaclab/scripts/reinforcement_learning/rl_games/train.py", line 239, in main
    runner.run({"train": True, "play": False, "sigma": train_sigma})
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/torch_runner.py", line 178, in run
    self.run_train(args)
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/torch_runner.py", line 149, in run_train
    agent.train()
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 1351, in train
    step_time, play_time, update_time, sum_time, a_losses, c_losses, b_losses, entropies, kls, last_lr, lr_mul = self.train_epoch()
                                                                                                                 ^^^^^^^^^^^^^^^^^^
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 1207, in train_epoch
    batch_dict = self.play_steps()
                 ^^^^^^^^^^^^^^^^^
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 792, in play_steps
    self.algo_observer.process_infos(infos, env_done_indices)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 259, in process_infos
    self._call_multi("process_infos", infos, done_indices)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 250, in _call_multi
    getattr(o, method)(*args_, **kwargs_)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 75, in process_infos
    score = score[part]
            ~~~~~^^^^^^
KeyError: 'rewards/time'
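
My reading of the last frame (score = score[part]), together with the dotted form of the example objective, is that the objective string is split on "." and each part is used as a key into the nested infos dictionary returned by the environment. This is only an assumption based on the traceback, not the actual code in pbt.py. The sketch below reproduces that behavior with two hypothetical helpers, resolve_objective and list_objective_paths, and a made-up infos dictionary; the second helper shows how the valid objective paths could be listed.

```python
from collections.abc import Mapping
from typing import Any


def resolve_objective(infos: Mapping, objective: str) -> Any:
    """Walk a nested dict along a dot-separated path (hypothetical helper)."""
    score: Any = infos
    for part in objective.split("."):
        # A missing component raises KeyError, as in the traceback above.
        score = score[part]
    return score


def list_objective_paths(infos: Mapping, prefix: str = "") -> list[str]:
    """Return the dot-separated path of every leaf value in a nested dict."""
    paths: list[str] = []
    for key, value in infos.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            paths.extend(list_objective_paths(value, path))
        else:
            paths.append(path)
    return paths


# Made-up infos dict for illustration; real keys depend on the environment.
example_infos = {"episode": {"Curriculum/adr": 0.25}}

print(resolve_objective(example_infos, "episode.Curriculum/adr"))
print(list_objective_paths(example_infos))

try:
    resolve_objective(example_infos, "rewards/time")
except KeyError as err:
    print(f"KeyError: {err}")
```

If this reading is right, rewards/time fails because it is a scalar name written by the rl_games logger rather than a key inside infos, so only paths that actually exist in infos would be usable as the objective.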

How can I set the objective to the rewards/time metric, and more generally, how can I access terms other than episode.Curriculum/adr from the example config?

@ooctipus, maybe you can help out, since I saw that you made the commit that added this functionality.

Replies: 0 comments

Select a reply

Uh oh!

[Question] PBT config objective definition #3671

Uh oh!

Uh oh!

SkoveBC Oct 6, 2025

Question

Replies: 0 comments

SkoveBC
Oct 6, 2025