About other scripts in src #2

Open
XiaobenLi00 opened this issue Sep 12, 2024 · 3 comments

@XiaobenLi00 commented Sep 12, 2024

Hi,

Thanks a lot for your sharing.

I found main_challenge_manipulation_phase2.py and test_submission.py very useful. I am wondering how the results in output/trained_agents were produced, and I would also like to know the roles of the other scripts in src/.

Thanks again, and I look forward to your answer!

@albertochiappa (Contributor)

Hi, sorry for the late reply.

- main_challenge.py is the training script we used for the locomotion track of the MyoChallenge, which unfortunately did not produce very good results.
- agent_mani_lattice.py is the script that was used for deployment in the evaluation platform of the competition.
- main_dataset_recurrent_ppo.py is used to evaluate a trained policy and store the results in a dataframe.
- test_submission.py also tests a trained agent, but wraps it in the interface used by the online evaluation platform; we used it to check that the code actually worked.

Regarding running the trainings to get the trained models: it is almost impossible to get exactly the same agents, as you would need to run the code with 250 environments on the same cluster we used. However, you can also run a lighter training (e.g., 20 environments) on a standard workstation.

To run a training, you can run the script main_challenge_manipulation_phase2.py or the script docker/train.sh. By default, train.sh resumes the training from the penultimate curriculum step (step 9). You can make it start from scratch by not passing the "load_path" argument. In that case, however, it will start a training with the final environment configuration and will likely not learn anything. To start a training with the initial configuration, you need to pass the same configuration parameters that are stored in output/trained_agents/curriculum_step_1/args.json. There are many of them, but most are the default values, so there is no need to pass all of them explicitly.
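For illustration, here is a minimal sketch of such a launch, assuming each key in args.json maps one-to-one onto a `--key value` command-line flag (that convention is my shorthand for this sketch, not a documented interface of the repo):

```python
# Hedged sketch: read the stored curriculum_step_1 configuration and forward
# its entries to the training script as CLI flags. The one-flag-per-key
# convention is an assumption made for illustration.
import json
import subprocess

with open("output/trained_agents/curriculum_step_1/args.json") as f:
    args = json.load(f)

cmd = ["python", "src/main_challenge_manipulation_phase2.py"]
for key, value in args.items():
    if key == "load_path":
        continue  # omit load_path so the training starts from scratch
    cmd += [f"--{key}", str(value)]

subprocess.run(cmd, check=True)
```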

@XiaobenLi00 (Author)

Thanks a lot for your detailed answers!

I understand that you ran 250 environments on a cluster, with 100 steps per environment, resulting in 25,000 steps per update. Would it be possible to use a smaller number of environments, e.g., 50, and increase the number of steps per environment to reach the same total, i.e., 25,000? Would this configuration produce the same or similar results?

By the way, the README contains this note:

> Please note that throughout the curriculum we also made small modifications to the environment, which break the compatibility of the solutions up to step 6 with the final environment of phase 2.

I am wondering how to deal with this incompatibility. Does that mean the environment was modified at step 6? And how should I then train for step 7 and the following steps?

Thanks again for your kind answers!

@albertochiappa (Contributor)

I would just use fewer environments, keeping the total number of steps per update the same, and expect the results not to change too much.
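As a sketch of what I mean (Stable-Baselines3-style API, which main_dataset_recurrent_ppo.py suggests; the environment id and hyperparameters below are placeholders, not the values we actually used):

```python
# Hedged sketch: a recurrent PPO run with fewer parallel environments,
# scaling n_steps so that n_envs * n_steps still equals 25,000 transitions
# per update. "Pendulum-v1" is a placeholder for the actual MyoChallenge env.
from sb3_contrib import RecurrentPPO
from stable_baselines3.common.env_util import make_vec_env

n_envs = 50    # instead of the 250 used on the cluster
n_steps = 500  # 50 * 500 == 250 * 100 == 25,000 steps per update

vec_env = make_vec_env("Pendulum-v1", n_envs=n_envs)
model = RecurrentPPO("MlpLstmPolicy", vec_env, n_steps=n_steps, verbose=1)
model.learn(total_timesteps=1_000_000)
```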

We stored a copy of the environment file used for each training phase in the checkpoint folder; you can use that one.
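If it helps, something along these lines; both file names below are guesses for illustration, so check the actual contents of the checkpoint folder:

```python
# Hedged sketch: put the environment file stored with a given curriculum
# checkpoint in place before training the next step against it.
# Both paths are assumptions about the layout, not the repo's actual names.
import shutil

step = 6
shutil.copy(
    f"output/trained_agents/curriculum_step_{step}/environment.py",  # assumed file name
    "src/envs/environment.py",                                       # assumed destination
)
```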
