About other scripts in src #2

Open
XiaobenLi00 opened this issue Sep 12, 2024 · 3 comments

@XiaobenLi00 commented Sep 12, 2024

Hi,

Thanks a lot for your sharing.

I found main_challenge_manipulation_phase2.py and test_submission.py very useful. I am wondering how the results in output/trained_agents were produced, and I would also like to know the roles of the other scripts in src/.

Thanks again, and I look forward to your answer!

@albertochiappa (Contributor)

Hi, sorry for the late reply.

- main_challenge.py is the training script we used for the locomotion track of the MyoChallenge, which unfortunately did not produce very good results.
- agent_mani_lattice.py is the script that was used for deployment in the evaluation platform of the competition.
- main_dataset_recurrent_ppo.py is used to evaluate a trained policy and store the results in a dataframe.
- test_submission.py also tests a trained agent, but wraps it in the interface used by the online evaluation platform; we used it to check that the code actually worked.

Regarding running the trainings to get the trained models: it is almost impossible to get exactly the same agents, as you would need to run the code with 250 environments on the same cluster we used. However, you can also run a lighter training (e.g., 20 environments) on a standard workstation.

To run a training, you can run the script main_challenge_manipulation_phase2.py or the script docker/train.sh. By default, train.sh resumes the training from the penultimate curriculum step (step 9). You can make it start from scratch by not passing the "load_path" argument. In that case, however, it will start a training with the final environment configuration and will likely not learn anything. To start a training with the initial configuration, you need to pass the same configuration parameters that are stored in output/trained_agents/curriculum_step_1/args.json. There are many of them, but most are the default values, so there is no need to pass all of them explicitly.
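For illustration, here is a minimal sketch of such a launch, assuming each key in args.json maps one-to-one onto a `--key value` command-line flag (that convention is my shorthand for this sketch, not a documented interface of the repo):

```python
# Hedged sketch: read the stored curriculum_step_1 configuration and forward
# its entries to the training script as CLI flags. The one-flag-per-key
# convention is an assumption made for illustration.
import json
import subprocess

with open("output/trained_agents/curriculum_step_1/args.json") as f:
    args = json.load(f)

cmd = ["python", "src/main_challenge_manipulation_phase2.py"]
for key, value in args.items():
    if key == "load_path":
        continue  # omit load_path so the training starts from scratch
    cmd += [f"--{key}", str(value)]

subprocess.run(cmd, check=True)
```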

@XiaobenLi00 (Author)

Thanks a lot for your detailed answers!

I understand that you ran 250 environments on a cluster, with 100 steps per environment, resulting in 25,000 steps per update. Would it be possible to use a smaller number of environments, e.g., 50, and increase the number of steps per environment to reach the same total, i.e., 25,000? Would this configuration produce the same or similar results?

By the way, the README contains this note:

> Please note that throughout the curriculum we also made small modifications to the environment, which break the compatibility of the solutions up to step 6 with the final environment of phase 2.

I am wondering how to deal with this incompatibility. Does that mean the environment was modified at step 6? And how should I then train for step 7 and the following steps?

Thanks again for your kind answers!

@albertochiappa (Contributor)

I would just use fewer environments, keeping the total number of steps per update the same, and expect the results not to change too much.
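As a sketch of what I mean (Stable-Baselines3-style API, which main_dataset_recurrent_ppo.py suggests; the environment id and hyperparameters below are placeholders, not the values we actually used):

```python
# Hedged sketch: a recurrent PPO run with fewer parallel environments,
# scaling n_steps so that n_envs * n_steps still equals 25,000 transitions
# per update. "Pendulum-v1" is a placeholder for the actual MyoChallenge env.
from sb3_contrib import RecurrentPPO
from stable_baselines3.common.env_util import make_vec_env

n_envs = 50    # instead of the 250 used on the cluster
n_steps = 500  # 50 * 500 == 250 * 100 == 25,000 steps per update

vec_env = make_vec_env("Pendulum-v1", n_envs=n_envs)
model = RecurrentPPO("MlpLstmPolicy", vec_env, n_steps=n_steps, verbose=1)
model.learn(total_timesteps=1_000_000)
```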

We stored a copy of the environment file used for each training phase in the checkpoint folder; you can use that one.
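If it helps, something along these lines; both file names below are guesses for illustration, so check the actual contents of the checkpoint folder:

```python
# Hedged sketch: put the environment file stored with a given curriculum
# checkpoint in place before training the next step against it.
# Both paths are assumptions about the layout, not the repo's actual names.
import shutil

step = 6
shutil.copy(
    f"output/trained_agents/curriculum_step_{step}/environment.py",  # assumed file name
    "src/envs/environment.py",                                       # assumed destination
)
```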
