Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug Report] --checkpoint broken #292

Closed
2 tasks done
VladimirFokow opened this issue Mar 15, 2024 · 2 comments · May be fixed by #313
Closed
2 tasks done

[Bug Report] --checkpoint broken #292

VladimirFokow opened this issue Mar 15, 2024 · 2 comments · May be fixed by #313
Labels
documentation Improvements or additions to documentation

Comments

@VladimirFokow
Copy link
Contributor

VladimirFokow commented Mar 15, 2024

Describe the bug

Here: https://isaac-orbit.github.io/orbit/source/setup/sample.html#reinforcement-learning

In this example:

./orbit.sh -p source/standalone/workflows/rsl_rl/play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --checkpoint /PATH/TO/model.pth

we are supposed to provide /PATH/TO/model.pth

  • First of all, in my logs, the models with .pt extension are saved, not .pth

  • Secondly, it would be nice to write in the docs how to find this path, so that there's no need for the user to go looking for the logic into the code. (is the path to model the same as path to logs, described on that page?)

  • Thirdly, most importantly, it's completely broken even if you provide the path:

---------------------------------
Traceback (most recent call last):
  File "/home/fokow/work/Orbit/source/standalone/workflows/rsl_rl/play.py", line 106, in <module>
    main()
  File "/home/fokow/work/Orbit/source/standalone/workflows/rsl_rl/play.py", line 74, in main
    resume_path = get_checkpoint_path(log_root_path, agent_cfg.load_run, agent_cfg.load_checkpoint)
  File "/home/fokow/work/Orbit/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/parse_cfg.py", line 209, in get_checkpoint_path
    raise ValueError(f"No checkpoints in the directory: '{run_path}' match '{checkpoint}'.")
ValueError: No checkpoints in the directory: '/home/fokow/work/Orbit/logs/rsl_rl/franka_reach/2024-03-15_21-08-21' match '/home/fokow/work/Orbit/logs/rsl_rl/franka_reach/2024-03-15_21-08-21'.

In this line the checkpoint is the full path of the checkpoint that was provided, that's why no matches are found.

System Info

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have checked that the issue is not in running Isaac Sim itself and is related to the repo

Related issue was discussed here, where they said:

Documentation is not the same as code. It does get a bit confusing.

Acceptance Criteria

This task is considered done if:

  • the function get_checkpoint_path works
  • the examples in the docs work (that use the --checkpoint flag)
  • the docs describe where to find this path
@VladimirFokow
Copy link
Contributor Author

VladimirFokow commented Mar 16, 2024

What should the logic be:

  • the latest checkpoint automatically selected,
  • or exactly the one specified by the user?

the code attempts to do both 😑 and achieves none.

@Mayankm96
Copy link
Contributor

  1. You are right about pth. That is an error.
  2. You need to specify the "load_run" and the "checkpoint" file (as the args say). The paths are resolved automatically
./orbit.sh -p source/standalone/workflows/rsl_rl/play.py --task Isaac-Rough-Anymal-C-Play-v0 --load_run 2024-03-11_16-11-38 --checkpoint model_300.pt

I agree. This is not very intuitive. Feel free to send an MR with the required clarifications :)

@Mayankm96 Mayankm96 added the documentation Improvements or additions to documentation label Mar 19, 2024
ADebor pushed a commit to ADebor/IsaacLab that referenced this issue Apr 8, 2024
This MR makes the following changes:

* Re-organizes the docs to have somewhat a better story
* Renames all the scripts to start with "VERB_ACTION.py" -- Previous
namings like articulation.py was causing file-search conflicts

- This change requires a documentation update

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
fatimaanes pushed a commit to fatimaanes/omniperf that referenced this issue Aug 8, 2024
This MR makes the following changes:

* Re-organizes the docs to have somewhat a better story
* Renames all the scripts to start with "VERB_ACTION.py" -- Previous
namings like articulation.py was causing file-search conflicts

- This change requires a documentation update

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants