
Fail to put multiple persons in the same world frame #29

Open
hongsukchoi opened this issue Oct 10, 2024 · 3 comments
hongsukchoi commented Oct 10, 2024

Hi @zehongs @pengsida ,

Thank you for your continuous great works! I have a question about GVHMR.

How can I visualize the global camera trajectories (sequential 6D camera poses) when there are multiple persons?

If I understand correctly, GVHMR's outputs (camera pose estimates and the humans' global trajectories) have no explicit relation to the SLAM camera pose predictions. Also, GVHMR estimates its own camera poses from the cropped image of each single person, and I think the gravity coordinate frame is also defined per person. So I had to stitch the camera trajectories together with some heuristics to handle multiple people appearing and disappearing in the video.
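For reference, the kind of stitching heuristic I mean can be sketched as follows. This is an illustrative sketch, not GVHMR's API: it assumes each person's output gives 4x4 camera-to-world poses in that person's own world frame, and registers person B's world frame to person A's using frames where both are tracked (since both describe the same physical camera):

```python
import numpy as np

def align_world_frames(cams_a, cams_b, shared_ts):
    """Estimate a rigid transform T_ab mapping person B's world frame
    into person A's world frame.

    cams_a, cams_b: dict frame_idx -> 4x4 camera-to-world pose, each in
    its own per-person world frame.
    shared_ts: frame indices where both persons are tracked.
    """
    candidates = []
    for t in shared_ts:
        # The same physical camera at frame t is expressed in two world
        # frames: T_a = T_ab @ T_b  =>  T_ab = T_a @ inv(T_b)
        candidates.append(cams_a[t] @ np.linalg.inv(cams_b[t]))
    # Naive element-wise average over overlapping frames; a proper
    # approach would average the rotations on SO(3).
    T = np.mean(candidates, axis=0)
    # Re-orthonormalize the rotation part via SVD.
    U, _, Vt = np.linalg.svd(T[:3, :3])
    T[:3, :3] = U @ Vt
    T[3] = [0.0, 0.0, 0.0, 1.0]
    return T
```

With `T_ab` in hand, person B's trajectory and meshes can be mapped into A's frame via `T_ab @ x`; the heuristics get messier when people disappear and there is no overlap to register against.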

As a result, I get this kind of weird result from a PoseTrack's multi-person video. I want to know whether I am doing something wrong, or there's a better way of getting camera trajectory visualization.

[Screenshots taken 2024-10-09 at 5:02:40 PM and 5:02:47 PM]

Caution: These images are not exactly time synchronized.
Full video:
https://youtu.be/qwUeOn0HieI
https://youtu.be/vN45gpskOuI

I used Viser for 3D visualization, since everything always looks fine in 2D renderings.
Here is my code: https://github.com/hongsukchoi/GVHMR_vis/blob/hongsuk/mp_gloval_viser_vis.py

Again, thank you for your great work!!

@hongsukchoi hongsukchoi changed the title Failure case of GVHMR in PoseTrack video Fail to put multiple persons in the same world frame Oct 13, 2024

zehongs commented Oct 23, 2024

Hi @hongsukchoi, sorry for the late reply. As you mentioned, GVHMR doesn't explicitly model the camera transformation from world to camera, which is also the case for WHAM. To solve this problem, an extra global optimization of human motion and camera motion is required. By the way, I think TRAM (ECCV24) might be suitable for your project.

hongsukchoi (Author) commented
Thanks for the reply!

I found this paper too. If it is fast and easy to use, that would be great.
https://openaccess.thecvf.com/content/CVPR2024/papers/Zhao_Synergistic_Global-space_Camera_and_Human_Reconstruction_from_Videos_CVPR_2024_paper.pdf

I have one more request: could you clarify how the gravity coordinate frame is defined in your method?

> I think the gravity coordinate frame is also defined per person.


zehongs commented Oct 25, 2024

Yes, that's correct.
