
Pose Estimation Model Metrics: Inconsistencies between PeekingDuck Docs and TensorFlow Website #750

Open
saifkhichi96 opened this issue Apr 4, 2023 · 2 comments

Comments

@saifkhichi96

Hello PeekingDuck team,

I have recently been exploring the PeekingDuck framework and noticed some inconsistencies in the reported metrics for pose estimation models, particularly MoveNet and PoseNet, when comparing them with the results on the TensorFlow website.

In the PeekingDuck documentation (link), the Average Precision (AP) for MoveNet is stated as 7.3. However, the TensorFlow website (link) reports an AP of 57.4 even for the quantized version of the same model. This difference suggests that the model's average precision in PeekingDuck is significantly lower than expected, yet your docs state that "The evaluation metrics have been compared with the original repository of the respective pose estimation models for consistency." Which "original repository" were the PoseNet and MoveNet metrics compared against?

Could you kindly provide some insights into this discrepancy? Are there any variations in the evaluation setup or methods that might account for this substantial difference in reported metrics? Understanding the reasons behind these contrasting results is crucial for accurately assessing the performance of the models implemented in PeekingDuck.

Thank you for your help and support!

@ongtw
Contributor

ongtw commented Apr 10, 2023

Hi, the metrics are different because the TensorFlow website states that "Accuracy (mAP) numbers are measured on a subset of the COCO dataset in which we filter and crop each image to contain only one person"; see the attached screenshot below:
[Screenshot: TensorFlow website note stating mAP is measured on a filtered, single-person subset of COCO]

PeekingDuck's reported AP, by contrast, is computed across the entire COCO dataset (not just a subset), including images with multiple persons (not just one person).
So the AP numbers in PeekingDuck's docs and the mAP numbers on the TensorFlow website cannot be directly compared.

However, if you look at the relative numbers, the MoveNet model is indeed better than the PoseNet model.
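For reference, whole-dataset keypoint AP of the kind PeekingDuck reports is the standard COCO protocol and can be reproduced with pycocotools. Below is a minimal sketch; the annotation path and the `movenet_predictions.json` file name are hypothetical placeholders, and it assumes predictions have already been exported in the COCO keypoint results format:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground-truth keypoint annotations for COCO val2017 (path is a placeholder).
coco_gt = COCO("annotations/person_keypoints_val2017.json")

# Model predictions in COCO keypoint results format (hypothetical file name).
coco_dt = coco_gt.loadRes("movenet_predictions.json")

# Evaluate over the entire validation set, multi-person images included.
evaluator = COCOeval(coco_gt, coco_dt, iouType="keypoints")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # stats[0] is AP averaged over OKS thresholds 0.50:0.95
```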

@saifkhichi96
Author

But wouldn't it be better if the results were reported in the same way as on the TensorFlow website? It is my understanding that it is common practice for most human pose estimation models to first use a detector to crop out the people and then compute accuracy on those crops.
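For what it's worth, a rough approximation of that single-person protocol can be had by restricting the pycocotools evaluation from the earlier snippet to images containing exactly one annotated person. This is only a sketch under that assumption: the exact filter-and-crop procedure behind the TensorFlow numbers is not specified, and `coco_gt` / `coco_dt` are the objects defined above.

```python
# Restrict evaluation to images with exactly one (non-crowd) person
# annotation, approximating the "single person" subset described on the
# TensorFlow website. Cropping each image around that person, as the site
# describes, would additionally require re-running inference on the crops.
single_person_ids = [
    img_id
    for img_id in coco_gt.getImgIds()
    if len(coco_gt.getAnnIds(imgIds=img_id, iscrowd=False)) == 1
]

evaluator = COCOeval(coco_gt, coco_dt, iouType="keypoints")
evaluator.params.imgIds = single_person_ids
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()
```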
