
Calibrate confidence scores #273

Open
sfmig opened this issue Aug 16, 2024 · 2 comments
Labels
enhancement New optional feature

Comments

@sfmig
Contributor

sfmig commented Aug 16, 2024

Is your feature request related to a problem? Please describe.
We usually interpret confidence scores as a proxy for the error in the keypoint predictions. However, it is well known that neural networks tend to be "overly confident" in their predictions. For example, for the multiclass classification case, reference [1] says:

the softmax output of modern neural networks, which typically is interpreted as a categorical distribution in classification, is poorly calibrated.

It would be very useful to be able to produce calibrated confidence scores of the keypoint predictions. That would allow us to compare results across frameworks, better filter high/low confidence values, and better interpret model performance.
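As an aside, a standard remedy for this overconfidence in classification is temperature scaling: dividing the logits by a scalar T > 1 before the softmax, with T fitted on held-out data. A minimal sketch (the logit values are made up for illustration) of how temperature softens an overconfident distribution:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax with temperature; T > 1 softens the distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([5.0, 1.0, 0.5])
# Uncalibrated: the top class gets a near-1 probability
p_raw = softmax(logits)
# With T = 3 the same logits yield a less "confident" distribution
p_scaled = softmax(logits, temperature=3.0)
```

In practice T is chosen to minimise negative log-likelihood on a validation set; the point here is only that the ranking of classes is preserved while the claimed confidence drops.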

Describe the solution you'd like
We could consider having a method in movement that calibrates confidence scores.

We could implement something similar to what keypoint-moseq does. They have functionality to fit a linear model to the relationship between keypoint error and confidence:

[the function] creates a widget for interactive annotation in jupyter lab. Users mark correct keypoint locations for a sequence of frames, and a regression line is fit to the log(confidence), log(error) pairs obtained through annotation. The regression coefficients are used during modeling to set a prior on the noise level for each keypoint on each frame.

Describe alternatives you've considered

Additional context
Nice explanations exist for the classification case (note that pose estimation is a regression problem, not a classification one). From a quick search I found:

  • [1] this paper, on the calibration of human pose estimation. They propose a neural network that learns specific adjustments for a given pose estimator. It seems out of scope for movement, but may be a useful read to understand the problem better.
  • this paper, on calibration for object detection, which could be similarly useful.
@sfmig sfmig added the enhancement New optional feature label Aug 16, 2024
@sfmig
Contributor Author

sfmig commented Aug 27, 2024

This EuroSciPy tutorial may be useful for this work.

@sfmig
Contributor Author

sfmig commented Mar 3, 2025

Note that pose estimation is a regression problem, not a classification one, so we probably want to look into ways of applying these calibration methods to a regression setting in a reasonable way.

For example, maybe we can "transform" the problem into a classification one by deeming a keypoint correctly predicted if it lies close enough to the ground-truth label. This seems reasonable at first glance.
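A minimal sketch of that idea (the 5 px threshold, function names, and binning scheme are all illustrative assumptions): binarise correctness with a PCK-style distance threshold, then compare binned confidence against empirical accuracy, as one would in a reliability diagram:

```python
import numpy as np

def to_binary_correctness(pred_xy, true_xy, threshold_px=5.0):
    """Label a prediction 'correct' if within threshold of ground truth
    (a PCK-style criterion; the 5 px threshold is an arbitrary example)."""
    err = np.linalg.norm(np.asarray(pred_xy) - np.asarray(true_xy), axis=-1)
    return (err < threshold_px).astype(float)

def reliability_bins(confidence, correct, n_bins=10):
    """Mean confidence vs. empirical accuracy per confidence bin; for a
    well-calibrated score the two should match within each bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(confidence, edges) - 1, 0, n_bins - 1)
    stats = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            stats.append((confidence[mask].mean(), correct[mask].mean()))
    return stats
```

Once correctness is binary, standard classification-calibration tools (Platt scaling, isotonic regression, etc.) could in principle be applied to the confidence scores, though the result would depend on the chosen distance threshold.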
