Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential bug in score_norm_emotion.py #9

Open
islam-nassar opened this issue Apr 11, 2023 · 2 comments
Open

Potential bug in score_norm_emotion.py #9

islam-nassar opened this issue Apr 11, 2023 · 2 comments

Comments

@islam-nassar
Copy link

In L424 , the sum_scaled_tp and sum_scaled_fp are calculated based on the fhyp which is a filtered version of ihyp.. since you do not accumulate scaled_tp and scaled_fp as you did with tp and fp in L407, these scaled values will be underestimated. A proposed fix is to use ihyp instead of fhyp here or alternatively accumulate scaled_tp and scaled_fp as well.

@jfiscus
Copy link
Collaborator

jfiscus commented Apr 11, 2023

This is not a bug. (I was tripped by this during implementation.) ihyp is the full record of all the ref->system alignments but the loop on L377 generates scores for multiple correctness constraints (in iou_thresholds) by filtering ihyp AND (this is the messy part) the fhyp filter picks the single, last record if records share the same LLR. (This minimizes the number of points in a PR curve). Since cum_tp and cum_fp are made before the filter, all PR curve inflection points are preserved.

@islam-nassar
Copy link
Author

Thanks for your reply.

I understand the logic for cum_tp and cum_fp: since you apply cumulative sum before filtering ihyp (L407), you end up with the total tp and fp in the last record so they are preserved after the fhyp filter..

However, the same is not true for the scaled versions (pct_tp and pct_fp) because there is no cumulative sum for them. So pct_tp and pct_fp values for any records which share the same LLR would be lost after the fhyp filter (except for the last one).

Is my understanding correct? or totally wrong?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants