utterance ref cannot be empty ? #98

Open
KarelVesely84 opened this issue Dec 4, 2024 · 6 comments
Comments

@KarelVesely84

Hello,
Is there a good reason why an utterance ref is required to be non-empty?
https://github.com/jitsi/jiwer/blob/9db6e4649dfff1e91de5640e224ea51de01b0a50/jiwer/process.py#L158C1-L159C69

IMHO, I'd expect that it can be empty (sclite behavior).
This is a valid situation: if an utterance in the test set contains just silence, its reference is empty,
and the ASR system should produce an empty string and not hallucinate any symbol.
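
To illustrate the point, here is a minimal sketch of a word-level edit distance. It is not jiwer's or sclite's implementation, and `word_edit_distance` is a hypothetical helper name. With an empty reference, every hypothesized word counts as an insertion, and an empty hypothesis gives zero errors:

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word lists (substitutions, deletions, insertions)."""
    # prev[j] is the distance between the reference prefix processed so far
    # and hyp_words[:j]; for an empty reference prefix that is j insertions.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # i deletions turn ref_words[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


silence_ref = "".split()  # empty reference: the utterance is just silence
print(word_edit_distance(silence_ref, "".split()))           # correct empty output
print(word_edit_distance(silence_ref, "thank you".split()))  # hallucinated words
```

The first call reports no errors and the second reports two insertions, so the error count itself is well defined for an empty reference.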

I hacked it accordingly here:
https://github.com/KarelVesely84/jiwer/tree/allow_empty_ref

Best regards
Karel Vesely

@nikvaessen
Collaborator

My reasoning at the time was that evaluation datasets like test-clean of Librispeech do not have silent utterances, so it is better to fail fast and let the user know they made a mistake (like substituting the reference and hypothesis list).

@KarelVesely84
Author

OK, would you be open to changing the behavior?

@nikvaessen
Collaborator

Yes, do you think a UserWarning is more appropriate? I think with systems like Whisper, it is valid to test empty reference strings...

@KarelVesely84
Author

Yes, the UserWarning would be good.
It would warn the user in the log, and it would not stop the WER calculation.
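
A rough sketch of what such a check could look like, using Python's standard `warnings` module. The function name `warn_on_empty_references` is hypothetical and this is not the actual jiwer code:

```python
import warnings


def warn_on_empty_references(references):
    """Hypothetical check: warn about empty references instead of raising."""
    # Collect the indices of references that are empty or whitespace-only.
    empty = [i for i, ref in enumerate(references) if not ref.strip()]
    if empty:
        warnings.warn(
            f"{len(empty)} reference(s) are empty (indices {empty}); "
            "check that references and hypotheses were not swapped.",
            UserWarning,
        )
    return empty


# The warning is emitted, and the caller can continue with the WER calculation.
warn_on_empty_references(["hello world", "", "good morning"])
```

Users who want the old fail-fast behavior could still turn the warning into an error with `warnings.simplefilter("error", UserWarning)`.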

@nikvaessen
Collaborator

nikvaessen commented Dec 12, 2024

Do you know how sclite handles the edge-case where we only consider one utterance, with an empty reference? This leads to a division by 0.

@KarelVesely84
Author

KarelVesely84 commented Dec 12, 2024

Not sure how sclite treats that case.

Anyway, this is unlikely to happen, as sclite is typically used with test sets of thousands of utterances.
To have a credible WER for an ASR system, a certain number of words/utterances is necessary in the ref.

With the 1-utterance, 0-ref-word edge case, you are right that this leads to division by zero.

So the WER should be Inf or NaN in that case, I guess?
(so that it is mathematically OK, according to the definition WER = (S + D + I) / #REF)
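
One possible convention, sketched below under the assumption that errors and reference words are pooled over the whole test set before dividing. `pooled_wer` is a hypothetical helper, and the Inf/NaN choice is just the suggestion from this comment, not confirmed sclite or jiwer behavior:

```python
import math


def pooled_wer(error_counts, ref_lengths):
    """WER = (S + D + I) / #REF, pooled over all utterances.

    error_counts[i] is S + D + I for utterance i,
    ref_lengths[i] is its number of reference words (0 for silence).
    """
    total_errors = sum(error_counts)
    total_ref_words = sum(ref_lengths)
    if total_ref_words == 0:
        # Assumed convention: 0/0 is undefined (NaN), errors/0 is infinite.
        return math.nan if total_errors == 0 else math.inf
    return total_errors / total_ref_words


# A silent utterance with 2 hallucinated words, mixed with a normal utterance:
print(pooled_wer([1, 2], [4, 0]))
# Only one utterance, with an empty reference and 2 hallucinated words:
print(pooled_wer([2], [0]))
```

With pooling, an empty reference only causes a division by zero when every reference in the set is empty, which matches the observation that the edge case is unlikely on a realistic test set.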
