Feature normalization can cause NaN to appear #64

rcontrai · 2022-06-16T12:20:46Z

I was trying to fine-tune the model on a French corpus when I noticed that the loss kept turning into NaN, which corrupted the model's parameters.

After some investigation I found a culprit: the code that normalises the audio features (specifically allosaurus.pm.utils.feature_cmvn()) has several problems that can turn the features into NaN, which then propagate through the rest of training.

First, on line 10, the computation of spk_std is numerically unstable and can produce a negative variance (for instance, -0.156 where numpy.var() finds 4.50e-07). Taking the square root of that negative value then returns NaN.
This can be fixed by replacing this line with spk_std = np.std(feature, axis=0) (line 9 can then be removed as well).
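To illustrate the first problem, here is a minimal sketch. It assumes the original line computes the variance as "mean of squares minus square of the mean", which is what the negative result suggests; the function names `naive_std` and `stable_std` and the synthetic data are hypothetical and are not taken from allosaurus.

```python
import numpy as np

def naive_std(feature):
    """One-pass formula sqrt(E[x^2] - E[x]^2) (assumed shape of the original code)."""
    frame_cnt = feature.shape[0]
    mean = np.sum(feature, axis=0) / frame_cnt
    mean_sq = np.sum(feature * feature, axis=0) / frame_cnt
    # Two nearly equal numbers are subtracted here; rounding error can
    # make the difference slightly negative, and sqrt then yields NaN.
    return np.sqrt(mean_sq - mean * mean)

def stable_std(feature):
    """np.std averages squared deviations from the mean, so the variance cannot be negative."""
    return np.std(feature, axis=0)

# Hypothetical data: a large offset with tiny fluctuations, stored as float32.
rng = np.random.default_rng(0)
feature = (1000.0 + 1e-3 * rng.standard_normal((500, 4))).astype(np.float32)

with np.errstate(invalid="ignore"):
    print("naive :", naive_std(feature))   # may be NaN or far from the true value
print("stable:", stable_std(feature))      # finite and non-negative
```

On well-conditioned input the two functions agree; the difference only shows up when the mean is large compared to the spread, as in the attached sample.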

Second, on line 12, the features are divided by the standard deviation, but nothing guarantees that it is non-zero. As a result, any feature whose variance is zero is turned into NaN.
This can be fixed by adding the line spk_std += (spk_std == 0.) before the division, which replaces the zeros with ones.
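Putting both fixes together, the normalisation could look roughly like the sketch below. The name `feature_cmvn_safe` is hypothetical and this is not the actual allosaurus code, only the two changes proposed above applied to a plain mean/variance normalisation.

```python
import numpy as np

def feature_cmvn_safe(feature):
    """Mean/variance normalisation with the two proposed fixes (illustrative sketch)."""
    feature = np.asarray(feature)
    spk_mean = np.mean(feature, axis=0)
    # Fix 1: let numpy compute the standard deviation instead of E[x^2] - E[x]^2.
    spk_std = np.std(feature, axis=0)
    # Fix 2: replace zero standard deviations with one, so that constant
    # features become 0 after centring instead of 0/0 = NaN.
    spk_std += (spk_std == 0.)
    return (feature - spk_mean) / spk_std

# The second column is constant, so its standard deviation is exactly zero.
feature = np.array([[1.0, 5.0],
                    [3.0, 5.0]])
print(feature_cmvn_safe(feature))
```

Note that the guard only catches a standard deviation that is exactly zero. A very small but non-zero value still passes through and produces large normalised values; flooring with a small epsilon would be an alternative, at the cost of choosing that threshold.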

Here is a file for which these problems occur, taken from the Mozilla CommonVoice dataset.
FNH4QW-sample-0.wav.zip

xinjli · 2022-06-17T17:49:23Z

Hi, thanks for the detailed comments and suggestions!
I did not know that computing the std could be numerically unstable, thanks for debugging this.
I will prepare a fix to update those.

Thanks a lot!
