
About reduction_factor_schedule #79

Closed
taylorlu opened this issue Jan 23, 2021 · 3 comments

Comments

@taylorlu

Hi, thanks for sharing this great work.
I want to ask about the training trick behind the dynamic output length in the decoder module; the relevant variables self.max_r and self.r can be found in models.py.
The purpose seems to be making training harder at the beginning, since the model must predict the whole mel sequence from less data, and easier as reduction_factor_schedule shrinks, which means a longer effective output length. It looks a bit like a simulated annealing algorithm. Does it really work the way I described? And what happens when self.max_r and self.r are not the same?

@myagues

myagues commented Jan 28, 2021

Your intuition is right!
You use large values of reduction_factor at the start of training because the missing data forces the model to rely on the attention alignments. You can also think of it as a kind of dropout for auto-regressive models. Then you can begin lowering the reduction_factor, which improves the detail of the predicted mel spectrogram because the model has more information to work with. Here is an explanation using a Tacotron2 model.
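The scheduling idea above can be sketched as a simple step function from training step to reduction factor. This is an illustrative sketch only; the names and the schedule values are hypothetical, not the repo's actual API:

```python
# Hypothetical step schedule: each pair is (start_step, reduction_factor).
# Train with a coarse r first, then lower r so the decoder predicts
# more mel frames from more information. Values are made up for illustration.
SCHEDULE = [(0, 10), (20_000, 5), (50_000, 2), (80_000, 1)]

def reduction_factor(step, schedule=SCHEDULE):
    """Return the reduction factor active at the given training step."""
    r = schedule[0][1]
    for start, value in schedule:
        if step >= start:
            r = value  # the last threshold we have passed wins
    return r
```

The largest value in the schedule (here 10) is what `self.max_r` would be, so the projection layer is sized once and never changes shape.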

What happens when self.max_r and self.r are not the same?

You need your model layers to have static shapes, so you initialize the projection output with the largest value in reduction_factor_schedule:

self.final_proj_mel = tf.keras.layers.Dense(self.mel_channels * self.max_r, name='FinalProj')

When you reduce the value of self.r during training, the layer keeps the same size, but you select just a part of its output:

# keep only the first r * mel_channels units of the full projection
out_proj = self.final_proj_mel(dec_output)[:, :, :self.r * self.mel_channels]
b = tf.shape(out_proj)[0]
t = tf.shape(out_proj)[1]
# unfold each decoder step into r mel frames
mel = tf.reshape(out_proj, (b, t * self.r, self.mel_channels))
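To make the shape manipulation concrete, here is the same slice-and-reshape written with NumPy and toy dimensions (all values are assumed for illustration; the repo's real code uses TensorFlow as shown above):

```python
import numpy as np

# Assumed toy values: max_r=5, current r=2, 80 mel channels,
# batch of 4, and t=7 decoder steps.
max_r, r, mel_channels = 5, 2, 80
b, t = 4, 7

# The Dense layer's full output has max_r * mel_channels units per step.
dec_output = np.random.rand(b, t, max_r * mel_channels).astype(np.float32)

# Keep only the first r * mel_channels features of each decoder step...
out_proj = dec_output[:, :, : r * mel_channels]   # shape (4, 7, 160)

# ...then unfold each decoder step into r mel frames.
mel = out_proj.reshape(b, t * r, mel_channels)    # shape (4, 14, 80)

# Frames 0 and 1 of `mel` both come from decoder step 0:
assert np.allclose(mel[0, 0], out_proj[0, 0, :mel_channels])
assert np.allclose(mel[0, 1], out_proj[0, 0, mel_channels:])
```

So each decoder step emits r mel frames, and lowering r simply means fewer frames per step with more of the sequence fed back through the auto-regressive loop.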

@taylorlu
Author

taylorlu commented Feb 1, 2021

Thanks for your elaboration.

@cfrancesco
Contributor

Thank you @myagues, excellent explanation.

@taylorlu taylorlu closed this as completed Feb 2, 2021
3 participants