Hello,

I noticed that the recent architecture improvements added modules for RoPE positional encoding and for injecting prompts via cross-attention. However, it seems that the two newly released Parler-TTS checkpoints do not use these features (if I understood correctly). Do you have any ablation results on the impact of RoPE positional encoding and of adding prompts in cross-attention? I'm interested in understanding how each of these modules affects the final model's performance.
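For context on the first module: my understanding is that RoPE rotates each query/key dimension pair by a position-dependent angle before the attention dot product, so relative position information falls out of the rotation algebra. Below is a minimal, generic sketch of that standard formulation, not Parler-TTS's actual implementation (the shapes and the `base` value are assumptions):

```python
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension in two and rotate the pairs: (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q: torch.Tensor, k: torch.Tensor, base: float = 10000.0):
    # q, k: (batch, heads, seq_len, head_dim); head_dim must be even.
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    # One frequency per dimension pair, decaying geometrically with dimension index.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    angles = torch.outer(torch.arange(seq_len, dtype=torch.float32), inv_freq)
    angles = torch.cat((angles, angles), dim=-1)  # (seq_len, head_dim)
    cos, sin = angles.cos(), angles.sin()
    # Rotate queries and keys by the position-dependent angles.
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin
```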
Additionally, is there a plan to update the training guide for the latest checkpoints? I'm particularly keen to learn how to fine-tune them.
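In the meantime, I'm assuming the loading/inference entry point is unchanged; here's the minimal sketch I'm working from (the checkpoint name is my guess at one of the new releases, substitute as appropriate):

```python
import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

# Checkpoint name assumed; replace with whichever new checkpoint you fine-tune.
checkpoint = "parler-tts/parler-tts-mini-v1"

model = ParlerTTSForConditionalGeneration.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# The description conditions the voice; the prompt is the text to be spoken.
description = "A female speaker with a calm, clear voice."
prompt = "Hello, this is a quick test."

input_ids = tokenizer(description, return_tensors="pt").input_ids
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    audio = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
print(audio.shape, model.config.sampling_rate)
```

If the training script's expected arguments or data preparation changed for the new checkpoints, that's the part I'd most like to see documented.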
Thank you for your amazing work!