We provide a beginner recipe that demonstrates how to train a high-quality HiFi-GAN speech vocoder. Notably, it is also the official implementation of our paper "Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder". Some demos can be seen here.
A neural vocoder generates audible waveforms from acoustic representations, making it one of the key components of current audio generation systems. Amphion currently supports a range of widely used vocoders, grouped by type:
- GAN-based vocoders, for which we provide a unified recipe:
- Flow-based vocoders (👨‍💻 developing):
- Diffusion-based vocoders, for which we provide a unified recipe:
- Autoregressive vocoders (👨‍💻 developing):