Amphion Vocoder Recipe

Quick Start

We provide a beginner recipe that demonstrates how to train a high-quality HiFi-GAN speech vocoder. The recipe is also an official implementation of our paper "Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder". Some demos can be seen here.

Supported Models

A neural vocoder generates audible waveforms from acoustic representations and is one of the key components of current audio generation systems. Amphion currently supports a range of widely used vocoders, grouped by type:

- GAN-based vocoders, for which we provide a unified recipe:
  - MelGAN
  - HiFi-GAN
  - NSF-HiFiGAN
  - BigVGAN
  - APNet
- Flow-based vocoders (👨‍💻 developing):
  - WaveGlow
- Diffusion-based vocoders, for which we provide a unified recipe:
  - DiffWave
- Autoregressive vocoders (👨‍💻 developing):
  - WaveNet
  - WaveRNN
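
To make the phrase "generates audible waveforms from acoustic representations" concrete, here is a minimal sketch of the length relationship an upsampling vocoder typically maintains between input frames and output samples. The function name `waveform_length` and all numeric values are illustrative assumptions; they are not taken from Amphion's code or configuration files.

```python
from math import prod
from typing import Sequence


def waveform_length(n_frames: int, upsample_rates: Sequence[int]) -> int:
    """Return how many audio samples are produced for n_frames acoustic frames.

    Assumes the vocoder upsamples by a fixed factor per stage, so the total
    factor (the product of the stages) plays the role of the hop size used
    when the acoustic features were extracted.
    """
    return n_frames * prod(upsample_rates)


# Illustrative values only (hypothetical, not read from any Amphion config).
upsample_rates = [8, 8, 2, 2]  # total upsampling factor: 8 * 8 * 2 * 2 = 256
sample_rate = 22050            # samples per second
n_frames = 100                 # e.g. 100 mel-spectrogram frames

n_samples = waveform_length(n_frames, upsample_rates)
print(n_samples)                # number of generated audio samples
print(n_samples / sample_rate)  # duration of the generated audio in seconds
```

Under these assumptions, a mismatch between the feature hop size and the total upsampling factor would yield audio whose length no longer lines up with the input frames, which is why the two are usually configured together.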
