Seed-VC

Zero-shot voice conversion trained according to the scheme described in SEED-TTS.
The VC quality is surprisingly good in terms of both audio quality and timbre similarity. We have decided to continue along this path to see how far it can go.

TODO:

  • Release code
  • Release v0.1 pretrained model: Hugging Face
  • Hugging Face Space demo: Hugging Face
  • HTML demo page (maybe with comparisons to other VC models): Demo
  • Code for training on custom data
  • Streaming inference
  • Potential architecture improvements
  • More to be added