- Initial recorded version.
- Includes basic features: TalkNet wav transfer, basic voice parameter controls with WORLD, and support for the neuTalk-specific speaker library formats (.ntk, .ntk_cfg, .ntkpkg).
- English and Japanese synthesis with TalkNet/HiFi-GAN and Tacotron 2/HiFi-GAN, respectively (the HiFi-GAN models share the same format and training script).
- Multiple Voice Models under a single Speaker, with 'sub-model' selection (only a single language is supported)
- Standard v1.1β | ~1h15m data
- Alpha v1.0 | ~1h42m data
- Mellow v0.8β | ~16m data
- Sing (Experimental) v1.0 | ~23m data
- Standard v1.1β | ~1h30m data
- ~26h data from 4 speakers (LJ-Speech, Tiger (Standard), Glim, another random male speaker)
- ~3h data from many speakers (Mozilla Common Voice Japanese)