Kokoro library and CLI tool in Rust. Batteries included, just a single binary, no runtime dependencies [1].
🚨 This project is currently usable, but pretty barebones and far from finished. 🚨
Significant changes can happen.
Use Kokoro in your terminal with an everything-included binary, or easily embed it in your project as a library.
In short, this project embeds a Kokoro onnx file and various Kokoro voice files (currently not all of them), and runs the model using the ort crate, which is statically linked, meaning that everything is included in the final binary.
The CLI tool additionally uses Phonemoro as its phonemizer, which also embeds everything it needs, resulting in a fully functioning text-to-speech system within a single binary.
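To make the "everything is embedded" idea concrete, here is a minimal sketch of compile-time asset embedding in Rust. It is not taken from speakoro's source: the constant names, the `embedded_voice` helper, and the placeholder byte strings are all hypothetical. In a real crate the constants would more likely be produced with `include_bytes!` pointing at the model and voice files; byte-string literals stand in here so the example compiles on its own.

```rust
// Hypothetical stand-ins for embedded assets. A real build would likely use
// something like `include_bytes!("<path-to-model>")` instead of literals.
const MODEL_BYTES: &[u8] = b"placeholder model data";
const VOICE_AF_BELLA: &[u8] = b"placeholder voice data: af_bella";
const VOICE_BM_DANIEL: &[u8] = b"placeholder voice data: bm_daniel";

/// Hypothetical lookup: return the embedded bytes for a voice name, if any.
fn embedded_voice(name: &str) -> Option<&'static [u8]> {
    match name {
        "af_bella" => Some(VOICE_AF_BELLA),
        "bm_daniel" => Some(VOICE_BM_DANIEL),
        _ => None,
    }
}

fn main() {
    // The data lives inside the binary, so no file access happens at runtime.
    println!("embedded model size: {} bytes", MODEL_BYTES.len());
    match embedded_voice("af_bella") {
        Some(bytes) => println!("af_bella voice: {} bytes", bytes.len()),
        None => println!("voice not embedded"),
    }
}
```

Because the assets are constants in the binary, the program needs no data directory at runtime, which is the property the description above relies on.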
Features:
- easy to build and use
- no special runtime dependencies
- single binary with everything embedded
- portable
- doesn't use espeak, so none of the licensing issues
- suitable for mobile use [2]
Since this project is based on Kokoro, a model file (onnx) and the voice files are needed. You can either download them manually, or enable the download-data feature to have them downloaded automatically during the build.
Both ways are described below.
- Add speakoro to your project:
  - Easy Way (Recommended)
    - Add speakoro directly to your project, with the download-data flag enabled:
      $ cargo add --git https://github.com/lastleon/speakoro speakoro -F download-data
    ⚠️ Warning: This automatically downloads the necessary files from Huggingface. If you don't want that, proceed with Harder Way.
  - Harder Way
    Use this only if you're uncomfortable downloading from the internet, or you want to use your own data.
    - Clone this repository to a location outside your project and enter it:
      $ git clone https://github.com/lastleon/speakoro && cd speakoro
    - Create the onnx model and voice directories:
      $ mkdir -p data/{onnx,voice}
    - Download the desired model and the English voices from onnx-community/Kokoro-82M-v1.0-ONNX, place the model in data/onnx, and place the voices in data/voices.
    - Back in your project, add speakoro as a dependency:
      $ cargo add --path <path-to-the-cloned-speakoro-repo> speakoro
- Set the SPEAKORO_MODEL_FILE environment variable to choose which model should be used (and downloaded, if enabled). You can either:
  - Set it within the .cargo/config.toml file in your project:
    [env]
    # See https://huggingface.co/onnx-community/Kokoro-82M-v1.0-ONNX/tree/main/onnx for a list of available options. Note that not all models work, you have to test that out.
    # Recommendations: model.onnx, model_fp16.onnx, model_uint8.onnx
    SPEAKORO_MODEL_FILE = "model_uint8.onnx"
  - Or set the variable during the build:
    $ SPEAKORO_MODEL_FILE=model_uint8.onnx cargo build --release
- Use the library like so:
  use speakoro::{Kokoro, KokoroVoice};
  use anyhow::Result;

  fn main() -> Result<()> {
      // Create the Kokoro instance (the model and voices are embedded in the binary).
      let kokoro = Kokoro::new()?;
      // Synthesize audio from a phoneme string with the chosen voice.
      let audio = kokoro.phonemes2audio("həlˈO wˈɜɹld", KokoroVoice::AF_BELLA, 1f32)?;
      // Write the generated audio to a WAV file.
      speakoro::utils::write_to_wav(audio, "audio.wav")?;
      Ok(())
  }

💡 Note: To see an end-to-end example, go to the speakoro-cli crate. It utilizes the closely related Phonemoro project as the phonemizer.
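The model selection step above can be pictured with a small sketch. It assumes the crate reads SPEAKORO_MODEL_FILE at build time and falls back to model_uint8.onnx when the variable is unset, which matches the default mentioned in this README. The helper names `resolve_model_file` and `model_path` are hypothetical and not part of speakoro's API; the `data/onnx` location is the one used in the Harder Way setup.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Default named in this README for when SPEAKORO_MODEL_FILE is not set.
const DEFAULT_MODEL_FILE: &str = "model_uint8.onnx";

/// Hypothetical helper: choose the model file name from an optional override.
/// Empty or whitespace-only values are treated as "not set".
fn resolve_model_file(override_value: Option<&str>) -> String {
    match override_value.map(str::trim) {
        Some(name) if !name.is_empty() => name.to_string(),
        _ => DEFAULT_MODEL_FILE.to_string(),
    }
}

/// Hypothetical helper: where the Harder Way layout would keep the model file.
fn model_path(data_dir: &Path, model_file: &str) -> PathBuf {
    data_dir.join("onnx").join(model_file)
}

fn main() {
    // Read the variable the same way a build script could (an assumption).
    let from_env = env::var("SPEAKORO_MODEL_FILE").ok();
    let model_file = resolve_model_file(from_env.as_deref());
    let path = model_path(Path::new("data"), &model_file);
    println!("model file: {}", model_file);
    println!("expected at: {}", path.display());
}
```

Keeping the fallback in one place means both ways of setting the variable (`.cargo/config.toml` or the command line) behave identically.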
The CLI tool uses Phonemoro as the phonemizer.
⚠️ Warning: Building the CLI tool requires downloading the necessary data for both speakoro and phonemoro.
Building phonemoro without downloading its data requires some setup in a directory outside this project, which makes it a fairly involved process. For that reason, there is no feature flag for building the CLI without downloading data: such a flag would only be meaningful if phonemoro were also built without downloading data, and for that it would need to be added as a dependency in a different way. An offline build is described at the end of this section.
The onnx model and voice files are downloaded from Huggingface; the data phonemoro needs is downloaded from the releases page of phonemoro.
- Clone this repository:
  $ git clone https://github.com/lastleon/speakoro
- (Optional): Change the Kokoro model you want to use. For that, follow step 2 of Usage > As a Library. By default, model_uint8.onnx is used.
- Build speakoro-cli:
  $ cargo build -p speakoro-cli --release
- Usage:
  $ ./target/release/speakoro-cli --help
  Usage: speakoro-cli [OPTIONS] <text>

  Arguments:
    <text>  Pass the text that should be converted to speech. If the flag --phonemes is set, this will be interpreted as raw phonemes.

  Options:
    -v, --voice <voice>  Set which voice should be used to generate audio. [default: af_bella] [possible values: af_heart, af_bella, af_nicole, af_aoede, bf_emma, bf_isabella, am_adam, am_fenrir, bm_daniel]
    -p, --phonemes       If set, the passed text will be interpreted as phonemes.
    -o, --out <out>      Set filepath to where the audio will be written to. Note that the output format is WAV. [default: audio.wav]
    -h, --help           Print help
    -V, --version        Print version

Offline Build:
- Clone this repository and add the necessary data as described in Usage > As a Library (Harder Way).
- Go to Phonemoro, and follow the offline build instructions (Usage (lib) > Harder Way) to use it as a library, but don't add it to speakoro-cli yet.
- Go to speakoro-cli and replace the phonemoro dependency like so:
  $ cargo rm phonemoro && cargo add --path <path-to-the-cloned-phonemoro-repo> phonemoro
- Remove the download-data feature from speakoro:
  $ cargo rm speakoro && cargo add --path .. speakoro
- Optionally change the Kokoro model as described before, then build speakoro-cli:
  $ cargo build -p speakoro-cli --release

TODO (main limitation is ort, which you might need to manually build)
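For scripting the speakoro-cli binary built in the CLI section, the following sketch assembles the arguments from the options listed in its `--help` output (`--voice`, `--phonemes`, `--out`, and the positional text). The `SpeakOptions` struct and `build_args` function are hypothetical helpers for illustration, and the binary path assumes a release build in the default cargo target directory.

```rust
use std::process::Command;

/// Hypothetical options mirroring the flags shown in the --help output.
struct SpeakOptions<'a> {
    text: &'a str,
    voice: Option<&'a str>,
    phonemes: bool,
    out: Option<&'a str>,
}

/// Build the argument list; unset options are left to the CLI's defaults.
fn build_args(opts: &SpeakOptions) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(voice) = opts.voice {
        args.push("--voice".to_string());
        args.push(voice.to_string());
    }
    if opts.phonemes {
        args.push("--phonemes".to_string());
    }
    if let Some(out) = opts.out {
        args.push("--out".to_string());
        args.push(out.to_string());
    }
    // The text (or raw phonemes) is the single positional argument.
    args.push(opts.text.to_string());
    args
}

fn main() {
    let opts = SpeakOptions {
        text: "Hello world",
        voice: Some("bf_emma"),
        phonemes: false,
        out: Some("hello.wav"),
    };
    let args = build_args(&opts);
    // Assumed binary location after `cargo build -p speakoro-cli --release`.
    match Command::new("./target/release/speakoro-cli").args(&args).status() {
        Ok(status) => println!("speakoro-cli exited with {}", status),
        Err(err) => eprintln!("could not start speakoro-cli: {}", err),
    }
}
```

Passing the arguments through `Command::args` avoids shell quoting problems, which matters for text containing spaces or phoneme symbols.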
- hexgrad/Kokoro: The model this library is based on.
- onnx-community/Kokoro-82M-v1.0-ONNX: The quantized, ONNX-converted models this library uses.
- lucasjinreal/Kokoros: Another "Kokoro in Rust" project I recently found out about. It has more features and almost certainly better phonemization, since it uses espeak as a backend. However, it needs Python (and possibly PyTorch) for the installation, and requires a vendored espeak as well as Kokoro onnx models and voice data in external directories.
So, if you need any of the additional features Kokoros provides, or better phonemization, use Kokoros. If you need a self-contained binary, want easier installation or usage as a library, or don't want to use espeak because of licensing issues, use speakoro.
This project utilizes data from onnx-community/Kokoro-82M-v1.0-ONNX, licensed under the Apache License 2.0.
speakoro is licensed under the MIT License.
Footnotes
[1] Apart from the usual suspects, such as libc.so.
[2] Depending on the platform, you might need to build the onnx runtime yourself, though. Also yes, this is kind of fast enough to properly run on a phone! :)