Text to speech implementation for Unity 3D

- Overview
  - Model
  - Tokenizer
- How to run ljspeech-jets-onnx in Unity 3D with Sentis
- The Story
  - Muse Chat history of importing ljspeech-jets-onnx into Unity 3D

Overview

This repo contains a text to speech implementation for Unity 3D. Because it relies on the python.scripting package, which only works in the Editor, it's not possible to make a build. If you run it in the Editor, it will play the audio and write an output.wav file into the StreamingAssets folder. See a sample output here.

Model

I'm using the Unity Sentis Package together with ljspeech-jets-onnx. The ONNX model is not included in this repo and has to be downloaded separately. You can download it from Hugging Face ljspeech-jets-onnx.

Tokenizer

The tokenization is done with the help of the python.scripting package. Note: The package can only be used in the Editor. To perform tokenization into phonemes without Python, have a look at this post.
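To make the tokenization step more concrete, here is a minimal sketch of its last stage: turning a phoneme sequence into the integer ids a model takes as input. The token table, the `phonemes_to_ids` helper, and the ARPAbet-style symbols are all assumptions for illustration; the real token list belongs to the model, and the text-to-phoneme conversion (which this repo delegates to Python) is not shown.

```python
# Hypothetical token table for illustration only; the real list ships with the model.
TOKENS = ["<blank>", "<unk>", "HH", "AH0", "L", "OW1", "W", "ER1", "D"]

# Map each token to its index so lookups are O(1).
TOKEN_TO_ID = {token: index for index, token in enumerate(TOKENS)}
UNK_ID = TOKEN_TO_ID["<unk>"]


def phonemes_to_ids(phonemes):
    """Convert phoneme symbols to integer ids, using <unk> for unknown symbols."""
    return [TOKEN_TO_ID.get(phoneme, UNK_ID) for phoneme in phonemes]


# "ZZ" is not in the table, so it falls back to the <unk> id.
ids = phonemes_to_ids(["HH", "AH0", "L", "OW1", "ZZ"])
print(ids)
```

A plain lookup table like this is easy to port to C#, which is one reason a Python-free tokenizer may be achievable once the phoneme conversion itself is solved.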

How to run ljspeech-jets-onnx in Unity 3D with Sentis

You need to download the ONNX file from https://huggingface.co/NeuML/ljspeech-jets-onnx/tree/main. The model file, Assets/[TTS]/Data Models/ljspeech-jets-onnx/model.onnx, is not part of the repository and is ignored via the .gitignore to save LFS space.
You might need to reimport the model: right-click the model asset and reimport it (explanation: https://discussions.unity.com/t/binarizer-sample-add-a-custom-layer-only-needs-a-reimport/279200).
The import will still display an error, but it can be disregarded. We will be removing the final layers.
You can execute the scene "TTS Test", which will produce Assets\StreamingAssets\output.wav for the string "Hello World! I wish I could speak."
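For readers curious what producing output.wav involves, the following is a rough sketch of writing mono float samples as a 16-bit PCM WAV file, using only Python's standard library. It is not the code used in this repo (that runs in Unity); the `write_wav` helper, the sine tone standing in for the model output, and the 22050 Hz sample rate are assumptions.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # assumption: sample rate commonly used by LJSpeech-trained models


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        clipped = max(-1.0, min(1.0, sample))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


# Stand-in for the model output: one second of a 440 Hz tone at half amplitude.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
write_wav("output.wav", tone)
```

The same three steps (clip, scale to 16-bit integers, write a header plus the frames) apply whichever language does the writing.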

The Story

This was my first attempt at using Sentis over a weekend, and I had no prior experience working with AI or ML code.

I spent several hours searching for other ONNX models without the "If" operator but didn't find any. I then got more help at https://discussions.unity.com/t/model-didnt-import-ljspeech-jets-onnx/265609/13.

As pointed out in the forum, modifying the ONNX outside of Unity might have been faster, especially with ChatGPT (GPT-4) providing me with step-by-step guidance. Using its code interpreter was not feasible, though, because it lacked access to the ONNX library.

Another option would have been to learn how to create an ONNX myself; however, I did not want to take that route this time.

A very helpful tool was https://netron.app/, which makes it easy to understand what the ONNX model is doing. Here is a screenshot of the sections with the "If" operator as well as the inputs and outputs of ljspeech-jets-onnx:

Muse Chat history of importing ljspeech-jets-onnx into Unity 3D

I posted some questions that I could have answered myself, but I thought it would be fun to see where the conversation led.

You can find a text copy of the conversation at Copy Of Chat.
And here are the more readable images.


mrwellmann/Unity-Text-To-Seech-with-Sentis
