Transform slides and speaker notes into video.
cargo install trv
Or with cargo binstall:
cargo binstall trv
Create a Typst presentation with speaker notes:
#import "@preview/polylux:0.4.0": *
#set page(paper: "presentation-16-9")
#set text(size: 25pt)
#slide[
#toolbox.pdfpc.speaker-note(
```md
What if you could show code in a video?
```
)
\
#align(center)[Code examples or code videos?]
]
To create a video without an API key or an internet connection, you can self-host Kokoros; see the Kokoros section for more information. For state-of-the-art speech, see the Zyphra Zonos section.
A simple alternative is to use the hosted version at https://kokoros.transformrs.org. For example, this command creates a video using the hosted service:
$ trv --input=presentation.typ \
    --provider='openai-compatible(kokoros.transformrs.org)' \
    --model=tts-1 \
    --voice=bm_lewis \
    --audio-format=wav \
    --release
To create a video from the presentation with DeepInfra, run:
$ export DEEPINFRA_KEY="<YOUR KEY>"
$ trv --input=presentation.typ
INFO Generating audio file for slide 0
INFO Generating audio file for slide 1
INFO Creating video clip _out/1.mp4
INFO Created video clip _out/1.mp4
INFO Creating video clip _out/2.mp4
INFO Created video clip _out/2.mp4
INFO Concatenated video clips into _out/out.mp4
Now, the presentation is available as _out/out.mp4.
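You can check the result with any video player, for example with mpv (assuming it is installed):
$ mpv _out/out.mp4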
A benefit of DeepInfra is that it offers some extra voices compared to Kokoros.
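A different voice can be selected with the --voice flag shown earlier. The voice name below is only an illustration; check DeepInfra's documentation for the names that the model actually provides:
$ trv --input=presentation.typ --voice=af_bella  # NOTE: example voice name, may differ on DeepInfra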
The easiest way to use Kokoros locally is via its Docker image:
$ git clone https://github.com/lucasjinreal/Kokoros.git
$ cd Kokoros/
$ docker build --rm -t kokoros .
$ docker run -it --rm -p 3000:3000 kokoros openai
Then, you can point trv at the local server as the provider:
$ trv --input=presentation.typ --provider='openai-compatible(localhost:3000)'
Running the Zyphra Zonos model locally requires 8 GB of VRAM, so it's probably easiest to use it through DeepInfra:
$ export DEEPINFRA_KEY="<YOUR KEY>"
$ trv --input=presentation.typ \
    --model='Zyphra/Zonos-v0.1-hybrid' \
    --voice='american_male' \
    --release
To create a portrait video, like a YouTube Short, you can set the page to
#set page(width: 259.2pt, height: 460.8pt)
The rest should work as usual. Since Typst is set to 300 DPI, this automatically creates slides with 1080 × 1920 resolution (259.2 pt = 3.6 in, and 3.6 in × 300 DPI = 1080 px; likewise 460.8 pt gives 1920 px). Next, ffmpeg automatically scales the video to a height of 1920 pixels, so in this case the height is unchanged. For landscape videos, the image may be scaled down to a height of 1920 pixels.
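Putting this together, a minimal portrait presentation could look like the sketch below. It reuses the structure of the earlier example; the text size is only a guess, so adjust it to taste.
#import "@preview/polylux:0.4.0": *

#set page(width: 259.2pt, height: 460.8pt)
#set text(size: 18pt) // NOTE: assumed size for the narrower page, adjust as needed

#slide[
  #toolbox.pdfpc.speaker-note(
    ```md
    This note is read aloud while the portrait slide is shown.
    ```
  )
  \
  #align(center)[A portrait slide]
]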
Audio is generated using the transformrs crate. It supports multiple providers, including DeepInfra, OpenAI, and Google.
So trv should also work with providers other than DeepInfra.
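For example, something like the following might work for OpenAI; the provider name, environment variable, and voice below are assumptions, so check trv --help and the transformrs documentation for the exact names:
$ export OPENAI_KEY="<YOUR KEY>"  # NOTE: assumed variable name
$ trv --input=presentation.typ \
    --provider=openai \
    --model=tts-1 \
    --voice=alloy  # NOTE: assumed provider value and voice name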
However, during testing, I got the best results, at the lowest price, with Kokoros or DeepInfra.
For example, OpenAI's text-to-speech terms require any video to contain a "clear disclosure" that the voice the viewer is hearing is AI-generated.
Google, meanwhile, has the best text-to-speech engine that I've found, as part of Gemini 2.0 Flash Experimental. However, its audio output is not yet available via the API.