The endless scroll is a curse. We're drowning in text. This is the forge where we give words a voice—a pirate's growl, a mad scientist's cackle, an emo teen's lament. Feed it any text, and let it speak. No more reading. Only listening.
For those who can't wait to hear the chaos:
pip install uv # If you don't have uv
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
export OPENAI_API_KEY="your_api_key_here" # Replace with your actual key
cat your_blog_post.md | head -n 5 | ./main.py
Every proper lab needs its foundations. We favour the speed of uv, but the old ways of pip work too.
- Install uv: If you don't have uv installed, you can install it using pip:
  pip install uv
- Create and Activate Virtual Environment:
  uv venv
  source .venv/bin/activate
- Install Dependencies:
  uv pip install -r requirements.txt
- Set OpenAI API Key: The machine needs its fuel. Set your OpenAI API key as an environment variable:
  export OPENAI_API_KEY="your_api_key_here"
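A script that depends on this variable can check for it before doing any work. The following is a minimal sketch, not the actual code in main.py: the helper name require_api_key and the error message are assumptions made for illustration.

```python
import os
import sys


def require_api_key(env=os.environ):
    """Return the OpenAI API key from the environment, or exit with a hint."""
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat the README placeholder as "not set" so a copy-paste mistake is caught early.
    if not key or key == "your_api_key_here":
        sys.exit("OPENAI_API_KEY is not set. Export your real key first.")
    return key
```

Failing early like this gives a readable message in the terminal instead of an error halfway through a request.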
Before you can invoke the machine, you must awaken the environment in each new terminal session. It's a crucial step, lest you be cast into the digital abyss.
source .venv/bin/activate
Once the environment is alive, you can command the script to transmute text into speech. It drinks from files or directly from the ether (stdin). You can stream the audio live or bottle it in a file.
./main.py your_text_file.txt
./main.py your_text_file.txt --save
cat your_text_file.txt | head -n 5 | ./main.py
cat your_text_file.txt | ./main.py --save
Saving with --save generates an MP3 file in the out/ directory. The filename follows the pattern: [voice]_[instructor]_[first_four_words_of_input]_[timestamp].mp3.
The default voice is nova and the default instructor is pirate.
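To make the naming pattern concrete, here is a rough sketch of how such a filename could be assembled. The function name build_output_path, the word sanitizing, and the timestamp layout are all assumptions; the real logic in main.py may differ.

```python
import re
from datetime import datetime
from pathlib import Path


def build_output_path(text, voice="nova", instructor="pirate", now=None, out_dir="out"):
    """Build out/[voice]_[instructor]_[first_four_words]_[timestamp].mp3."""
    # Keep only letters and digits in each word so the name is filesystem-safe (assumption).
    words = [re.sub(r"[^A-Za-z0-9]", "", w).lower() for w in text.split()]
    slug = "_".join([w for w in words if w][:4]) or "untitled"
    # The timestamp layout is an assumption; the README only says "[timestamp]".
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(out_dir) / f"{voice}_{instructor}_{slug}_{stamp}.mp3"
```

Passing now explicitly keeps the function deterministic, which makes the pattern easy to check in a test.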
You can customize the voice and instructor persona using the --voice and --instructor flags.
- --voice: Specifies the voice to use. Available voices: ballad, coral, nova, sage. Example: --voice coral
- --instructor: Specifies the instructor persona. Available instructors: pirate, mad_scientist, emo_teenager. Example: --instructor mad_scientist
./main.py your_text_file.txt --voice ballad --instructor mad_scientist
cat your_text_file.txt | head -n 5 | ./main.py --voice nova --instructor emo_teenager
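The flags and input sources described above could be wired up with argparse roughly as follows. This is a sketch under assumptions rather than a copy of main.py: the function names parse_args and read_text and the help strings are hypothetical, while the flag names, choices, and defaults come from this README.

```python
import argparse
import sys

# Values listed in this README; in the real script they live in main.py.
REASONABLE_VOICES = ["ballad", "coral", "nova", "sage"]
INSTRUCTORS = ["pirate", "mad_scientist", "emo_teenager"]


def parse_args(argv=None):
    """Parse the command line; argv=None means "use sys.argv"."""
    parser = argparse.ArgumentParser(description="Read text aloud in a persona.")
    # Optional positional file; when omitted, text is read from stdin.
    parser.add_argument("file", nargs="?", help="text file to read (default: stdin)")
    parser.add_argument("--voice", choices=REASONABLE_VOICES, default="nova")
    parser.add_argument("--instructor", choices=INSTRUCTORS, default="pirate")
    parser.add_argument("--save", action="store_true", help="write an MP3 instead of streaming")
    return parser.parse_args(argv)


def read_text(args, stdin=sys.stdin):
    """Return the input text from the named file, or from stdin if no file was given."""
    if args.file:
        with open(args.file, encoding="utf-8") as handle:
            return handle.read()
    return stdin.read()
```

Using choices means an unknown voice or persona is rejected by the parser with a usage message before any audio is requested.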
The core of the machine is yours to tinker with. The main.py script holds the lists of reasonable_voices and instructors. Bend them to your will. Add new personalities. Experiment.
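As one way to picture such an experiment, the sketch below registers a new persona. Only the names reasonable_voices and instructors and the listed values come from this README; storing instructors as a mapping from name to prompt text, the placeholder prompt strings, and the add_instructor helper are assumptions, so adapt it to whatever shape main.py actually uses.

```python
# Names taken from this README; the persona texts are placeholders, not the real prompts.
reasonable_voices = ["ballad", "coral", "nova", "sage"]
instructors = {
    "pirate": "Speak like a gruff pirate.",
    "mad_scientist": "Speak like a cackling mad scientist.",
    "emo_teenager": "Speak like a weary emo teenager.",
}


def add_instructor(name, instructions, registry=None):
    """Register a new persona and return its normalized key."""
    registry = instructors if registry is None else registry
    # Normalize "Noir Detective" to "noir_detective" to match the existing keys.
    key = "_".join(name.lower().split())
    if not key:
        raise ValueError("persona name must not be empty")
    if key in registry:
        raise ValueError(f"persona already exists: {key}")
    registry[key] = instructions
    return key
```

For instance, add_instructor("Noir Detective", "Speak in clipped, weary sentences.") would make a hypothetical noir_detective persona available alongside the built-in ones.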
Raw text is messy. It's full of digital detritus—links, tags, and other junk not meant for the spoken word. To ensure a clean transmutation, we first pass the text through a cleansing ritual. This script strips the noise, leaving only the pure essence of the message.
The script performs the following cleaning operations:
- Removes URLs: It strips out the URL part of a markdown link, leaving only the descriptive text. For example, [Google](https://google.com) becomes Google.
- Extracts Image Alt Text: It takes the alt text from an image link and uses that as the spoken text. For example, an image whose alt text is "A picture of a cat" becomes A picture of a cat.
- Strips HTML Tags: All HTML tags are removed.
- Handles Code Blocks: It replaces entire code blocks with the phrase "a code snippet follows" to avoid reading code aloud.
- Removes Structural Markers: It removes heading markers (#) and list markers (*, -) from the start of lines.
Pipe your markdown file through the cleaner before sending it to the main script:
cat your_blog_post.md | ./clean_markdown.py | ./main.py
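The cleaning operations listed above can be approximated with a handful of regular expressions. The sketch below is an illustration, not the contents of clean_markdown.py: the function name clean_markdown, the exact patterns, and the order in which the rules run are assumptions.

```python
import re

FENCE = "`" * 3  # the marker that opens and closes a fenced code block


def clean_markdown(text):
    """Strip markdown and HTML noise so only speakable text remains."""
    # Replace whole fenced code blocks with a spoken placeholder.
    text = re.sub(rf"{FENCE}.*?{FENCE}", "a code snippet follows", text, flags=re.DOTALL)
    # Images: keep only the alt text. This must run before the link rule,
    # because an image is a link with a leading "!".
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Links: keep only the descriptive label.
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Drop HTML tags but keep the text between them.
    text = re.sub(r"</?[A-Za-z][^>]*>", "", text)
    # Remove heading (#) and list (*, -) markers at the start of lines.
    text = re.sub(r"^[ \t]*(?:#{1,6}|[*-])[ \t]+", "", text, flags=re.MULTILINE)
    return text
```

To use a function like this as a pipe stage, a script would read sys.stdin, pass the text through clean_markdown, and write the result to sys.stdout. Regular expressions handle common markdown well enough for speech, though nested brackets and unusual HTML would need a real parser.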
A mad scientist's creation must be robust. These tests ensure the machine is calibrated and ready to roar. Run them to verify that both pip and uv installations can withstand the pressure.
To run the tests:
- Ensure you are in the project root directory.
- Run the pip installation test: ./tests/test_pip_install.sh
- Run the uv installation test: ./tests/test_uv_install.sh
These tests will create temporary directories, simulate a fresh installation, and check if the main.py script successfully generates an output file.