⚠️ Disclaimer
This project is for learning & testing purposes only. For production use, please use the official OpenAI TTS service.
🚨 IMPORTANT DEVELOPMENT NOTICE 🚨
⚠️ The v2 branch is currently under active development and is not recommended for production use. For stable documentation and usage, please refer to the v1 documentation.
English | 中文
TTSFM is an API server that is fully compatible with OpenAI's Text-to-Speech (TTS) API format.
🎮 Try it now: Official Demo
ttsfm/
├── app.py                # Main Flask application
├── celery_worker.py      # Celery configuration and tasks
├── requirements.txt      # Python dependencies
├── static/               # Frontend resources
│   ├── index.html        # English interface
│   ├── index_zh.html     # Chinese interface
│   ├── script.js         # Frontend JavaScript
│   └── styles.css        # Frontend styles
├── voices/               # Voice samples
├── Dockerfile            # Docker configuration
├── docker-entrypoint.sh  # Docker startup script
├── .env.example          # Environment variables template
├── .env                  # Environment variables
├── .gitignore            # Git ignore rules
├── LICENSE               # MIT License
├── README.md             # English documentation
├── README_CN.md          # Chinese documentation
├── test_api.py           # API test suite
├── test_queue.py         # Queue test suite
└── .github/              # GitHub workflows
- Python 3.13 or higher
- Redis server
- Docker (optional)
# Pull the latest image
docker pull dbcccc/ttsfm:latest
# Run the container
docker run -d \
--name ttsfm \
-p 7000:7000 \
-p 6379:6379 \
-v $(pwd)/voices:/app/voices \
dbcccc/ttsfm:latest
- Clone the repository:
git clone https://github.com/dbccccccc/ttsfm.git
cd ttsfm
- Install dependencies:
pip install -r requirements.txt
- Start Redis server:
# On Windows
redis-server
# On Linux/macOS
sudo service redis-server start
- Start Celery worker:
celery -A celery_worker.celery worker --pool=solo -l info
- Start the server:
# Development (not recommended for production)
python app.py
# Production (recommended)
waitress-serve --host=0.0.0.0 --port=7000 app:app
Copy .env.example to .env and modify as needed:
cp .env.example .env
- HOST: Server host (default: 0.0.0.0)
- PORT: Server port (default: 7000)
- VERIFY_SSL: SSL verification (default: true)
- MAX_QUEUE_SIZE: Maximum queue size (default: 100)
- RATE_LIMIT_REQUESTS: Rate limit requests per window (default: 30)
- RATE_LIMIT_WINDOW: Rate limit window in seconds (default: 60)
- CELERY_BROKER_URL: Redis broker URL (default: redis://localhost:6379/0)
- CELERY_RESULT_BACKEND: Redis result backend URL (default: redis://localhost:6379/0)
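To make the defaults above concrete, here is a minimal sketch of how these variables could be read in Python. It is not taken from app.py: the `Settings` class, `load_settings`, and the `_as_bool` helper are hypothetical names, and the set of strings treated as "true" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    host: str
    port: int
    verify_ssl: bool
    max_queue_size: int
    rate_limit_requests: int
    rate_limit_window: int
    celery_broker_url: str
    celery_result_backend: str


def _as_bool(value: str) -> bool:
    # Assumption: these spellings count as "true"; anything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (os.environ by default), using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7000")),
        verify_ssl=_as_bool(env.get("VERIFY_SSL", "true")),
        max_queue_size=int(env.get("MAX_QUEUE_SIZE", "100")),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", "30")),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", "60")),
        celery_broker_url=env.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
        celery_result_backend=env.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/0"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check overrides, for example `load_settings({"PORT": "8080"})`.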
POST /v1/audio/speech
Request body:
{
"input": "Hello, world!",
"voice": "alloy",
"response_format": "mp3",
"instructions": "Speak in a cheerful tone"
}
- input (required): The text to convert to speech
- voice (required): The voice to use. Supported voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse
- response_format (optional): The format of the audio output. Default: mp3. Supported formats: mp3, opus, aac, flac, wav, pcm
- instructions (optional): Additional instructions for voice modulation
- Success: Returns audio data with appropriate content type
- Error: Returns JSON with error message and status code
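As an illustration of the request described above, the following sketch builds and sends a speech request using only the Python standard library. The base URL `http://localhost:7000` is an assumption derived from the default port, and `build_speech_request` and `synthesize` are hypothetical helper names, not part of TTSFM.

```python
import json
import urllib.request

SUPPORTED_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice="alloy", response_format="mp3",
                         instructions=None, base_url="http://localhost:7000"):
    """Validate the parameters locally and return a ready-to-send POST request."""
    if not text:
        raise ValueError("input text is required")
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {response_format}")

    payload = {"input": text, "voice": voice, "response_format": response_format}
    if instructions:
        payload["instructions"] = instructions  # optional field, sent only when set

    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path.

    urllib raises urllib.error.HTTPError on a non-2xx status; the JSON
    error body can then be read from the exception object.
    """
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

With a server running locally, `synthesize("Hello, world!", "hello.mp3", voice="alloy")` would save the audio to `hello.mp3`.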
GET /api/queue-size
Response:
{
"queue_size": 5,
"max_queue_size": 100
}
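A client might poll this endpoint before submitting work. The sketch below parses the response shown above; treating a full queue as "no capacity" is an assumption about client-side use, not documented server behavior, and the function names and base URL are hypothetical.

```python
import json
import urllib.request


def parse_queue_status(body):
    """Extract (queue_size, max_queue_size) from the JSON response body."""
    data = json.loads(body)
    return int(data["queue_size"]), int(data["max_queue_size"])


def has_capacity(queue_size, max_queue_size):
    # Assumption: the queue can accept work while it is below its maximum.
    return queue_size < max_queue_size


def fetch_queue_status(base_url="http://localhost:7000"):
    """Query GET /api/queue-size on a running server."""
    with urllib.request.urlopen(f"{base_url}/api/queue-size") as response:
        return parse_queue_status(response.read())
```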
GET /api/voice-sample/{voice}
- voice (required): The voice to get a sample for. Must be one of: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse
- Success: Returns MP3 audio sample
- Error: Returns JSON with error message and status code
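For completeness, a short sketch of fetching a sample with the standard library. The helper names and the default base URL are assumptions; only the path and the voice list come from the description above.

```python
import urllib.parse
import urllib.request

SAMPLE_VOICES = (
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
)


def voice_sample_url(voice, base_url="http://localhost:7000"):
    """Build the sample URL, rejecting voices the endpoint does not list."""
    if voice not in SAMPLE_VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    return f"{base_url}/api/voice-sample/{urllib.parse.quote(voice)}"


def download_voice_sample(voice, out_path, base_url="http://localhost:7000"):
    """Save the MP3 sample for `voice` to out_path (requires a running server)."""
    with urllib.request.urlopen(voice_sample_url(voice, base_url)) as response:
        data = response.read()
    with open(out_path, "wb") as f:
        f.write(data)
```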
GET /api/version
Response:
{
"version": "v2.0.0-alpha1"
}
This project is licensed under the MIT License - see the LICENSE file for details.