oomol-lab/epub2speech

EPUB to Speech

Convert EPUB e-books into high-quality audiobooks using Azure Text-to-Speech technology.

Features

  • 📚 EPUB Support: Compatible with EPUB 2 and EPUB 3 formats
  • 🎙️ High-Quality TTS: Uses Azure Cognitive Services Speech for natural voice synthesis
  • 🌍 Multi-Language Support: Works with the languages and voices available in Azure TTS
  • 📱 M4B Output: Generates standard M4B audiobook format with chapter navigation
  • 🔧 CLI: Easy-to-use command-line tool with progress tracking

Basic Usage

Convert an EPUB file to an audiobook:

epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --azure-key YOUR_KEY --azure-region YOUR_REGION

Installation

Prerequisites

  • Python 3.11 or higher
  • FFmpeg (for audio processing)
  • Azure Speech Service credentials
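
The first two prerequisites can be verified from a script before installing. The following is a minimal sketch and not part of epub2speech: `missing_prerequisites` is a hypothetical helper, and it assumes FFmpeg is exposed on PATH as an executable named `ffmpeg`.

```python
import shutil
import sys


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return human-readable problems; an empty list means all checks passed.

    Hypothetical helper, not part of epub2speech. Azure credentials are not
    checked here because they are supplied later, at conversion time.
    """
    problems = []
    # The prerequisites list asks for Python 3.11 or higher.
    if tuple(version_info[:2]) < (3, 11):
        problems.append("Python 3.11 or higher is required")
    # Assumption: FFmpeg is installed as an executable named "ffmpeg" on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments checks the running interpreter and the current PATH; the two parameters exist only so the check can be exercised without touching the real environment.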

Install Dependencies

# Install Python dependencies
pip install poetry
poetry install

# Install FFmpeg
# macOS: brew install ffmpeg
# Ubuntu/Debian: sudo apt install ffmpeg
# Windows: Download from https://ffmpeg.org/download.html

Azure Speech Service Setup

  1. Create an Azure account at https://azure.microsoft.com
  2. Create a Speech Service resource in Azure Portal
  3. Get your subscription key and region from the Azure dashboard

Quick Start

Environment Variables

Set your Azure credentials as environment variables:

export AZURE_SPEECH_KEY="your-subscription-key"
export AZURE_SPEECH_REGION="your-region"

epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural
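
When the tool is driven from a script, it can help to fail early if either variable is unset. The sketch below is an illustration only: `load_azure_credentials` is a hypothetical helper, not a function exported by epub2speech, and it only reads the two variable names shown above.

```python
import os


def load_azure_credentials(env=os.environ):
    """Return (key, region) from a mapping, raising if either is missing or empty.

    Hypothetical helper for wrapper scripts; not part of epub2speech.
    """
    names = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AZURE_SPEECH_KEY"], env["AZURE_SPEECH_REGION"]
```

Passing a plain dictionary instead of `os.environ` makes the check easy to test without exporting real credentials.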

Advanced Options

# Limit to first 5 chapters
epub2speech input.epub output.m4b --voice en-US-AriaNeural --max-chapters 5

# Use custom workspace directory
epub2speech input.epub output.m4b --voice zh-CN-YunxiNeural --workspace /tmp/my-workspace

# Quiet mode (no progress output)
epub2speech input.epub output.m4b --voice ja-JP-NanamiNeural --quiet
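
For batch jobs, the same options can be assembled programmatically. This is a sketch under the assumption that the `epub2speech` executable is on PATH and that credentials come from the environment variables described earlier; `build_command` and `convert` are hypothetical names, and only the flags shown in this README are used.

```python
import subprocess


def build_command(epub_path, output_path, voice,
                  max_chapters=None, workspace=None, quiet=False):
    """Assemble an epub2speech argument list from the flags shown in this README."""
    command = ["epub2speech", epub_path, output_path, "--voice", voice]
    if max_chapters is not None:
        command += ["--max-chapters", str(max_chapters)]
    if workspace is not None:
        command += ["--workspace", workspace]
    if quiet:
        command.append("--quiet")
    return command


def convert(*args, **kwargs):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(*args, **kwargs), check=True)
```

Building the argument list as a Python list, rather than a single shell string, avoids quoting problems with file names that contain spaces.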

Available Voices

For a complete list, see Azure Neural Voices.

How It Works

  1. EPUB Parsing: Extracts text content and metadata from EPUB files
  2. Chapter Detection: Identifies chapters using EPUB navigation data
  3. Text Processing: Cleans and segments text for optimal speech synthesis
  4. Audio Generation: Converts text to speech using Azure TTS
  5. M4B Creation: Combines audio files with chapter metadata into M4B format
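
Steps 3 and 5 can be illustrated with a short sketch. It is not the project's actual implementation, which this README does not show. It assumes that text is packed into segments at sentence boundaries and that chapter markers are expressed in FFmpeg's metadata file format; `split_into_segments`, `build_ffmetadata`, and the `max_chars` default are hypothetical.

```python
import re


def split_into_segments(text, max_chars=200):
    """Greedily pack whole sentences into segments of at most max_chars.

    A single sentence longer than max_chars is kept whole, so a segment can
    exceed the limit in that case. Splitting is naive: any '.', '!' or '?'
    (or the CJK equivalents) ends a sentence, including the dot in "3.14".
    """
    pieces = re.split(r"(?<=[.!?。！？])\s*", text)
    sentences = [piece.strip() for piece in pieces if piece.strip()]
    segments = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


def build_ffmetadata(chapters):
    """Render (title, duration_ms) pairs, in playback order, as FFMETADATA text."""
    lines = [";FFMETADATA1"]
    start = 0
    for title, duration_ms in chapters:
        end = start + duration_ms
        # The format requires '=', ';', '#', '\' and newlines to be backslash-escaped.
        escaped = re.sub(r"([=;#\\\n])", r"\\\1", title)
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",  # START and END are expressed in milliseconds
            f"START={start}",
            f"END={end}",
            f"title={escaped}",
        ]
        start = end
    return "\n".join(lines) + "\n"
```

Each segment would be synthesized separately and the resulting clips concatenated per chapter. The metadata text can then be given to FFmpeg as an extra input (for example with `-map_metadata`) when the final M4B is written, so that players show chapter navigation.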

Development

Running Tests

python test.py

Run specific test modules:

python test.py --test test_epub_picker
python test.py --test test_tts

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Support

For issues and questions:

  1. Check existing GitHub issues
  2. Create a new issue with detailed information
  3. Include EPUB file samples if relevant (ensure no copyright restrictions)
