A command-line audio transcription tool powered by whisper.cpp, designed for real-time transcription.
- Real-time transcription - Record and transcribe audio on the fly
- Cross-platform - Works on Linux, macOS, and Windows
- Configurable - Flexible configuration options
- Multiple models - Support for various Whisper model sizes
- CLI-friendly - Easy-to-use command-line interface
- Install whisper-cli
  - macOS: brew install whisper-cpp
  - Other OS: Follow the whisper.cpp installation guide
- Install ffmpeg for audio recording
  - macOS: brew install ffmpeg
  - Linux: Use your package manager (e.g., apt install ffmpeg)
  - Windows: Download from the FFmpeg official site
- Audio recording capabilities (microphone)
- At least 4GB RAM (recommended for larger models)
Download the latest release for your platform from GitHub Releases:
One-line macOS installation:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/nnanto/transcriber/main/scripts/install-macos.sh)"
Linux/macOS:
# Download and install
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-linux-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber
# Or for macOS
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-darwin-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber
# Make executable (sudo, because /usr/local/bin is usually root-owned)
sudo chmod +x /usr/local/bin/transcriber
Windows:
- Download transcriber-windows-amd64.zip from the releases page
- Extract the ZIP file
- Add the extracted folder to your PATH, or move transcriber.exe to a folder already in your PATH
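The release archives named above share a naming pattern (transcriber-OS-ARCH plus an extension). As a rough sketch, the matching archive name for the current machine could be derived in Go as follows. The helper assetName is hypothetical and not part of the transcriber code base, and only the amd64 archives are mentioned in this README, so other architectures may not be published.

```go
package main

import (
	"fmt"
	"runtime"
)

// assetName builds a release archive name from an OS and architecture,
// following the pattern of the archive names listed in this README.
// Hypothetical helper for illustration only.
func assetName(goos, goarch string) string {
	ext := "tar.gz"
	if goos == "windows" {
		ext = "zip" // the Windows release is distributed as a ZIP archive
	}
	return fmt.Sprintf("transcriber-%s-%s.%s", goos, goarch, ext)
}

func main() {
	// runtime.GOOS and runtime.GOARCH describe the machine running this program.
	fmt.Println(assetName(runtime.GOOS, runtime.GOARCH))
}
```

Check the releases page to confirm that an archive with the printed name actually exists before downloading it.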
Verify installation:
transcriber version
# Build from source and install system-wide
git clone https://github.com/nnanto/transcriber.git
cd transcriber
make install
# Or build the binary only, without installing
git clone https://github.com/nnanto/transcriber.git
cd transcriber
make build
- Install whisper-cli if you have not already done so (see the prerequisites).
- Download a Whisper model (required on first use):
transcriber download-model
You can choose a specific model with the --model option. Available models are listed in the whisper.cpp HF models repository. You can also set a custom model path in the configuration file.
- Start transcribing:
transcriber run --output ./transcriptions
- Check your configuration:
transcriber config
Command | Description | Example |
---|---|---|
run | Record and transcribe in real-time | transcriber run --duration 2m |
process | Process existing audio files | transcriber process --input ./audio |
config | Show current configuration | transcriber config |
download-model | Download Whisper models | transcriber download-model --model large |
stop | Stop all running processes | transcriber stop |
version | Show version info | transcriber version |
Start recording and transcribing immediately:
# Record for 30 minutes (default)
transcriber run
# Record for specific duration
transcriber run --duration 5m --output ./my-transcriptions
# Custom configuration
transcriber run --config ./custom-config --duration 1h
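The --duration values used in this README (2m, 5m, 1h) have the same shape as Go duration strings. Assuming they are parsed with the standard time.ParseDuration (an assumption, not confirmed here), the sketch below estimates how many audio chunks a recording would produce at the 30-second default chunk length given in the configuration section. The function chunksFor is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// chunksFor parses a duration string such as "5m" and returns how many
// chunks of chunkSecs seconds are needed to cover it, rounding up so a
// trailing partial chunk is still counted.
// Hypothetical helper; the real CLI may split recordings differently.
func chunksFor(duration string, chunkSecs int) (int, error) {
	total, err := time.ParseDuration(duration)
	if err != nil {
		return 0, fmt.Errorf("parse duration %q: %w", duration, err)
	}
	if total <= 0 || chunkSecs <= 0 {
		return 0, fmt.Errorf("duration and chunk length must be positive")
	}
	chunk := time.Duration(chunkSecs) * time.Second
	// Integer ceiling division on nanosecond counts.
	return int((total + chunk - 1) / chunk), nil
}

func main() {
	for _, d := range []string{"2m", "5m", "1h"} {
		n, err := chunksFor(d, 30)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d chunk(s) of 30s\n", d, n)
	}
}
```

Rounding up matters for durations that are not a multiple of the chunk length: a 45-second recording still needs two 30-second chunks.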
Download and manage Whisper models:
# Download specific model
transcriber download-model --model ggml-large-v3-turbo-q5_0
Available models are found in the whisper.cpp HF models
The configuration file is automatically created at ~/.transcriber/config.json:
{
"model_path": "~/.transcriber/models/ggml-large-v3-turbo-q5_0.bin",
"language": "English",
"temp_dir": "/tmp/transcriber",
"output_format": "txt",
"whisper_cmd": "whisper-cli",
"recording_cmd": "ffmpeg",
"chunk_duration_in_secs": 30,
"min_required_unique_word_count": 5
}
- model_path: Path to the Whisper model file
- language: Language for transcription (e.g., "English", "Spanish", "auto")
- temp_dir: Directory for temporary audio files during processing
- output_format: Output format (txt, json)
- whisper_cmd: Command to use for Whisper transcription (default: "whisper-cli")
- recording_cmd: Command to use for audio recording (default: "ffmpeg")
- chunk_duration_in_secs: Duration in seconds for each audio chunk during real-time transcription (default: 30)
- min_required_unique_word_count: Minimum number of unique words required to process a chunk (default: 5)
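To make the options above concrete, here is a minimal Go sketch of how such a file could be decoded with the documented defaults, and how a unique-word threshold could be applied to a chunk's transcript. The Config struct, parseConfig, and hasEnoughUniqueWords are illustrative names and are not taken from the project's config.go. The way words are counted here (split on whitespace, case-insensitive) is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config mirrors the JSON keys documented above.
// Illustrative type only; the project's real struct may differ.
type Config struct {
	ModelPath                  string `json:"model_path"`
	Language                   string `json:"language"`
	TempDir                    string `json:"temp_dir"`
	OutputFormat               string `json:"output_format"`
	WhisperCmd                 string `json:"whisper_cmd"`
	RecordingCmd               string `json:"recording_cmd"`
	ChunkDurationInSecs        int    `json:"chunk_duration_in_secs"`
	MinRequiredUniqueWordCount int    `json:"min_required_unique_word_count"`
}

// parseConfig decodes JSON on top of the documented defaults, so keys that
// are missing from the input keep their default values.
func parseConfig(data []byte) (Config, error) {
	cfg := Config{
		WhisperCmd:                 "whisper-cli",
		RecordingCmd:               "ffmpeg",
		ChunkDurationInSecs:        30,
		MinRequiredUniqueWordCount: 5,
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

// hasEnoughUniqueWords reports whether text contains at least minWords
// distinct whitespace-separated words, compared case-insensitively.
// The counting rule is an assumption made for this sketch.
func hasEnoughUniqueWords(text string, minWords int) bool {
	seen := make(map[string]struct{})
	for _, w := range strings.Fields(strings.ToLower(text)) {
		seen[w] = struct{}{}
	}
	return len(seen) >= minWords
}

func main() {
	cfg, err := parseConfig([]byte(`{"language": "auto", "chunk_duration_in_secs": 10}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Language, cfg.ChunkDurationInSecs, cfg.WhisperCmd)
	fmt.Println(hasEnoughUniqueWords("thank you thank you", cfg.MinRequiredUniqueWordCount))
}
```

Decoding into a struct that already holds the defaults is a common Go idiom: json.Unmarshal only overwrites fields present in the input, so a partial config file stays valid.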
transcriber/
├── cmd.go # CLI command handling
├── main.go # Application entry point
├── transcriber.go # Core transcription logic
├── config.go # Configuration management
├── audio.go # Audio recording/processing
├── models.go # Model download/management
├── Makefile # Build automation
└── README.md # This file
# Clone the repository
git clone https://github.com/nnanto/transcriber.git
cd transcriber
# Install dependencies
go mod download
# Build for development (with race detection)
make dev
# Build for production
make build
# Run tests
make test
# Build for all platforms
make release-local
# Download the required model first
transcriber download-model [--model ggml-base]
# Make sure the binary is executable
chmod +x transcriber
# Or install system-wide
make install
- Try using a smaller model (ggml-tiny or ggml-base)
- Reduce recording quality in config
- Limit recording duration
- Check microphone permissions
- Verify audio device availability
- Test with shorter durations first
- Check the Issues page
- Review configuration with transcriber config
- Enable verbose logging in development builds
- OS: Linux, macOS 10.14+, Windows 10+
- RAM: 2GB (4GB recommended)
- Storage: 1GB free space
- Go: 1.19+ (for building from source)
- tiny: ~75MB, ~125MB RAM
- base: ~142MB, ~210MB RAM
- small: ~466MB, ~550MB RAM
- medium: ~1.5GB, ~2GB RAM
- large: ~2.9GB, ~4GB RAM
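The RAM figures above can guide model choice. As a small sketch, the following Go snippet picks the largest model whose listed RAM estimate fits within a given memory budget. The table values are copied from the list above (with 1GB taken as 1024MB), and largestModelFor is a hypothetical helper, not part of the tool.

```go
package main

import "fmt"

// model pairs a model name with its approximate RAM requirement in MB.
type model struct {
	name  string
	ramMB int
}

// models is ordered from smallest to largest, using the RAM estimates
// from the list above (approximate values, 1GB taken as 1024MB).
var models = []model{
	{"tiny", 125},
	{"base", 210},
	{"small", 550},
	{"medium", 2048},
	{"large", 4096},
}

// largestModelFor returns the largest model whose estimated RAM use fits
// in availableMB. The boolean is false when no model fits.
func largestModelFor(availableMB int) (string, bool) {
	best, ok := "", false
	for _, m := range models {
		if m.ramMB <= availableMB {
			best, ok = m.name, true
		}
	}
	return best, ok
}

func main() {
	for _, budget := range []int{100, 1024, 8192} {
		if name, ok := largestModelFor(budget); ok {
			fmt.Printf("%d MB budget: %s\n", budget, name)
		} else {
			fmt.Printf("%d MB budget: no model fits\n", budget)
		}
	}
}
```

Leave headroom for the operating system and for ffmpeg when choosing a budget, since the listed figures cover the model alone.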
This project is licensed under the MIT License - see the LICENSE file for details.