Audio transcription tool powered by whisper.cpp, designed for real-time transcription. NO API/CLOUD

nnanto/transcriber

🎙️ Transcriber

A command-line audio transcription tool powered by whisper.cpp, designed for real-time transcription.

✨ Features

  • 🎯 Real-time transcription - Record and transcribe audio on the fly
  • 🚀 Cross-platform - Works on Linux, macOS, and Windows
  • ⚙️ Configurable - Flexible configuration options
  • 🔧 Multiple models - Support for various Whisper model sizes
  • 💻 CLI-friendly - Easy-to-use command-line interface

🚀 Quick Start

Prerequisites

  • Install whisper-cli
  • Install ffmpeg for audio recording
    • macOS: brew install ffmpeg
    • Linux: Use your package manager (e.g., apt install ffmpeg)
    • Windows: Download from FFmpeg official site
  • Audio recording capabilities (microphone)
  • At least 4GB RAM (recommended for larger models)
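
The two external commands in the list above can be checked before the first run. The following is a minimal sketch, not part of the project's code: the helper name missingTools is hypothetical, and it only assumes that whisper-cli and ffmpeg are expected to be on your PATH under those names.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the names from tools that cannot be found on PATH.
// Hypothetical helper for illustration; not taken from the transcriber source.
func missingTools(tools []string) []string {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// whisper-cli and ffmpeg are the two commands named in the prerequisites.
	missing := missingTools([]string{"whisper-cli", "ffmpeg"})
	if len(missing) == 0 {
		fmt.Println("all prerequisites found")
		return
	}
	for _, t := range missing {
		fmt.Printf("missing prerequisite: %s\n", t)
	}
}
```

If either command is installed under a different name, the whisper_cmd and recording_cmd configuration options described later are the place to point the tool at it.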

Installation

Option 1: Download pre-built binary (Recommended)

Download the latest release for your platform from GitHub Releases:

One-line macOS installation:

sh -c "$(curl -fsSL https://raw.githubusercontent.com/nnanto/transcriber/main/scripts/install-macos.sh)"

Linux/macOS:

# Linux: download, extract, and install
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-linux-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber

# macOS: download, extract, and install
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-darwin-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber

# Make executable (sudo is needed because /usr/local/bin is usually root-owned)
sudo chmod +x /usr/local/bin/transcriber

Windows:

  1. Download transcriber-windows-amd64.zip from the releases page
  2. Extract the ZIP file
  3. Add the extracted folder to your PATH or move transcriber.exe to a folder in your PATH

Verify installation:

transcriber version

Option 2: Install from source

git clone https://github.com/nnanto/transcriber.git
cd transcriber
make install

Option 3: Build locally

git clone https://github.com/nnanto/transcriber.git
cd transcriber
make build

First Run

  1. Install whisper-cli if you have not already done so (see Prerequisites).

  2. Download a Whisper model (required on first use):

transcriber download-model

You can choose a specific model with the --model option. Available models are listed in the whisper.cpp Hugging Face model repository.

Alternatively, set a custom model path in the configuration file.

  3. Start transcribing:
transcriber run --output ./transcriptions
  4. Check your configuration:
transcriber config

📖 Usage Guide

Commands Overview

Command         Description                          Example
run             Record and transcribe in real time   transcriber run --duration 2m
process         Process existing audio files         transcriber process --input ./audio
config          Show current configuration           transcriber config
download-model  Download Whisper models              transcriber download-model --model large
stop            Stop all running processes           transcriber stop
version         Show version info                    transcriber version

Real-time Transcription

Start recording and transcribing immediately:

# Record for 30 minutes (default)
transcriber run

# Record for specific duration
transcriber run --duration 5m --output ./my-transcriptions

# Custom configuration
transcriber run --config ./custom-config --duration 1h

Model Management

Download and manage Whisper models:

# Download specific model
transcriber download-model --model ggml-large-v3-turbo-q5_0

Available models are listed in the whisper.cpp Hugging Face model repository.

⚙️ Configuration

The configuration file is automatically created at ~/.transcriber/config.json:

{
  "model_path": "~/.transcriber/models/ggml-large-v3-turbo-q5_0.bin",
  "language": "English",
  "temp_dir": "/tmp/transcriber",
  "output_format": "txt",
  "whisper_cmd": "whisper-cli",
  "recording_cmd": "ffmpeg",
  "chunk_duration_in_secs": 30,
  "min_required_unique_word_count": 5
}

Configuration Options

  • model_path: Path to the Whisper model file
  • language: Language for transcription (e.g., "English", "Spanish", "auto")
  • temp_dir: Directory for temporary audio files during processing
  • output_format: Output format (txt, json)
  • whisper_cmd: Command to use for Whisper transcription (default: "whisper-cli")
  • recording_cmd: Command to use for audio recording (default: "ffmpeg")
  • chunk_duration_in_secs: Duration in seconds for each audio chunk during real-time transcription (default: 30)
  • min_required_unique_word_count: Minimum number of unique words required to process a chunk (default: 5)
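
To make min_required_unique_word_count more concrete, the sketch below shows one way such a threshold could work: count distinct words in a chunk's text and skip chunks that fall below the minimum, which would filter out silence or repeated filler. How transcriber actually tokenizes text and applies the threshold is not documented here, so the tokenization rules and the names uniqueWordCount and meetsThreshold are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// uniqueWordCount counts distinct case-insensitive words in text. Anything
// that is not a letter, digit, or apostrophe is treated as a separator.
// This tokenization is an assumption, not the project's documented behavior.
func uniqueWordCount(text string) int {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '\''
	})
	seen := make(map[string]struct{}, len(words))
	for _, w := range words {
		seen[w] = struct{}{}
	}
	return len(seen)
}

// meetsThreshold reports whether text has at least minUnique distinct words.
func meetsThreshold(text string, minUnique int) bool {
	return uniqueWordCount(text) >= minUnique
}

func main() {
	samples := []string{
		"Thank you. Thank you. Thank you.",
		"The quick brown fox jumps over the lazy dog",
	}
	for _, s := range samples {
		fmt.Printf("%d unique, keep=%v: %q\n", uniqueWordCount(s), meetsThreshold(s, 5), s)
	}
}
```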

🛠️ Development Guide

Project Structure

transcriber/
├── cmd.go              # CLI command handling
├── main.go             # Application entry point
├── transcriber.go      # Core transcription logic
├── config.go           # Configuration management
├── audio.go            # Audio recording/processing
├── models.go           # Model download/management
├── Makefile            # Build automation
└── README.md           # This file

Building from Source

# Clone the repository
git clone https://github.com/nnanto/transcriber.git
cd transcriber

# Install dependencies
go mod download

# Build for development (with race detection)
make dev

# Build for production
make build

# Run tests
make test

# Build for all platforms
make release-local

🐛 Troubleshooting

Common Issues

"Model not found" error

# Download the required model first
transcriber download-model [--model ggml-base]

Permission denied on macOS/Linux

# Make sure the binary is executable
chmod +x transcriber

# Or install system-wide
make install

High CPU usage

  • Try using a smaller model (ggml-tiny or ggml-base)
  • Reduce recording quality in config
  • Limit recording duration

Audio recording issues

  • Check microphone permissions
  • Verify audio device availability
  • Test with shorter durations first

Getting Help

  • Check the Issues page
  • Review configuration with transcriber config
  • Enable verbose logging in development builds

📋 System Requirements

Minimum Requirements

  • OS: Linux, macOS 10.14+, Windows 10+
  • RAM: 2GB (4GB recommended)
  • Storage: 1GB free space
  • Go: 1.19+ (for building from source)

Model Size Requirements

  • tiny: ~75MB on disk, ~125MB RAM
  • base: ~142MB on disk, ~210MB RAM
  • small: ~466MB on disk, ~550MB RAM
  • medium: ~1.5GB on disk, ~2GB RAM
  • large: ~2.9GB on disk, ~4GB RAM
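
The RAM figures above can guide model choice. The sketch below picks the largest model whose approximate RAM requirement fits a given budget. It is illustrative only: largestFitting is a hypothetical helper, the "~2GB" and "~4GB" figures are rounded to 2000 and 4000 MB, and in practice you would leave headroom for the OS and ffmpeg.

```go
package main

import "fmt"

// model pairs a Whisper model size with its approximate RAM need in MB,
// taken from the list above (GB figures rounded to thousands of MB).
type model struct {
	Name  string
	RAMMB int
}

// models is ordered from smallest to largest RAM requirement.
var models = []model{
	{"tiny", 125},
	{"base", 210},
	{"small", 550},
	{"medium", 2000},
	{"large", 4000},
}

// largestFitting returns the biggest model whose RAM need is within
// availableMB, or false if even the smallest model does not fit.
func largestFitting(availableMB int) (string, bool) {
	best, ok := "", false
	for _, m := range models {
		if m.RAMMB <= availableMB {
			best, ok = m.Name, true
		}
	}
	return best, ok
}

func main() {
	for _, mb := range []int{100, 600, 2048, 8192} {
		if name, ok := largestFitting(mb); ok {
			fmt.Printf("%d MB available -> %s\n", mb, name)
		} else {
			fmt.Printf("%d MB available -> no model fits\n", mb)
		}
	}
}
```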

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by the Transcriber team
