🎙️ Transcriber

A command-line audio transcription tool powered by whisper.cpp, designed for real-time transcription.

✨ Features

🎯 Real-time transcription - Record and transcribe audio on the fly
🚀 Cross-platform - Works on Linux, macOS, and Windows
⚙️ Configurable - Flexible configuration options
🔧 Multiple models - Support for various Whisper model sizes
💻 CLI-friendly - Easy-to-use command-line interface

🚀 Quick Start

Prerequisites

Install whisper-cli:
- macOS: brew install whisper-cpp
- Other OS: follow the whisper.cpp installation guide
Install ffmpeg for audio recording:
- macOS: brew install ffmpeg
- Linux: use your package manager (e.g., apt install ffmpeg)
- Windows: download from the official FFmpeg site
A working microphone for audio recording
At least 4GB RAM (recommended for larger models)
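
To confirm that the tools listed above are on your PATH before the first run, a small check like the following can help. This is a minimal sketch: the helper name check_prereqs is illustrative and not part of transcriber, and the command names are the defaults shown in the configuration section (whisper-cli and ffmpeg).

```shell
# check_prereqs: report which of the given commands are missing from PATH.
# Hypothetical helper for illustration; transcriber does not ship it.
check_prereqs() {
  missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="$missing $cmd"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all prerequisites found"
}

check_prereqs whisper-cli ffmpeg || echo "install the missing tools before continuing"
```

If you changed whisper_cmd or recording_cmd in your configuration, pass those names instead.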

Installation

Option 1: Download pre-built binary (Recommended)

Download the latest release for your platform from GitHub Releases:

One-line macOS installation:

sh -c "$(curl -fsSL https://raw.githubusercontent.com/nnanto/transcriber/main/scripts/install-macos.sh)"

Linux/macOS:

# Download and install 
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-linux-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber

# Or for macOS
curl -L https://github.com/nnanto/transcriber/releases/download/latest/transcriber-darwin-amd64.tar.gz | tar -xz
sudo mv transcriber-* /usr/local/bin/transcriber

# Make executable
chmod +x /usr/local/bin/transcriber
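
If you script the download, the archive name can be chosen from the operating system. The sketch below maps the output of uname -s to the archive names used in the commands above. The function name asset_for is hypothetical, and only the amd64 archives mentioned in this README are covered; whether builds for other architectures exist is not assumed here.

```shell
# asset_for: map a `uname -s` value to a release archive name from this README.
# Hypothetical helper; only the amd64 archives named above are known.
asset_for() {
  case "$1" in
    Linux)  echo "transcriber-linux-amd64.tar.gz" ;;
    Darwin) echo "transcriber-darwin-amd64.tar.gz" ;;
    *)      echo "unsupported: $1" >&2; return 1 ;;
  esac
}

asset=$(asset_for "$(uname -s)") && echo "download: $asset"
```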

Windows:

Download transcriber-windows-amd64.zip from the releases page
Extract the ZIP file
Add the extracted folder to your PATH, or move transcriber.exe to a folder that is already in your PATH

Verify installation:

transcriber version

Option 2: Install from source

git clone https://github.com/nnanto/transcriber.git
cd transcriber
make install

Option 3: Build locally

git clone https://github.com/nnanto/transcriber.git
cd transcriber
make build

First Run

Install whisper-cli if you have not already done so (see Prerequisites).
Download a Whisper model (required on first use):

transcriber download-model

You can specify a custom model with the --model option. Available models are listed in the whisper.cpp model repository on Hugging Face.

You can also specify a custom model path in the configuration file.

Start transcribing:

transcriber run --output ./transcriptions

Check your configuration:

transcriber config

📖 Usage Guide

Commands Overview

| Command | Description | Example |
|---|---|---|
| `run` | Record and transcribe in real-time | `transcriber run --duration 2m` |
| `process` | Process existing audio files | `transcriber process --input ./audio` |
| `config` | Show current configuration | `transcriber config` |
| `download-model` | Download Whisper models | `transcriber download-model --model large` |
| `stop` | Stop all running processes | `transcriber stop` |
| `version` | Show version info | `transcriber version` |

Real-time Transcription

Start recording and transcribing immediately:

# Record for 30 minutes (default)
transcriber run

# Record for specific duration
transcriber run --duration 5m --output ./my-transcriptions

# Custom configuration
transcriber run --config ./custom-config --duration 1h
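
To get a feel for how --duration relates to the chunk_duration_in_secs setting (default 30, see Configuration), the following sketch estimates how many chunks a recording would produce. The helpers to_seconds and chunk_count are illustrative only. They handle a single number with one unit suffix, and they assume a final partial chunk counts as a chunk; the README does not say how transcriber treats either case.

```shell
# to_seconds: convert a simple duration such as 90s, 5m, or 1h to seconds.
# Only a single number with one unit suffix is handled in this sketch.
to_seconds() {
  value=${1%[smh]}
  case "$1" in
    *s) echo "$value" ;;
    *m) echo $((value * 60)) ;;
    *h) echo $((value * 3600)) ;;
    *)  echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# chunk_count: number of chunks needed to cover a duration (rounded up).
# The second argument is the chunk length in seconds (default 30).
chunk_count() {
  total=$(to_seconds "$1") || return 1
  chunk=${2:-30}
  echo $(( (total + chunk - 1) / chunk ))
}

chunk_count 5m      # prints 10
chunk_count 1h 45   # prints 80
```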

Model Management

Download and manage Whisper models:

# Download specific model
transcriber download-model --model ggml-large-v3-turbo-q5_0

Available models are listed in the whisper.cpp model repository on Hugging Face.

⚙️ Configuration

The configuration file is automatically created at ~/.transcriber/config.json:

{
  "model_path": "~/.transcriber/models/ggml-large-v3-turbo-q5_0.bin",
  "language": "English",
  "temp_dir": "/tmp/transcriber",
  "output_format": "txt",
  "whisper_cmd": "whisper-cli",
  "recording_cmd": "ffmpeg",
  "chunk_duration_in_secs": 30,
  "min_required_unique_word_count": 5
}

Configuration Options

model_path: Path to the Whisper model file
language: Language for transcription (e.g., "English", "Spanish", "auto")
temp_dir: Directory for temporary audio files during processing
output_format: Output format (txt, json)
whisper_cmd: Command to use for Whisper transcription (default: "whisper-cli")
recording_cmd: Command to use for audio recording (default: "ffmpeg")
chunk_duration_in_secs: Duration in seconds for each audio chunk during real-time transcription (default: 30)
min_required_unique_word_count: Minimum number of unique words required to process a chunk (default: 5)
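
The min_required_unique_word_count option can be pictured as a filter on each chunk's text. The sketch below shows one way such a check could work. It is not taken from the transcriber source: the function names are hypothetical, and treating words as case-insensitive with punctuation ignored is an assumption.

```shell
# unique_word_count: count distinct words in a text.
# Assumption: words are compared case-insensitively and punctuation is ignored.
unique_word_count() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alnum:]' '\n' \
    | sed '/^$/d' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}

# keep_chunk: succeed when the text has at least MIN unique words (default 5).
keep_chunk() {
  [ "$(unique_word_count "$1")" -ge "${2:-5}" ]
}

keep_chunk "thank you thank you thank you" || echo "skipped: too few unique words"
keep_chunk "the quick brown fox jumps over the lazy dog" && echo "kept"
```

With the default threshold of 5, the first sample (2 unique words) is skipped and the second (8 unique words) is kept.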

🛠️ Development Guide

Project Structure

transcriber/
├── cmd.go              # CLI command handling
├── main.go             # Application entry point
├── transcriber.go      # Core transcription logic
├── config.go           # Configuration management
├── audio.go            # Audio recording/processing
├── models.go           # Model download/management
├── Makefile            # Build automation
└── README.md           # This file

Building from Source

# Clone the repository
git clone https://github.com/nnanto/transcriber.git
cd transcriber

# Install dependencies
go mod download

# Build for development (with race detection)
make dev

# Build for production
make build

# Run tests
make test

# Build for all platforms
make release-local

🐛 Troubleshooting

Common Issues

"Model not found" error

# Download the required model first
transcriber download-model [--model ggml-base]

Permission denied on macOS/Linux

# Make sure the binary is executable
chmod +x transcriber

# Or install system-wide
make install

High CPU usage

Try using a smaller model (ggml-tiny or ggml-base)
Reduce recording quality in config
Limit recording duration

Audio recording issues

Check microphone permissions
Verify audio device availability
Test with shorter durations first
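
To verify that an audio input device is visible to the system, the commands below are commonly used. This sketch only prints a suggested command for the current platform and does not run it. The helper name device_list_cmd is hypothetical. The README does not state which ffmpeg input backend transcriber uses, so the choice of avfoundation on macOS, dshow on Windows, and arecord (from alsa-utils) on Linux is an assumption.

```shell
# device_list_cmd: print a command that lists audio input devices for an OS
# name as reported by `uname -s`. The command is printed, not executed.
device_list_cmd() {
  case "$1" in
    Darwin) echo 'ffmpeg -f avfoundation -list_devices true -i ""' ;;
    Linux)  echo 'arecord -l' ;;
    MINGW*|MSYS*|CYGWIN*) echo 'ffmpeg -list_devices true -f dshow -i dummy' ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

device_list_cmd "$(uname -s)" || echo "no suggestion for this platform"
```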

Getting Help

Check the Issues page
Review configuration with transcriber config
Enable verbose logging in development builds

📋 System Requirements

Minimum Requirements

OS: Linux, macOS 10.14+, Windows 10+
RAM: 2GB (4GB recommended)
Storage: 1GB free space
Go: 1.19+ (for building from source)

Model Size Requirements

tiny: ~75MB disk, ~125MB RAM
base: ~142MB disk, ~210MB RAM
small: ~466MB disk, ~550MB RAM
medium: ~1.5GB disk, ~2GB RAM
large: ~2.9GB disk, ~4GB RAM
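
As a rough guide for choosing a model, the sketch below picks the largest size whose approximate RAM figure from the list above fits a given budget. The helper pick_model is illustrative only; the figures are the approximate values from this README (with 2GB and 4GB rounded to 2000MB and 4000MB), and real usage depends on the exact model file and quantization.

```shell
# pick_model: print the largest model size whose approximate RAM need (in MB,
# taken from the list above) fits within the given budget in MB.
pick_model() {
  budget_mb=$1
  choice=""
  for entry in tiny:125 base:210 small:550 medium:2000 large:4000; do
    name=${entry%%:*}
    need_mb=${entry##*:}
    if [ "$need_mb" -le "$budget_mb" ]; then
      choice=$name
    fi
  done
  if [ -z "$choice" ]; then
    echo "no model fits in ${budget_mb}MB" >&2
    return 1
  fi
  echo "$choice"
}

pick_model 1024   # prints "small"
```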

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ by the Transcriber team
