arcaartem/record-and-transcribe


🎙️ Whisper Audio Recorder & Transcriber

A Bash script that records audio with sox and transcribes it locally with OpenAI's Whisper. It suits meeting notes, voice memos, and other transcription work, and needs an internet connection only for the initial model download.

✨ Features

  • Interactive Recording: Start/stop recording with simple prompts
  • Offline Transcription: Uses OpenAI's Whisper for local speech-to-text
  • High-Quality Audio: Optimized recording settings for best transcription results
  • Multiple Whisper Models: Choose from tiny to large models based on your needs
  • Organized Output: Automatic file organization with timestamps
  • Error Handling: Robust validation and meaningful error messages
  • Cross-Platform: Works on macOS, Linux, and Windows (with WSL)

🚀 Quick Start

1. One-Command Setup

./setup.sh

2. Start Recording

./record-and-transcribe.sh

3. Record Your Audio

  • Press Enter when you want to stop recording
  • The script will automatically transcribe your audio
  • View the transcription on screen and find saved files in the output directories

📋 Prerequisites

  • macOS (Darwin) - Optimized for macOS, adaptable for other systems
  • Homebrew - Package manager for macOS
  • Python 3.8+ - Required by current openai-whisper releases
  • Microphone - For audio input

🔧 Manual Installation

If you prefer to install dependencies manually:

Install sox (audio recording)

brew install sox

Install Whisper (speech-to-text)

pip install openai-whisper

Install ffmpeg (Whisper uses it to decode audio files)

brew install ffmpeg

📖 Usage

Basic Usage

./record-and-transcribe.sh

Advanced Options

# Use different Whisper model
./record-and-transcribe.sh --model large

# Custom output directory
./record-and-transcribe.sh --output ~/my-recordings

# Custom sample rate
./record-and-transcribe.sh --rate 44100

# Combine options
./record-and-transcribe.sh -m small -o ~/transcriptions -r 22050

Command Line Options

| Option | Description | Default |
|--------|-------------|---------|
| -m, --model | Whisper model (tiny, base, small, medium, or large) | base |
| -o, --output | Output directory for files | current directory |
| -r, --rate | Audio sample rate in Hz | 16000 |
| -h, --help | Show help message | - |
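For readers curious how options like these are typically handled in Bash, here is a minimal sketch. It is not taken from record-and-transcribe.sh: the function name `parse_args` and the variable names `MODEL`, `OUTPUT_DIR`, and `RATE` are assumptions made for illustration, and only the flags and defaults come from the table above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of option parsing; the real script may differ.

MODEL="base"       # default model from the options table
OUTPUT_DIR="."     # current directory
RATE="16000"       # sample rate in Hz

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -h|--help)
        echo "Usage: record-and-transcribe.sh [-m MODEL] [-o DIR] [-r RATE]"
        return 0 ;;
      -m|--model|-o|--output|-r|--rate)
        # Every option except --help takes exactly one value.
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 1
        fi
        case "$1" in
          -m|--model)  MODEL="$2" ;;
          -o|--output) OUTPUT_DIR="$2" ;;
          -r|--rate)   RATE="$2" ;;
        esac
        shift 2 ;;
      *)
        echo "Unknown option: $1" >&2
        return 1 ;;
    esac
  done
}

parse_args -m small -o /tmp/transcriptions -r 22050
echo "model=$MODEL output=$OUTPUT_DIR rate=$RATE"
```

Checking the argument count before `shift 2` avoids an endless loop when a flag is passed without its value.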

Environment Variables

You can set default values using environment variables:

export WHISPER_MODEL=large
export SAMPLE_RATE=22050
export OUTPUT_DIR=~/recordings
./record-and-transcribe.sh
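A common way to honor such variables in Bash is the `${VAR:-default}` expansion, sketched below. The function name `load_defaults` is hypothetical, and the fallback values are the defaults listed under Command Line Options; how the real script combines environment variables with command-line flags is not shown in this README.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: fall back to built-in defaults when a variable
# is unset or empty. Not taken from the actual script.

load_defaults() {
  WHISPER_MODEL="${WHISPER_MODEL:-base}"
  SAMPLE_RATE="${SAMPLE_RATE:-16000}"
  OUTPUT_DIR="${OUTPUT_DIR:-.}"
}

load_defaults
echo "model=$WHISPER_MODEL rate=$SAMPLE_RATE output=$OUTPUT_DIR"
```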

🎯 Whisper Models

Choose the right model for your needs:

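| Model | Parameters | Speed | Quality | Best For |
|-------|------------|-------|---------|----------|
| tiny | 39M | ⚡⚡⚡⚡⚡ | ⭐⭐ | Quick drafts, testing |
| base | 74M | ⚡⚡⚡⚡ | ⭐⭐⭐ | Recommended for most users |
| small | 244M | ⚡⚡⚡ | ⭐⭐⭐⭐ | Better accuracy |
| medium | 769M | ⚡⚡ | ⭐⭐⭐⭐⭐ | High accuracy needs |
| large | 1550M | ⚡ | ⭐⭐⭐⭐⭐ | Maximum accuracy |

Note: The figures are parameter counts as listed in the Whisper documentation; the downloaded model files are larger. The first run downloads the selected model, so subsequent runs start much faster.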
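If you wrap the script or extend it, validating the model name before recording saves a wasted session. The sketch below is an illustration only: `is_valid_model` is a hypothetical helper, and it accepts just the five names listed in this README, even though the Whisper CLI itself knows additional variants.

```shell
#!/usr/bin/env bash
# Hypothetical helper: accept only the model names listed in this README.

is_valid_model() {
  case "$1" in
    tiny|base|small|medium|large) return 0 ;;
    *) return 1 ;;
  esac
}

for candidate in base huge; do
  if is_valid_model "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: not a listed model"
  fi
done
```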

📁 File Organization

The script automatically organizes files:

whisper/
├── record-and-transcribe.sh    # Main script
├── setup.sh                    # Setup script
├── recordings/                 # Audio files (.wav)
│   └── recording_20240101_143022.wav
├── transcriptions/             # Text files (.txt)
│   └── transcription_20240101_143022.txt
└── memory-bank/                # Documentation
    ├── projectbrief.md
    ├── techContext.md
    └── activeContext.md
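The file names in the tree follow a `YYYYMMDD_HHMMSS` timestamp pattern. One way to generate matching names is sketched here; `make_paths` and the two output variables are hypothetical names, and the real script may build its paths differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive paired recording/transcription paths
# from a single timestamp so the two files sort together.

make_paths() {
  base_dir="$1"
  stamp="$(date +%Y%m%d_%H%M%S)"
  RECORDING_FILE="$base_dir/recordings/recording_${stamp}.wav"
  TRANSCRIPT_FILE="$base_dir/transcriptions/transcription_${stamp}.txt"
}

make_paths "."
echo "$RECORDING_FILE"
echo "$TRANSCRIPT_FILE"
```

Taking the timestamp once and reusing it keeps the two names in step even if the clock ticks over between the recording and the transcription.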

🎚️ Audio Settings

The script uses optimized settings for Whisper:

  • Sample Rate: 16kHz (Whisper's preferred rate)
  • Format: WAV (uncompressed for best quality)
  • Channels: Mono (sufficient for speech)
  • Bit Depth: 16-bit (balanced quality/size)
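Expressed as a sox invocation, the settings above would look roughly like the sketch below. The flags `-d` (default audio device), `-r` (sample rate), `-c` (channels), and `-b` (bit depth) are standard sox options; the function name `record_audio` and the `SOX_BIN` override are assumptions added for illustration, and the exact command used by the script is not shown in this README.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a recording command using the settings above.
# SOX_BIN is an illustrative override (handy for dry runs); it is not a
# variable the real script is known to use.

record_audio() {
  # $1 = output file, $2 = sample rate in Hz (optional, defaults to 16000)
  "${SOX_BIN:-sox}" -d -r "${2:-16000}" -c 1 -b 16 "$1"
}

# Example (records until sox is interrupted):
#   record_audio recordings/test.wav
```

Because the format options come after `-d` and before the output file name, sox applies them to the file it writes rather than to the input device.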

🔍 Troubleshooting

Common Issues

"sox: command not found"

brew install sox

"whisper: command not found"

pip install openai-whisper
# or
pip3 install openai-whisper

Permission denied

chmod +x record-and-transcribe.sh
chmod +x setup.sh

No audio input detected

  • Check microphone permissions in System Preferences > Security & Privacy > Microphone (System Settings > Privacy & Security > Microphone on macOS 13 and later)
  • Test microphone with other applications
  • Try a different audio input device: sox reads the AUDIODEV environment variable to choose the device used by -d

Whisper model download fails

  • Check internet connection (required for first model download)
  • Try a smaller model first: ./record-and-transcribe.sh -m tiny
  • Manual download: whisper --model base /dev/null (downloads base model)

Low transcription quality

  • Use a better microphone or reduce background noise
  • Try a larger Whisper model: --model large
  • Ensure clear speech and proper distance from microphone
  • Check audio file quality in recordings/ directory

Performance Tips

  1. Choose the right model: Start with base, upgrade to large if needed
  2. Optimal recording environment: Quiet room, good microphone
  3. Recording distance: 6-12 inches from microphone
  4. File management: Regularly clean old recordings to save disk space

🌍 Language Support

Whisper supports roughly 99 languages. By default the language is auto-detected, but you can specify one explicitly:

# Add language parameter to whisper command in script
whisper audio.wav --language en --model base

Common language codes: en (English), es (Spanish), fr (French), de (German), it (Italian), pt (Portuguese), ja (Japanese), zh (Chinese)
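If you want to make the language configurable rather than hard-coding it, one option is to pass `--language` only when a code is supplied, so that auto-detection remains the default. In this sketch `transcribe` and the `WHISPER_BIN` override are hypothetical names; `--model`, `--output_dir`, `--output_format`, and `--language` are existing Whisper CLI options.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: add --language only when a code is given.
# WHISPER_BIN is an illustrative override, not part of the real script.

transcribe() {
  # $1 = audio file, $2 = model, $3 = output dir, $4 = language code (optional)
  if [ -n "${4:-}" ]; then
    "${WHISPER_BIN:-whisper}" "$1" --model "$2" --output_dir "$3" \
      --output_format txt --language "$4"
  else
    "${WHISPER_BIN:-whisper}" "$1" --model "$2" --output_dir "$3" \
      --output_format txt
  fi
}

# Example:
#   transcribe recordings/test.wav base transcriptions en
```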

🔒 Privacy & Offline Use

  • Offline After Setup: Once the model has been downloaded, no data is sent to external servers
  • Local Processing: All transcription happens on your machine
  • Your Data Stays Private: Audio and transcriptions remain on your device

🀝 Contributing

Contributions welcome! Areas for improvement:

  • Additional audio format support
  • GUI interface
  • Real-time transcription
  • Speaker diarization
  • Batch processing

📄 License

This project is open source. Feel free to modify and distribute.

🆘 Support

If you encounter issues:

  1. Check the troubleshooting section above
  2. Ensure all dependencies are properly installed
  3. Test with a simple recording first
  4. Check that your microphone works with other applications
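To check step 2 quickly, a small helper can list which required commands are missing from your PATH. This is a sketch rather than part of the project: `missing_deps` is a hypothetical name, and the command list in the example reflects the tools named in this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that is not on PATH.

missing_deps() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

missing="$(missing_deps sox whisper ffmpeg)"
if [ -n "$missing" ]; then
  echo "Missing dependencies:"
  echo "$missing"
else
  echo "All dependencies found."
fi
```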

Happy Recording! 🎙️✨
