A Bash script that records audio with sox and transcribes it locally with OpenAI's Whisper. It suits meeting notes, voice memos, and other transcriptions, and needs no internet connection once the selected model has been downloaded.
- Interactive Recording: Start/stop recording with simple prompts
- Offline Transcription: Uses OpenAI's Whisper for local speech-to-text
- High-Quality Audio: Optimized recording settings for best transcription results
- Multiple Whisper Models: Choose from tiny to large models based on your needs
- Organized Output: Automatic file organization with timestamps
- Error Handling: Robust validation and meaningful error messages
- Cross-Platform: Works on macOS, Linux, and Windows (with WSL)
./setup.sh
./record-and-transcribe.sh
- Press Enter when you want to stop recording
- The script will automatically transcribe your audio
- View the transcription on screen and find saved files in the output directories
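The start/stop flow in the steps above can be sketched roughly as follows. This is a minimal sketch, not the script's actual code: the helper name run_until_enter is hypothetical, and it assumes the recorder finalizes its output file when it receives SIGTERM (sox is generally reported to do so, but verify on your system).

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a recorder command in the background,
# wait for the user to press Enter, then stop the recorder.
run_until_enter() {
    local recorder_pid reply_line
    "$@" &                          # start the recorder, e.g. sox -d out.wav
    recorder_pid=$!
    printf 'Recording... press Enter to stop.\n' >&2
    read -r reply_line || true      # block until Enter (or end of input)
    # SIGTERM rather than SIGINT: background commands started by a
    # non-interactive shell begin with SIGINT ignored.
    kill "$recorder_pid" 2>/dev/null || true
    wait "$recorder_pid" 2>/dev/null || true
    printf 'Recording stopped.\n' >&2
}

# Example (needs sox and a microphone), left commented out:
# run_until_enter sox -d -r 16000 -c 1 -b 16 recording.wav
```

Passing the recorder command as arguments keeps the helper easy to try with a stand-in such as sleep before pointing it at a real microphone.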
- macOS (Darwin) - Optimized for macOS, adaptable for other systems
- Homebrew - Package manager for macOS
- Python 3.8+ - Required for Whisper (recent openai-whisper releases no longer support 3.7)
- Microphone - For audio input
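A quick way to confirm the command-line requirements before running anything is a small dependency check. The sketch below is an illustration only; check_deps is a hypothetical helper and the list of commands to pass is an assumption based on the tools this README mentions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which of the given commands are missing.
# Prints each missing name on stderr and returns non-zero if any are absent.
check_deps() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'Missing dependency: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_deps sox whisper ffmpeg python3 || exit 1
```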
If you prefer to install dependencies manually:
brew install sox
pip install openai-whisper
brew install ffmpeg

./record-and-transcribe.sh

# Use different Whisper model
./record-and-transcribe.sh --model large
# Custom output directory
./record-and-transcribe.sh --output ~/my-recordings
# Custom sample rate
./record-and-transcribe.sh --rate 44100
# Combine options
./record-and-transcribe.sh -m small -o ~/transcriptions -r 22050

| Option | Description | Default |
|---|---|---|
| -m, --model | Whisper model (tiny, base, small, medium, large) | base |
| -o, --output | Output directory for files | current directory |
| -r, --rate | Audio sample rate in Hz | 16000 |
| -h, --help | Show help message | - |
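For readers who want to adapt the options, a parser for these flags might look something like the sketch below. It is not taken from the script itself; the variable names MODEL, OUT_DIR, and RATE and the function name parse_args are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Sketch of parsing -m/--model, -o/--output, -r/--rate, and -h/--help.
# Variable names are illustrative, with the defaults from the table.
MODEL="base"
OUT_DIR="."
RATE="16000"

parse_args() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -h|--help)
                printf 'Usage: %s [-m MODEL] [-o DIR] [-r HZ]\n' "$0"
                return 0 ;;
            -m|--model|-o|--output|-r|--rate)
                # Every one of these options takes exactly one value.
                if [ "$#" -lt 2 ]; then
                    printf 'Option %s needs a value\n' "$1" >&2
                    return 1
                fi
                case "$1" in
                    -m|--model)  MODEL="$2" ;;
                    -o|--output) OUT_DIR="$2" ;;
                    -r|--rate)   RATE="$2" ;;
                esac
                shift 2 ;;
            *)
                printf 'Unknown option: %s\n' "$1" >&2
                return 1 ;;
        esac
    done
}

# Example: parse_args "$@" || exit 1
```

Checking the argument count before shift 2 matters: a failed shift leaves the arguments untouched, which would otherwise loop forever on a trailing -m.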
You can set default values using environment variables:
export WHISPER_MODEL=large
export SAMPLE_RATE=22050
export OUTPUT_DIR=~/recordings
./record-and-transcribe.sh

Choose the right model for your needs:
| Model | Parameters | Speed | Quality | Best For |
|---|---|---|---|---|
| tiny | 39M | ⚡⚡⚡⚡⚡ | ⭐⭐ | Quick drafts, testing |
| base | 74M | ⚡⚡⚡⚡ | ⭐⭐⭐ | Recommended for most users |
| small | 244M | ⚡⚡⚡ | ⭐⭐⭐⭐ | Better accuracy |
| medium | 769M | ⚡⚡ | ⭐⭐⭐⭐⭐ | High accuracy needs |
| large | 1550M | ⚡ | ⭐⭐⭐⭐⭐ | Maximum accuracy |
Note: First run downloads the selected model. Subsequent runs are much faster.
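One way the environment-variable defaults and the model list could fit together is sketched below. This is an assumption about structure, not the script's actual code; is_valid_model is a hypothetical helper, and the fallback to base is one possible policy.

```bash
#!/usr/bin/env bash
# Sketch: take defaults from the environment, then validate the model name.
MODEL="${WHISPER_MODEL:-base}"
RATE="${SAMPLE_RATE:-16000}"
OUT_DIR="${OUTPUT_DIR:-.}"

# Hypothetical helper: succeed only for the model names listed in the table.
is_valid_model() {
    case "$1" in
        tiny|base|small|medium|large) return 0 ;;
        *) return 1 ;;
    esac
}

if ! is_valid_model "$MODEL"; then
    printf 'Unknown model "%s"; falling back to base\n' "$MODEL" >&2
    MODEL="base"
fi
```

Validating before calling whisper avoids starting a recording only to discover afterwards that the model name was mistyped.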
The script automatically organizes files:
whisper/
├── record-and-transcribe.sh    # Main script
├── setup.sh                    # Setup script
├── recordings/                 # Audio files (.wav)
│   └── recording_20240101_143022.wav
├── transcriptions/             # Text files (.txt)
│   └── transcription_20240101_143022.txt
└── memory-bank/                # Documentation
    ├── projectbrief.md
    ├── techContext.md
    └── activeContext.md
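The timestamped naming shown in the tree can be reproduced with a few lines of shell. The sketch below assumes the recording and its transcription share a single timestamp; make_output_paths, AUDIO_FILE, and TEXT_FILE are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch: create the output directories and derive matching file names
# from a single timestamp such as 20240101_143022.
make_output_paths() {
    local base_dir="$1"
    local stamp
    stamp="$(date +%Y%m%d_%H%M%S)"
    mkdir -p "$base_dir/recordings" "$base_dir/transcriptions" || return 1
    AUDIO_FILE="$base_dir/recordings/recording_${stamp}.wav"
    TEXT_FILE="$base_dir/transcriptions/transcription_${stamp}.txt"
}

# Example:
# make_output_paths "$HOME/recordings" && printf '%s\n' "$AUDIO_FILE" "$TEXT_FILE"
```

Taking the timestamp once and reusing it keeps each audio file paired with its transcription even if the transcription finishes minutes later.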
The script uses optimized settings for Whisper:
- Sample Rate: 16kHz (Whisper's preferred rate)
- Format: WAV (uncompressed for best quality)
- Channels: Mono (sufficient for speech)
- Bit Depth: 16-bit (balanced quality/size)
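Expressed as a sox invocation, the settings above correspond to something like the command printed by this sketch. The helper build_record_cmd is hypothetical and only prints the command; the flags themselves (-d for the default input device, -r for sample rate, -c for channels, -b for bits per sample) are standard sox options.

```bash
#!/usr/bin/env bash
# Sketch: print the sox command implied by the settings above.
# -d = default input device, -r = sample rate (Hz),
# -c = channels, -b = bits per sample.
build_record_cmd() {
    local out_file="$1"
    local rate="${2:-16000}"
    printf 'sox -d -r %s -c 1 -b 16 %s\n' "$rate" "$out_file"
}

build_record_cmd recording.wav
# prints: sox -d -r 16000 -c 1 -b 16 recording.wav
```

Because the format options come after -d, they describe the output file, so sox converts whatever the microphone delivers to 16 kHz mono 16-bit on the way to disk.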
brew install sox

pip install openai-whisper
# or
pip3 install openai-whisper

chmod +x record-and-transcribe.sh
chmod +x setup.sh

- Check microphone permissions in System Preferences > Security & Privacy > Microphone
- Test microphone with other applications
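- Try a different audio input device. sox has no option for listing devices; it records from the default device, which can be overridden by name through the AUDIODEV environment variable:
AUDIODEV="Your Device Name" ./record-and-transcribe.sh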
- Check internet connection (required for first model download)
- Try a smaller model first:
./record-and-transcribe.sh -m tiny
- Manual download:
whisper --model base /dev/null   # downloads base model
- Use a better microphone or reduce background noise
- Try a larger Whisper model: --model large
- Ensure clear speech and proper distance from microphone
- Check audio file quality in recordings/ directory
- Choose the right model: Start with base, upgrade to large if needed
- Optimal recording environment: Quiet room, good microphone
- Recording distance: 6-12 inches from microphone
- File management: Regularly clean old recordings to save disk space
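For the file-management tip, old recordings can be pruned with find. The sketch below is one possible approach, not part of the script; clean_old_recordings is a hypothetical helper and the 30-day default is an arbitrary assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delete .wav files older than N days in a directory.
# Each deleted path is printed so the cleanup is visible.
clean_old_recordings() {
    local dir="$1"
    local days="${2:-30}"
    [ -d "$dir" ] || return 0
    find "$dir" -type f -name '*.wav' -mtime +"$days" -print -delete
}

# Example: clean_old_recordings recordings 30
```

Only .wav files are matched, so transcriptions stored alongside the audio are left alone.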
Whisper supports 99+ languages. The script auto-detects language, but you can specify:
# Add language parameter to whisper command in script
whisper audio.wav --language en --model base

Common language codes: en (English), es (Spanish), fr (French), de (German), it (Italian), pt (Portuguese), ja (Japanese), zh (Chinese)
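If you would rather make the language optional than hard-code it, the whisper command line can be assembled conditionally. This is a sketch under assumptions: build_whisper_cmd is a hypothetical helper that only prints the command, and it relies solely on the --model and --language flags already shown above.

```bash
#!/usr/bin/env bash
# Sketch: print a whisper command, adding --language only when one is given.
# An empty language leaves Whisper's auto-detection in place.
build_whisper_cmd() {
    local audio="$1"
    local model="${2:-base}"
    local lang="${3:-}"
    set -- whisper "$audio" --model "$model"
    if [ -n "$lang" ]; then
        set -- "$@" --language "$lang"
    fi
    printf '%s\n' "$*"
}

build_whisper_cmd audio.wav base en
# prints: whisper audio.wav --model base --language en
```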
- Completely Offline: No data sent to external servers
- Local Processing: All transcription happens on your machine
- Your Data Stays Private: Audio and transcriptions remain on your device
Contributions welcome! Areas for improvement:
- Additional audio format support
- GUI interface
- Real-time transcription
- Speaker diarization
- Batch processing
This project is open source. Feel free to modify and distribute.
If you encounter issues:
- Check the troubleshooting section above
- Ensure all dependencies are properly installed
- Test with a simple recording first
- Check that your microphone works with other applications
Happy Recording! 🎙️✨