A personal Makefile that provides commands for transcribing and converting audio files with whisper.cpp. It supports both Docker-based and native execution.
Clone the repository and initialize the required dependencies:
make setup
Optionally, to build with AMD ROCm support and run on your AMD GPU*, use:
WHISPER_HIPBLAS=1 make setup
*If your GPU is not officially supported, don't forget to set the HSA_OVERRIDE_GFX_VERSION environment variable. More info here.
Downloads the necessary models for transcription:
make download
Download a specific model (available models here):
make download model=tiny
By default, it uses Docker. To disable Docker:
DOCKER_ENABLED=no make download model=tiny
Converts an input audio file to WAV format (whisper.cpp currently runs only with 16-bit WAV files, so make sure to convert your input before running the tool):
make convert-to-wav input=audios/jfk.mp3 output=audios/jfk.wav
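If you prefer to convert files by hand, the whisper.cpp README suggests an ffmpeg command that produces 16 kHz, mono, 16-bit PCM WAV. The sketch below wraps that command in a helper; the function name `to_wav16` is hypothetical, ffmpeg is assumed to be installed, and whether `make convert-to-wav` runs this exact command internally is an assumption.

```shell
# Hypothetical helper, not part of this Makefile: convert any audio file
# to 16 kHz, mono, 16-bit PCM WAV, the format whisper.cpp expects.
to_wav16() {
    input=$1
    output=$2
    # -ar 16000      resample to 16 kHz
    # -ac 1          downmix to a single (mono) channel
    # -c:a pcm_s16le encode as signed 16-bit little-endian PCM
    ffmpeg -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$output"
}
```

For example, `to_wav16 audios/jfk.mp3 audios/jfk.wav` should produce a file comparable to the one from the make target above.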
Transcribes a .wav audio file under the audios directory using the specified model and language:
make transcribe model=small.en lang=en file=audios/jfk.wav
By default, it utilizes Docker for transcription. To opt for native execution:
DOCKER_ENABLED=no make transcribe model=small.en lang=en file=audios/jfk.wav
To run on an unsupported AMD GPU, override the LLVM target with HSA_OVERRIDE_GFX_VERSION. For example:
HSA_OVERRIDE_GFX_VERSION=10.3.0 DOCKER_ENABLED=no make transcribe model=small.en lang=en file=audios/jfk.wav
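To pick an override value, it helps to know which gfx target your GPU actually reports. A minimal sketch, assuming the `rocminfo` tool that ships with ROCm is on your PATH; the function name `list_gfx_targets` is hypothetical. The value 10.3.0 used above is commonly chosen to make a card present itself as gfx1030, but check the ROCm documentation for your own hardware.

```shell
# Hypothetical helper: list the unique gfx targets reported by ROCm.
# Assumes `rocminfo` (installed with ROCm) is available on PATH.
list_gfx_targets() {
    # rocminfo prints each agent's target more than once,
    # so extract the gfx names and remove duplicates.
    rocminfo | grep -o 'gfx[0-9a-f][0-9a-f]*' | sort -u
}
```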
All methods generate .srt, .lrc, and .txt transcription files.
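To transcribe several files in one go, the transcribe target can be called in a loop. This is only a sketch: the function name `transcribe_all` and its default arguments are hypothetical, and it relies solely on the `make transcribe` invocation documented above.

```shell
# Hypothetical helper: transcribe every .wav file in a directory by
# calling the `make transcribe` target once per file.
# Usage: transcribe_all [dir] [model] [lang]
transcribe_all() {
    dir=${1:-audios}
    model=${2:-small.en}
    lang=${3:-en}
    for file in "$dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files.
        [ -e "$file" ] || continue
        # Stop at the first failure so errors are not silently ignored.
        make transcribe model="$model" lang="$lang" file="$file" || return 1
    done
}
```

To use native execution for the whole batch, run `export DOCKER_ENABLED=no` before calling `transcribe_all`.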
Converts the transcribed text into a video file with subtitles:
make convert-to-video input=audios/jfk.wav
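The three steps (convert, transcribe, render) can be chained into a single call. The wrapper below is a sketch, not part of the Makefile: the name `audio_to_video` and its defaults are hypothetical, and it only uses the targets and parameters shown in this README.

```shell
# Hypothetical wrapper chaining the three documented targets.
# Usage: audio_to_video audios/jfk.mp3 [model] [lang]
audio_to_video() {
    src=$1
    model=${2:-small.en}
    lang=${3:-en}
    # Replace the extension: audios/jfk.mp3 -> audios/jfk.wav
    wav="${src%.*}.wav"

    # Each step runs only if the previous one succeeded.
    make convert-to-wav input="$src" output="$wav" &&
    make transcribe model="$model" lang="$lang" file="$wav" &&
    make convert-to-video input="$wav"
}
```

The wrapper assumes the source file is not already a .wav file, since the conversion step would then read from and write to the same path.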
This project is licensed under the MIT license.