Madh93/whisper

Whisper

A personal Makefile that provides commands for converting and transcribing audio files with whisper.cpp. It supports both Docker-based and native execution.

Requirements

Usage

Clone the repository and initialize the required dependencies:

make setup

Optionally, to build with AMD ROCm support so that transcription can use your AMD GPU*, run:

WHISPER_HIPBLAS=1 make setup

*If your GPU is not officially supported, remember to set the HSA_OVERRIDE_GFX_VERSION environment variable. More info here.

Download models

Downloads the necessary models for transcription:

make download

Download a specific model (available models here):

make download model=tiny

By default, the download runs in Docker. To disable Docker:

DOCKER_ENABLED=no make download model=tiny
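To fetch several models in one go, the download target can be wrapped in a small loop. This is only a sketch: the function name download_models and the DRY_RUN switch are hypothetical helpers, not part of the Makefile. The model names used in the example (tiny, small.en) are the ones that appear elsewhere in this README.

```shell
#!/bin/sh
# Hypothetical helper: call `make download` once per model name given.
# With DRY_RUN=1 the commands are printed instead of executed.
download_models() {
    for model in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "make download model=$model"
        else
            # Stop at the first failed download
            make download model="$model" || return 1
        fi
    done
}

# Example: download_models tiny small.en
```

To use native execution with this helper, run export DOCKER_ENABLED=no before calling it.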

Convert to .wav (optional)

Converts an input audio file to WAV format. whisper.cpp currently accepts only 16-bit WAV files, so convert your input before transcribing:

make convert-to-wav input=audios/jfk.mp3 output=audios/jfk.wav
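The recipe behind convert-to-wav is not shown in this README. As a rough idea of what an equivalent manual conversion might look like, here is a sketch that assumes ffmpeg is installed and targets the 16 kHz, mono, 16-bit PCM format that whisper.cpp's documentation recommends. The function name to_wav16 and the DRY_RUN switch are hypothetical, and the actual Makefile may use different flags.

```shell
#!/bin/sh
# Hypothetical helper: convert an audio file to 16 kHz, mono, 16-bit PCM WAV.
# With DRY_RUN=1 the ffmpeg command is printed instead of executed.
to_wav16() {
    input=$1
    output=$2
    # -ar 16000       resample to 16 kHz
    # -ac 1           downmix to a single (mono) channel
    # -c:a pcm_s16le  encode as signed 16-bit little-endian PCM
    set -- ffmpeg -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$output"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example: to_wav16 audios/jfk.mp3 audios/jfk.wav
```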

Transcribe audio

Transcribes a .wav audio file under the audios directory using the specified model and language:

make transcribe model=small.en lang=en file=audios/jfk.wav

By default, it utilizes Docker for transcription. To opt for native execution:

DOCKER_ENABLED=no make transcribe model=small.en lang=en file=audios/jfk.wav

To run on an officially unsupported AMD GPU, override the LLVM target through HSA_OVERRIDE_GFX_VERSION. For example:

HSA_OVERRIDE_GFX_VERSION=10.3.0 DOCKER_ENABLED=no make transcribe model=small.en lang=en file=audios/jfk.wav

All methods generate .srt, .lrc and .txt transcription files.
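The convert and transcribe steps can be chained for a single file. The sketch below only calls the make targets documented in this README. The helper names run and transcribe_mp3 and the DRY_RUN switch are hypothetical and exist for illustration only.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, run it otherwise.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Convert one .mp3 to .wav, then transcribe the result.
# Arguments: path to the .mp3, model name, language code.
transcribe_mp3() {
    mp3=$1
    model=$2
    lang=$3
    wav="${mp3%.mp3}.wav"   # audios/jfk.mp3 becomes audios/jfk.wav
    # Transcribe only if the conversion succeeded
    run make convert-to-wav input="$mp3" output="$wav" &&
        run make transcribe model="$model" lang="$lang" file="$wav"
}

# Example: transcribe_mp3 audios/jfk.mp3 small.en en
```

Calling the function inside a loop such as for f in audios/*.mp3 would process a whole directory. Setting DRY_RUN=1 first shows the commands without running them.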

Convert to video

Converts the transcribed text into a video file with subtitles:

make convert-to-video input=audios/jfk.wav

Useful Links

License

This project is licensed under the MIT license.