Takes a WAV file and an SRT subtitle file and splits the audio according to the SRT timings. Helps with building diarization and voice datasets (for WHISPERX).

SicariusSicariiStuff/Segment_WAV_by_SRT_WHISPERX

Splits a WAV or video file into segments based on an SRT subtitle file. Made for use with WHISPERX output to help with diarization.

Usage:

segment_WAV_by_SRT.py WAV_file.wav SRT_file.srt 
segment_WAV_by_SRT.py some_video.mp4 SRT_file.srt 
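
To illustrate the idea behind SRT-based splitting, here is a minimal sketch in Python using only the standard library. It is not the code from segment_WAV_by_SRT.py: the function names parse_srt_times and cut_wav are hypothetical, and it assumes a plain PCM WAV input (video input would first need its audio extracted, which is not shown).

```python
import re
import wave

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:04,250".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt_times(srt_text):
    """Return a list of (start_seconds, end_seconds), one per SRT cue."""
    cues = []
    for m in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in m.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        cues.append((start, end))
    return cues


def cut_wav(src_path, dst_path, start, end):
    """Copy the frames between start and end (in seconds) into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        first = int(start * rate)
        count = max(0, int(end * rate) - first)
        # Clamp the start position so a cue past the end yields an empty file.
        src.setpos(min(first, src.getnframes()))
        frames = src.readframes(count)
        with wave.open(dst_path, "wb") as dst:
            # Reuse channels, sample width and rate; the frame count in the
            # header is corrected automatically when the file is closed.
            dst.setparams(src.getparams())
            dst.writeframes(frames)
```

With these helpers, each pair returned by parse_srt_times could be passed to cut_wav to write one segment per subtitle cue.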

BULK processing: split ALL *.wav files in the current directory, based on the SRT files with the same names:

segment_WAV_by_SRT.py --all

For example, suppose you have the files test1.wav, test1.srt, test2.wav, test2.srt, and so on.

Running with --all will split ALL of them and create a folder for each speaker, based on the speaker labels in the SRT file from WHISPERX.
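
The pairing and per-speaker grouping described above can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual logic: find_pairs and speaker_of are hypothetical names, and the speaker prefix format "[SPEAKER_00]: text" is assumed to be what the diarized SRT contains.

```python
import re
from pathlib import Path

# Assumed speaker prefix at the start of a cue, e.g. "[SPEAKER_00]: Hello there".
SPEAKER = re.compile(r"^\[([^\]]+)\]:")


def find_pairs(directory):
    """Return (wav_path, srt_path) pairs that share the same file stem."""
    pairs = []
    for wav in sorted(Path(directory).glob("*.wav")):
        srt = wav.with_suffix(".srt")
        if srt.exists():
            pairs.append((wav, srt))
    return pairs


def speaker_of(cue_text, default="UNKNOWN"):
    """Extract the speaker label from a cue's text, or return a default."""
    match = SPEAKER.match(cue_text.strip())
    return match.group(1) if match else default
```

A segment cut from test1.wav whose cue text starts with "[SPEAKER_01]:" could then be written into a folder named SPEAKER_01, created with Path.mkdir(exist_ok=True).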
