awk script to check for stuck subtitles #976
mrfragger started this conversation in Show and tell
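Here is an awk one-liner to detect stuck subtitles in .vtt and .srt files. It flags any cue whose end minute is earlier than its start minute, or more than one minute later: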
for f in *.vtt *.srt ; do printf "%s\n" "$f" ; awk -F ":" '/[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} -->/ {if ($2>$4 || $2<$4-1 ) print " Line# "NR": " $1":"$2" "$4 }' "$f" | awk '!/59 00/' ; done
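For anyone wondering how the `-F ":"` field splitting works, here is the same check spelled out as a commented shell function. Treat it as a sketch: `check_stuck` is a hypothetical name, and the pattern writes out `[0-9][0-9]` instead of `{2}` because some awk builds (older mawk, for example) do not support interval expressions. The `[.,]` stands in for the bare `.` of the one-liner so that both VTT (dot) and SRT (comma) timestamps still match.

```shell
# check_stuck is a hypothetical helper name; the logic follows the one-liner above.
check_stuck() {
  awk -F ":" '
    # With ":" as the separator, "00:12:34.000 --> 00:12:40.000" splits into
    #   $1="00" (start hour)   $2="12" (start minute)   $3="34.000 --> 00"
    #   $4="12" (end minute)   $5="40.000"
    /[0-9][0-9]:[0-9][0-9]:[0-9][0-9][.,][0-9][0-9][0-9] -->/ {
      # Flag the cue if the end minute is before the start minute,
      # or more than one minute after it.
      if ($2 > $4 || $2 < $4 - 1)
        print " Line# " NR ": " $1 ":" $2 " " $4
    }
  ' "$1" | awk '!/59 00/'   # drop the normal 59 -> 00 hour rollover
}
```

Usage would be `check_stuck audiobook1.vtt`; each printed line gives the line number in the file, then the start hour and minute, then the end minute.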
I ran it on 254 transcribed audiobooks and 16 had this problem. Almost all were done with whisper.cpp 1.4.0 using the medium.en model.
This shows the output. If there is no problem, it just prints the filename of the audiobook subtitles. I changed the real titles to audiobook1, audiobook2 for demonstration purposes.
example
This one needs the 56 min mark replaced with 10 and the 29 min mark replaced with 27 to stop the subtitles from sticking.
The script only matches timecodes that include an hour field. If an srt or vtt file omits the hour for the first 60 minutes, those cues are skipped and that part of the file is not checked for stuck subs:
00:00.000 --> (this is rare; cues like this are skipped)
instead of
00:00:00.000 --> (almost all files look like this, as they should)
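If you need to check files that use the short form too, one possible approach is to split each timestamp on ":" and take the second-to-last component as the minutes, which works with or without the hour field. This is only a sketch and not part of the script above: `check_stuck_any` is a hypothetical name, and it prints the full start and end timestamps rather than just the minutes.

```shell
# check_stuck_any is a hypothetical name; an illustrative variant, not the posted script.
check_stuck_any() {
  awk '
    # Default whitespace splitting: $1 = start timestamp, $2 = "-->", $3 = end timestamp.
    $2 == "-->" && $1 ~ /^[0-9:.,]+$/ && $3 ~ /^[0-9:.,]+$/ {
      ns = split($1, s, ":")
      ne = split($3, e, ":")
      if (ns < 2 || ne < 2) next
      # Minutes are the second-to-last component in both
      # HH:MM:SS.mmm and MM:SS.mmm.
      sm = s[ns - 1] + 0
      em = e[ne - 1] + 0
      if (sm == 59 && em == 0) next     # normal hour rollover
      if (sm > em || sm < em - 1)
        print " Line# " NR ": " $1 " " $3
    }
  ' "$1"
}
```

The rollover test is done inside awk here instead of piping through a second `awk '!/59 00/'`, since the output no longer has the minutes side by side.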
Read more about the issue in #975.