Missing sentences in Fine-Tuned whisper-medium CT2 model when audio exceeds 30 seconds #1298

VamsiMarriwada · 2025-05-13T10:06:18Z


I'm experiencing an issue with my fine-tuned whisper-medium model after converting it to CT2 format for use with faster-whisper. When transcribing audio files longer than 30 seconds, some sentences are consistently missing from the output. However, the same audio files work perfectly with the original model (before CT2 conversion).
I've already tried adjusting the chunk_length parameter, but increasing it actually makes the transcription worse, with up to half of the sentences missing.
Is there a way to automatically chunk audio files into segments under 30 seconds and then combine the transcriptions at the end? Or are there other parameters I should adjust to fix this issue with longer audio files in the CT2 format?
Thank you for any suggestions!
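
For reference, the chunk-and-combine approach asked about above could be sketched roughly as follows. This is only a minimal sketch under several assumptions, not a confirmed fix: the helper names (`chunk_bounds`, `transcribe_in_chunks`, `transcribe_file`) and the 28-second window are hypothetical, the audio is assumed to be 16 kHz mono, and the cuts are made at fixed positions, so a word that straddles a boundary may be split. The faster-whisper calls used (`WhisperModel`, `decode_audio`, and the `condition_on_previous_text` argument of `transcribe`) exist in the library, but whether they resolve the missing sentences for this particular model is untested.

```python
from typing import Callable, List, Sequence, Tuple

SAMPLE_RATE = 16000  # assumption: audio is decoded to 16 kHz mono


def chunk_bounds(n_samples: int, sample_rate: int = SAMPLE_RATE,
                 max_seconds: float = 28.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive windows,
    each at most max_seconds long, covering the whole signal."""
    if max_seconds <= 0 or sample_rate <= 0:
        raise ValueError("max_seconds and sample_rate must be positive")
    size = int(max_seconds * sample_rate)
    return [(start, min(start + size, n_samples))
            for start in range(0, n_samples, size)]


def transcribe_in_chunks(audio: Sequence, transcribe_chunk: Callable[[Sequence], str],
                         sample_rate: int = SAMPLE_RATE,
                         max_seconds: float = 28.0) -> str:
    """Transcribe each window separately and join the non-empty texts in order."""
    texts = []
    for start, end in chunk_bounds(len(audio), sample_rate, max_seconds):
        text = transcribe_chunk(audio[start:end]).strip()
        if text:
            texts.append(text)
    return " ".join(texts)


def transcribe_file(path: str, model_dir: str) -> str:
    """Hypothetical wiring to faster-whisper; model_dir is the converted CT2 model."""
    # Imported here so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel
    from faster_whisper.audio import decode_audio

    model = WhisperModel(model_dir)
    audio = decode_audio(path, sampling_rate=SAMPLE_RATE)

    def run(chunk) -> str:
        # Each chunk is decoded independently of the previous one.
        segments, _info = model.transcribe(chunk, condition_on_previous_text=False)
        return " ".join(segment.text.strip() for segment in segments)

    return transcribe_in_chunks(audio, run)
```

Calling `transcribe_file("audio.wav", "path/to/ct2-model")` (both paths are placeholders) would return one combined string. Cutting at silences instead of fixed offsets would likely give cleaner boundaries; the fixed windows are used here only to keep the sketch short.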

Replies: 0 comments
