Commit b9c8c50

Pad language detection if audio is too short
1 parent a903e57 commit b9c8c50

1 file changed, +4 -1 lines changed

whisperx/asr.py

Lines changed: 4 additions & 1 deletion
@@ -251,7 +251,10 @@ def data(audio, segments):
 
 
     def detect_language(self, audio: np.ndarray):
-        segment = log_mel_spectrogram(audio[: N_SAMPLES], padding=0)
+        if audio.shape[0] < N_SAMPLES:
+            print("Warning: audio is shorter than 30s, language detection may be inaccurate.")
+        segment = log_mel_spectrogram(audio[: N_SAMPLES],
+                                      padding=0 if audio.shape[0] >= N_SAMPLES else N_SAMPLES - audio.shape[0])
         encoder_output = self.model.encode(segment)
         results = self.model.model.detect_language(encoder_output)
         language_token, language_probability = results[0][0]
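
The padding arithmetic in this change can be sketched in isolation. The helper names below (`detection_padding`, `pad_to_window`) are hypothetical and not part of whisperx. The sketch assumes `N_SAMPLES` is 30 s of 16 kHz audio and that `padding` is counted in samples and zero-filled at the end of the clip, which is how the diff appears to use it. Plain lists stand in for NumPy arrays so the example has no dependencies.

```python
# Hypothetical illustration of the padding logic in the diff; not whisperx code.
SAMPLE_RATE = 16000                      # assumed sample rate (Hz)
CHUNK_SECONDS = 30                       # the 30 s window named in the warning
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # assumed value of N_SAMPLES: 480000


def detection_padding(num_samples: int, n_samples: int = N_SAMPLES) -> int:
    """Return how many samples of padding bring a clip up to n_samples.

    Mirrors the expression in the diff:
    0 if the clip is long enough, else the shortfall.
    """
    return 0 if num_samples >= n_samples else n_samples - num_samples


def pad_to_window(audio: list, n_samples: int = N_SAMPLES) -> list:
    """Truncate to n_samples, then zero-pad at the end if the clip is short."""
    window = audio[:n_samples]
    return window + [0.0] * detection_padding(len(audio), n_samples)


if __name__ == "__main__":
    ten_seconds = 10 * SAMPLE_RATE
    print(detection_padding(ten_seconds))  # shortfall for a 10 s clip
    print(len(pad_to_window([0.1] * ten_seconds)))
```

With these assumptions, a clip of any length yields a window of exactly `N_SAMPLES` samples: long clips are truncated and short clips are padded with silence, so the spectrogram handed to the encoder always covers the same span.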
