This implementation is up to 4 times faster than [openai/whisper](https://github.com/openai/whisper) for the same accuracy while using less memory.

For reference, here's the time and memory usage that are required to transcribe [**13 minutes**](https://www.youtube.com/watch?v=0u7tTptBo9I) of audio using different implementations:

*Executed with 8 threads on an Intel Core i7-12700K.*

For `distil-whisper/distil-large-v2`, the WER is tested with the code sample from [link](https://huggingface.co/distil-whisper/distil-large-v2#evaluation). For `faster-distil-whisper`, the WER is tested with the following setting:

```python
from faster_whisper import WhisperModel

model_size = "distil-large-v2"
# model_size = "distil-medium.en"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# transcribe() returns a generator; iterate over it to run the transcription
segments, info = model.transcribe("audio.mp3", beam_size=5, language="en")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
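As background on the metric itself (this is not the evaluation script linked above): word error rate is the word-level edit distance between a hypothesis transcript and a reference transcript, divided by the number of reference words. A minimal illustration:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            curr[j] = min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution
            )
        prev = curr
    return prev[-1] / len(ref)

print(wer("the quick brown fox", "the quick brown dog"))  # 0.25
```

In practice, published WER numbers also normalize text (casing, punctuation) before scoring, so raw outputs of this sketch will not exactly match reported figures.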
</details>

## Requirements

* Python 3.8 or greater

Unlike openai-whisper, FFmpeg does **not** need to be installed on the system. The audio is decoded with the Python library [PyAV](https://github.com/PyAV-Org/PyAV), which bundles the FFmpeg libraries in its package.