-
Actually, that's the first time I am hearing of MLX. Gonna do some digging on this. It really seems much faster than PyTorch with MPS. If I understand this correctly, MLX completely replaces PyTorch. I guess we would then need a completely different version for macOS and Windows/Linux?
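For reference, a rough sketch of how such a platform switch could look, so that only the backend import differs between the macOS and Windows/Linux builds. The function name and the model repo naming scheme are placeholders, and it assumes mlx-whisper on Apple Silicon and the existing faster-whisper everywhere else:

```python
import platform


def transcribe(audio_path: str, model_size: str = "large-v2") -> str:
    # Hypothetical dispatcher: use the MLX backend on Apple Silicon,
    # fall back to faster-whisper on Windows/Linux (and Intel Macs).
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        import mlx_whisper

        result = mlx_whisper.transcribe(
            audio_path,
            path_or_hf_repo=f"mlx-community/whisper-{model_size}-mlx",
        )
        return result["text"]
    else:
        from faster_whisper import WhisperModel

        model = WhisperModel(model_size, device="auto", compute_type="int8")
        segments, _info = model.transcribe(audio_path)
        return "".join(segment.text for segment in segments)
```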
-
Also from my side: thank you for making me aware of MLX! Do you have mlx-whisper running on your system? I would be interested in a direct comparison to see how much of a speed improvement is really possible. We are already using faster-whisper, which is, as the name suggests, a lot faster than the OpenAI implementation. If you want to make a direct comparison with CUDA, there are some benchmarks in #50, where people used Die Schatzinsel (342 min) as a test file. Turn off speaker detection in noScribe to make sure that only Whisper is running.

As @gernophil already mentioned, implementing this in noScribe would mean that the Mac and Windows versions of noScribe would deviate a lot from each other. I am a little hesitant to take that step without knowing that it really makes a massive difference in speed in real-world use.
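If someone wants to run that comparison, a minimal timing harness could look like the sketch below. The audio path is a placeholder, and it assumes both faster-whisper and mlx-whisper are installed on the same Apple Silicon machine:

```python
import time

from faster_whisper import WhisperModel
import mlx_whisper

AUDIO = "die_schatzinsel.mp3"  # placeholder path to the 342 min test file

# faster-whisper (the backend noScribe currently uses)
start = time.perf_counter()
model = WhisperModel("large-v2", compute_type="int8")
segments, _info = model.transcribe(AUDIO)
text = "".join(s.text for s in segments)  # consume the generator so transcription actually runs
print(f"faster-whisper: {time.perf_counter() - start:.1f} s")

# mlx-whisper with a converted model
start = time.perf_counter()
result = mlx_whisper.transcribe(AUDIO, path_or_hf_repo="mlx-community/whisper-large-v2-mlx")
print(f"mlx-whisper:    {time.perf_counter() - start:.1f} s")
```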
-
I experimented with Whisper models converted to the MLX format, because the current implementation does not benefit much from the Apple Silicon architecture.
With the Python package mlx-whisper and an MLX-converted Whisper model like mlx-community/whisper-large-v2-mlx, the transcription speed comes close to CUDA implementations.
Have you ever thought about adding MLX support? If you are interested, I could try to add MLX support to noScribe.
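To give an impression of what the usage looks like, here is a minimal sketch with mlx-whisper (the audio filename is a placeholder):

```python
# pip install mlx-whisper  (Apple Silicon only)
import mlx_whisper

result = mlx_whisper.transcribe(
    "interview.mp3",  # placeholder audio file
    path_or_hf_repo="mlx-community/whisper-large-v2-mlx",
)
print(result["text"])
```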