-
Actually, that's the first time I am hearing of MLX. Gonna do some digging on this. It really seems much faster than PyTorch with MPS. If I understand this correctly, MLX completely replaces PyTorch. I guess we would then need a completely different version for macOS and Windows/Linux?
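For reference, a rough sketch of how such a platform switch could look, so that only the backend import differs between the macOS and Windows/Linux builds. The function name and the model repo naming scheme are placeholders, and it assumes mlx-whisper on Apple Silicon and the existing faster-whisper everywhere else:

```python
import platform


def transcribe(audio_path: str, model_size: str = "large-v2") -> str:
    # Hypothetical dispatcher: use the MLX backend on Apple Silicon,
    # fall back to faster-whisper on Windows/Linux (and Intel Macs).
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        import mlx_whisper

        result = mlx_whisper.transcribe(
            audio_path,
            path_or_hf_repo=f"mlx-community/whisper-{model_size}-mlx",
        )
        return result["text"]
    else:
        from faster_whisper import WhisperModel

        model = WhisperModel(model_size, device="auto", compute_type="int8")
        segments, _info = model.transcribe(audio_path)
        return "".join(segment.text for segment in segments)
```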
-
Also from my side: thank you for making me aware of MLX! Do you have mlx-whisper running on your system? I would be interested in a direct comparison to see how much of a speed improvement is really possible. We are already using faster-whisper, which is, as the name suggests, a lot faster than the OpenAI implementation. If you want to make a direct comparison with CUDA, there are some benchmarks in #50, where people used Die Schatzinsel (342 min) as a test file. Turn off speaker detection in noScribe to make sure that only Whisper is running.

As @gernophil already mentioned, implementing this in noScribe would mean that the Mac and Windows versions of noScribe would deviate a lot from each other. I am a little hesitant to take that step without knowing that it really makes a massive difference in speed in real-world use.
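If someone wants to run that comparison, a minimal timing harness could look like the sketch below. The audio path is a placeholder, and it assumes both faster-whisper and mlx-whisper are installed on the same Apple Silicon machine:

```python
import time

from faster_whisper import WhisperModel
import mlx_whisper

AUDIO = "die_schatzinsel.mp3"  # placeholder path to the 342 min test file

# faster-whisper (the backend noScribe currently uses)
start = time.perf_counter()
model = WhisperModel("large-v2", compute_type="int8")
segments, _info = model.transcribe(AUDIO)
text = "".join(s.text for s in segments)  # consume the generator so transcription actually runs
print(f"faster-whisper: {time.perf_counter() - start:.1f} s")

# mlx-whisper with a converted model
start = time.perf_counter()
result = mlx_whisper.transcribe(AUDIO, path_or_hf_repo="mlx-community/whisper-large-v2-mlx")
print(f"mlx-whisper:    {time.perf_counter() - start:.1f} s")
```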
-
I experimented with Whisper models converted to the MLX format, because the current implementation does not benefit much from the Apple Silicon architecture.
With the Python package mlx-whisper and an MLX-converted Whisper model like mlx-community/whisper-large-v2-mlx, the transcription speed comes close to CUDA implementations.
Have you ever thought about adding MLX support? If you are interested, I could try to add MLX support to noScribe.
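To give an impression of what the usage looks like, here is a minimal sketch with mlx-whisper (the audio filename is a placeholder):

```python
# pip install mlx-whisper  (Apple Silicon only)
import mlx_whisper

result = mlx_whisper.transcribe(
    "interview.mp3",  # placeholder audio file
    path_or_hf_repo="mlx-community/whisper-large-v2-mlx",
)
print(result["text"])
```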