Running Noscribe 0.5 with nvidia 4080 - 20min for a 30min WAV file or 2minutes - depending on mouse down on title bar #105
Comments
This is really strange. We had a somewhat similar observation a few months back: #50 (comment). I cannot reproduce this behavior on my machine. But I also don't have an NVIDIA graphics card, which may be the reason. If you find anything that we can do to improve the performance, let me know (but I don't want to steal mouse control from the user...).
I tried another variant: this time an older PC (Threadripper 1950X, NVIDIA 3050, 64 GB RAM, Server 2022, no Hyper-V role installed).
It would be interesting to know if this is really related to CUDA (which I suspect). You could temporarily disable CUDA by editing the file Command line:
Note that this only works with WAV files. If you want to run faster-whisper from the command line, you can try this:
Thank you,
That's very fast. Are you sure it didn't crash silently? Please check if the output file contains data.
I don't think that this is the problem, for several reasons:
The YAML file looks fine. Also, if I let it run through the GUI (with mouse down), the time used for processing is similar, and the resulting transcription makes sense regarding the recognition of the speakers.
It would be great if you could investigate this a little more. You can run noScribe directly from the Python source, no need to "build" it first. I know aTrain, it's very similar, but misses some features of noScribe (no editor, no marking of pauses or overlapping speech - but you might not need that anyway). It's made by some colleagues from Austria. I don't know why they reinvented the wheel instead of collaborating, but yeah.
I will try to dig further.
I did a new Windows installation on the PC that has the AMD 1950X CPU with the NVIDIA 3050 card. This time, I installed a vanilla Windows 11. Nothing else. No Windows domain, no GPO rules. Also, onboard audio on.
I know this is about the focus (so maybe more about the old thread), but this might still be relevant:
I got it fixed, or rather found a workaround. Now it takes 2 minutes, even when I just let the application run. CPU load is now always above 50% (when it is involved in the transcription), and the NVIDIA card also shows constant load.
Interesting, thanks for investigating. Could you show me the exact code changes?
In noScribe.py:
In diarize.py:
replaced with:
Saves about 8 minutes.
I also changed in line 344:
Before, it was 650. This fixes the annoying problem that I have to resize the window every time I start the program. It seems that my initial assumption, that something is not able to consume the logs fast enough, might be true? I don't think that Python is just slow, but maybe it is related to the context switches that have to occur every time an output is generated? But then my knowledge of Python is rather limited; I am not even used to the language.
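One way to check this hypothesis would be to measure how much wall-clock time is actually spent inside the output hook. The following is only a minimal sketch: `timed` and `log_to_screen` are hypothetical names, not part of noScribe, and the `time.sleep` call merely stands in for whatever the real hook does.

```python
import time
from functools import wraps


def timed(fn):
    """Wrap fn and accumulate the wall-clock time spent inside it."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs even if fn raises, so every call is counted.
            wrapper.total_seconds += time.perf_counter() - start
            wrapper.calls += 1

    wrapper.total_seconds = 0.0
    wrapper.calls = 0
    return wrapper


@timed
def log_to_screen(line):
    # Hypothetical stand-in for a GUI logging hook; the sleep simulates its cost.
    time.sleep(0.001)
    return len(line)


# Simulate a run that emits 20 log lines.
for i in range(20):
    log_to_screen(f"segment {i}")

print(f"{log_to_screen.calls} calls, {log_to_screen.total_seconds:.3f} s inside the hook")
```

Comparing the accumulated time with the total run time would show whether the hook itself is expensive, or whether the delay comes from somewhere else (for example, from the GUI event loop).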
I fail to see the change in
EDIT: Ah, I've found the difference (
Great. Thanks to your detailed investigation, I might have found the issue. For some reason, updating the progress bar takes much more time than it should. If I leave everything stock (all hooks in place) and just disable the progress bar update, I consistently gain around 10 percent in speed on my non-CUDA system, which is quite surprising. To disable the progress bar update, go to line 659 and add
Could you test this on your system? Remember to revert the other code changes back to stock.
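For illustration only: a gentler alternative to disabling the update would be to throttle it, so the widget is redrawn at most once per interval no matter how often the worker reports progress. This is a sketch under stated assumptions, not noScribe's code. `ThrottledProgress`, `update_fn`, and `min_interval` are hypothetical names, and `update_fn` stands for whatever call actually redraws the bar.

```python
import time


class ThrottledProgress:
    """Forward progress values to update_fn at most once per min_interval seconds."""

    def __init__(self, update_fn, min_interval=0.5, clock=time.monotonic):
        self._update_fn = update_fn        # hypothetical: the real GUI update call
        self._min_interval = min_interval
        self._clock = clock                # injectable so the class can be tested
        self._last_time = None
        self._last_value = None

    def update(self, value, force=False):
        """Return True if the value was forwarded, False if it was skipped."""
        if value == self._last_value and not force:
            return False                   # nothing new to draw
        now = self._clock()
        due = self._last_time is None or now - self._last_time >= self._min_interval
        if not (due or force):
            return False                   # too soon since the last redraw
        self._update_fn(value)
        self._last_time = now
        self._last_value = value
        return True


# Usage: print stands in for the GUI call; only some of the updates get through.
progress = ThrottledProgress(print, min_interval=0.05)
for percent in range(0, 101):
    progress.update(percent, force=(percent == 100))  # always show the final value
    time.sleep(0.001)
```

Whether throttling would recover the same speed as removing the update entirely is untested here; it would need to be measured on an affected system.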
Yes, I can confirm the progress bar is the main culprit.
I wonder if it might be worth decoupling the two processes (diarize and whisper) completely from the GUI and running them as subprocesses to save resources?
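To make the idea concrete, here is a minimal sketch of a worker that runs in a separate Python process and reports back through line-based output on stdout. Everything in it is an assumption for illustration: `WORKER_CODE` is a dummy worker, `run_worker` is a hypothetical helper, and none of it reflects how noScribe actually launches diarization or transcription.

```python
import subprocess
import sys

# Dummy worker: a real one would run the transcription and print its progress.
WORKER_CODE = r"""
for i in range(3):
    print(f"progress {i + 1}/3", flush=True)
print("done", flush=True)
"""


def run_worker(on_line):
    """Run the worker in a child process and pass each output line to on_line."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER_CODE],
        stdout=subprocess.PIPE,
        text=True,  # decode bytes to str and normalize newlines
    )
    with proc.stdout:
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    return proc.wait()


lines = []
exit_code = run_worker(lines.append)
print(exit_code, lines)
```

In a GUI application the reading loop would have to run in a background thread, with updates handed to the GUI thread, so that the window stays responsive. A second interpreter also loads its own copy of every library it imports, so this buys isolation at the cost of memory.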
@phb911: Thank you for testing. It's really strange that updating the progress bar is bogging down the system so much. But I'm happy that we can leave the other user feedback in place (logging to the screen) and still have the performance gains. I will remove the progress bar altogether in the next release, which is planned for December.
Regarding other performance optimizations: Running such AI models is a complex task, and only certain operations benefit from GPU acceleration. PyAnnote and faster-whisper both rely on the PyTorch library (developed and maintained by Meta), which is heavily optimized for CUDA support. I don't think that there is much potential for further improvements.
@gernophil: I'm not completely sure what you mean by running them as subprocesses. The way we do this right now with the compiled diarize.exe consumes more resources, not fewer: a second instance of the Python interpreter is loaded, together with heavy libraries... We had to do this for compatibility reasons, but when it comes to resources, it's not optimal.
Strange issue.
When I run a sample file through noScribe and just let it do its work, it takes 20 minutes for a 30-minute WAV file.
But when I just hold down the mouse on the title bar of noScribe, it takes 2 minutes!
This system is a little bit different from a standard PC, because it is running Windows 10 inside a VM hosted by Server 2022. The NVIDIA card is routed to the VM by PCI passthrough (Hyper-V). The CPU is an Intel i9-14900K, with 64 GB RAM. It is currently the only VM on this Hyper-V host.
I also tested whether it makes a difference if I use remote control software to access the noScribe VM, or if I use RDP or the Hyper-V console. It makes no difference.
What can be done to make this fantastic piece of software run with expected performance?