Installed this to use Existing WhisperX installation Conda (that uses my Nvidia GPU) and chose GPU option, but it only uses CPU #15

Open
cleverestx opened this issue Feb 25, 2024 · 18 comments
Labels
bug: Something isn't working
documentation: Improvements or additions to documentation
enhancement: New feature or request

Comments

@cleverestx

The GUI won't let me switch from CPU to GPU. Did I miss something? The instance of WhisperX from the command line IS using my Nvidia GPU.

I'm using Windows 11.

image

@Pikurrot
Owner

Hi,
If you chose GPU when the program asked, then this behavior is not expected. However, you should be able to fix it as follows: check the config.json file inside the configs folder and make sure it contains a field named "gpu_support" that is set to true, like this:

{
    "env_name": "<YOUR_ENVIRONMENT_NAME>",
    "gpu_support": true,
    "auto_update": true
}

Rerun the GUI and you should be able to select the "cuda" option for GPU.
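
The check described above can also be scripted. The following is only a minimal sketch: the helper name check_gpu_support is made up for illustration, and the path configs/config.json is assumed to be relative to the folder the script is run from.

```python
import json
from pathlib import Path


def check_gpu_support(config_path):
    """Return (ok, message) describing the "gpu_support" field in config.json."""
    path = Path(config_path)
    if not path.is_file():
        return False, f"{path} does not exist"
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return False, f"{path} is not valid JSON: {exc}"
    if not isinstance(config, dict) or "gpu_support" not in config:
        return False, 'no "gpu_support" field found'
    if config["gpu_support"] is not True:
        return False, f'"gpu_support" is {config["gpu_support"]!r}, expected true'
    return True, '"gpu_support" is true'


if __name__ == "__main__":
    # Assumed location: configs/config.json relative to the current directory.
    ok, message = check_gpu_support(Path("configs") / "config.json")
    print(("OK: " if ok else "Problem: ") + message)
```

If the script reports a problem, editing the file by hand as shown above is still the fix; the script only reads the file and never changes it.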

Note that this only works if your environment was set up correctly and PyTorch for GPU is installed. If it doesn't work, I suggest creating a new environment, which is the recommended way. To do that, first delete the config.json file, then run the whisper-gui.bat file as if it were the first time.
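
Whether PyTorch for GPU is installed can be checked from inside the environment before recreating it. This is a rough sketch that relies only on torch.cuda.is_available() and torch.version.cuda; the function name describe_torch_cuda is hypothetical, and the script has to be run with the Python interpreter of the environment the GUI uses.

```python
def describe_torch_cuda():
    """Summarize whether PyTorch is importable and whether it can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "cuda_build": None}
    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        "cuda_build": torch.version.cuda,
    }


if __name__ == "__main__":
    print(describe_torch_cuda())
```

A result with "cuda_build" set to None would suggest a CPU-only PyTorch build, in which case the "cuda" option cannot work regardless of what config.json says.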

Let me know if you have more trouble and, if so, attach the logs that appear in the terminal when you set up the program for the first time, so that I can know exactly when the bug appears.

I hope this helps.

@cleverestx
Author

Thank you for the response, but strangely enough I don't appear to have a config.json file in that folder or anywhere else. I only have these two files in the config folder:

image

Since the file is missing entirely, which is the same as if I had "deleted" it as per your instructions above, I ran whisper-gui.bat and got this error/result:

image

??

@Pikurrot
Owner

This seems to be an error when installing additional dependencies. I will look into it and try to reproduce it.

@cleverestx
Author

Thanks. Let me know if you need anything else from me.

@Pikurrot added the bug label Feb 26, 2024
@Pikurrot
Owner

Hi, I could reproduce your error on Windows, and it should be fixed now. I suggest you delete your config.json file if you had one, and run whisper-gui.bat again, choosing to create a new environment.

However, I realized that some errors may arise due to the latest releases of some packages:

  • RuntimeError: Unsupported model binary version: this is easy to fix. Just delete the downloaded Whisper model (inside models/whisperx/) and let the GUI install the latest version of that model.
  • RuntimeError: Library cublas64_12.dll is not found or cannot be loaded: unfortunately, this error seems to come from the latest major release of CTranslate2, which added support for CUDA 12 but now seems to have problems with CUDA 11. I hope they fix it soon; I will stay tuned.

For now, this last error only happens on Windows for me, I had no problems running the program on Linux. Maybe you could find a workaround with WSL.
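
To see which cuBLAS DLLs are visible on a Windows machine, one option is to scan the directories listed in PATH. This is a sketch under assumptions: find_on_path is a made-up helper, cublas64_11.dll is the CUDA 11 counterpart of the DLL named in the error, and PATH is only one of the places Windows searches for DLLs, so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_on_path(filename, search_path=None):
    """Return every directory on the search path that contains `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        # Skip empty entries, which appear when PATH has stray separators.
        if entry and (Path(entry) / filename).is_file():
            hits.append(Path(entry))
    return hits


if __name__ == "__main__":
    for dll in ("cublas64_12.dll", "cublas64_11.dll"):
        hits = find_on_path(dll)
        print(dll, "->", [str(h) for h in hits] or "not found on PATH")
```

Finding only cublas64_11.dll would be consistent with the error above: a CUDA 11 toolkit is installed while the library asks for the CUDA 12 DLL.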

@cleverestx
Author

cleverestx commented Feb 26, 2024

Thanks for the details! I guess I'm stuck waiting for CTranslate2 to be fixed.... :-(

I can try WSL...I have Ubuntu installed, but I have WhisperX already working via command line in Windows. I guess I can reinstall everything there for now.

Same issue with my Ubuntu WSL installation....odd, I guess I'll just wait.

Also, I don't have a models\whisperx\ folder to delete a model from in the Windows version (to fix the first thing you listed), but I'll re-run the command anyway, so I guess I'm just waiting in either case.

@Pikurrot
Owner

I've posted an issue in the CTranslate2 repo: OpenNMT/CTranslate2#1630
Let's hope they fix it soon :)

@cleverestx
Author

I saw this elsewhere.....if it helps...

SmartSelect_20240227_125709_Chrome

@Pikurrot
Owner

Yeah, so basically, there appear to be 2 solutions:

  1. CTranslate2 now only supports CUDA 12.x, so install the latest version of Nvidia CUDA. That alone should make it work, even with pytorch-cuda=11.8.
  2. Rename the file cublas64_11.dll to cublas64_12.dll in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin". That is all it takes; there is no need to install a newer CUDA or pytorch-cuda. However, I think this is a temporary fix that won't remain useful in the future, but for now it's an option. Also, I don't know whether it has other undesired consequences.
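
The second option could be scripted roughly as follows. Note that this sketch deliberately copies the DLL instead of renaming it, which differs from the instruction above, so that the original cublas64_11.dll stays in place for other programs. The helper name alias_cublas and the dry_run flag are made up for illustration, and writing to a folder under Program Files normally requires an elevated (administrator) prompt.

```python
import shutil
from pathlib import Path


def alias_cublas(bin_dir, dry_run=True):
    """Copy cublas64_11.dll to cublas64_12.dll inside bin_dir.

    Returns the target path, or None if nothing was (or would be) done.
    """
    source = Path(bin_dir) / "cublas64_11.dll"
    target = Path(bin_dir) / "cublas64_12.dll"
    # Do nothing if the CUDA 11 DLL is missing or a CUDA 12 DLL already exists.
    if not source.is_file() or target.exists():
        return None
    if not dry_run:
        shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Directory quoted in the comment above; adjust it to the local CUDA install.
    bin_dir = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"
    print(alias_cublas(bin_dir, dry_run=True))
```

With dry_run=True nothing is written; the function only reports which file it would create. The same caveat applies as for the manual rename: it is unclear whether loading a CUDA 11 library under the CUDA 12 name has side effects.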

Let me know if that solved your problem.

@cleverestx
Author

I think I've already tried number 1, and it still had a problem...but I'm gonna double check that. I will let you know!

@cleverestx
Author

cleverestx commented Feb 28, 2024

Hmm, well I just ran whisper-gui.bat and it asked me to update, and enable auto-updates and then started up. I can select CUDA now, so I guess it's all working now! :-)

One more question: how do I set --task translate in this GUI for a given video/audio file? I use this flag for most of the videos I process through WhisperX.

Thank you for all the help and info.

@Pikurrot
Owner

I'm glad it's working for you now!
About --task translate, it's still not an available option in the GUI, but I will take it into account for a future update, thanks for pointing it out!
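
Until the GUI offers it, translation can still go through the WhisperX command line, as before. The sketch below only assembles that command; build_whisperx_command is a hypothetical helper, "video.mp4" is a placeholder file name, and the whisperx executable is assumed to be on PATH in the active environment.

```python
def build_whisperx_command(media_path, task="translate", device=None):
    """Build the argument list for a WhisperX command-line run."""
    command = ["whisperx", str(media_path), "--task", task]
    if device is not None:
        # --device selects where inference runs, e.g. "cuda" or "cpu".
        command += ["--device", device]
    return command


if __name__ == "__main__":
    command = build_whisperx_command("video.mp4", device="cuda")
    print(" ".join(command))
    # To actually run it inside the WhisperX environment:
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Keeping the command as a list rather than one string avoids quoting problems with file names that contain spaces.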

I'm closing this issue as it's been solved. Feel free to open a new issue for a new suggestion for this project or if you encounter any more bugs. Thanks.

@Pikurrot added the documentation and enhancement labels Feb 28, 2024
@chrisangel666

Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

@chrisangel666

Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

@chrisangel666

Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

@Pikurrot
Owner

> Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

Hi, thank you!
I also wanted to add an option to change the language. We can work on that together if you want. Just tell me, and we can open a discussion page in this repo to talk about it.

> Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

Yes, I would say the transcription should be more accurate.

> Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

This is fine, not a problem. It is just a warning that you can avoid by installing torchvision, but you can safely ignore it.

@Pikurrot reopened this Jun 25, 2024
@chrisangel666

Nice. How can I add the corresponding wav2vec2 model to the options? Does that mean modifying the corresponding item in the code? If so, how?

@Pikurrot
Owner

Wav2vec2 is an alternative, different model to Whisper. If we were to use it, we would need to add an option to choose between both, and modify some functions like _transcribe() and add a function transcribe_wav2vec2().
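
The shape of such a change could look something like this. It is purely a sketch: only the name transcribe_wav2vec2() comes from the comment above, while the signatures, the BACKENDS table, and the placeholder return values are assumptions, and no model is actually loaded.

```python
from typing import Callable, Dict


def transcribe_whisper(audio_path: str) -> str:
    """Placeholder standing in for the existing Whisper path."""
    return f"[whisper transcript of {audio_path}]"


def transcribe_wav2vec2(audio_path: str) -> str:
    """Placeholder for the proposed wav2vec2 path; no model is loaded here."""
    return f"[wav2vec2 transcript of {audio_path}]"


# Hypothetical table mapping the value of a GUI option to a function.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "whisper": transcribe_whisper,
    "wav2vec2": transcribe_wav2vec2,
}


def transcribe(audio_path: str, backend: str = "whisper") -> str:
    """Dispatch to the backend chosen in the (hypothetical) GUI option."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; choose from {sorted(BACKENDS)}"
        ) from None
    return handler(audio_path)


if __name__ == "__main__":
    print(transcribe("sample.wav", backend="wav2vec2"))
```

A lookup table like this keeps the choice between models in one place, so adding another backend later would mean adding one function and one table entry.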

Are you interested in contributing?

3 participants