_pywhispercpp module could not be found #34

Open
kutal10 opened this issue Feb 27, 2024 · 57 comments

Comments

@kutal10

kutal10 commented Feb 27, 2024

Just did a standard PyPI install in my venv as per

pip install pywhispercpp

A standard script with:

import pywhispercpp.model as m

modelPath: str = ...
filePath: str = ...
outputPath: str = ...

model = m.Model(modelPath, n_threads=6)
segments = model.transcribe(filePath, token_timestamps=True, max_len=1)

with open(outputPath, 'w') as file:
    for segment in segments:
        file.write(segment.text + '\n')

Is failing with error:

Traceback (most recent call last):
  File "...\whisper_file.py", line 1, in <module>
    import pywhispercpp.model as m
  File "...\model.py", line 13, in <module>
    import _pywhispercpp as pw
ImportError: DLL load failed while importing _pywhispercpp: The specified module could not be found.

For reference, FFMpeg is installed:

ffmpeg -version
ffmpeg version 4.4-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
@abdeladim-s
Owner

It seems Python cannot load the DLL module for some reason, even though the wheels were built successfully for Windows.
I would suggest building the package from source or using WSL.

@NBNGaming

NBNGaming commented Mar 16, 2024

I confirm this problem exists. Building from source results in the same error.

@abdeladim-s
Owner

@NBNGaming,
if building from source leads to the same error, then I think something is missing from your system, because the Windows wheels were built successfully with GitHub Actions without any issue.
Please make sure that you have the GCC toolchain and that you can compile whisper.cpp on its own first without any problems.

@BBC-Esq

BBC-Esq commented May 29, 2024

I'm getting the same exact error message regarding the .dll file...I searched the "lib" directory after pip installing and there are no .dll files within that directory. Any idea why?

I'm not an expert so forgive me, but am I supposed to build whisper.cpp first and then install pywhispercpp? ggerganov's repo? I'm unfamiliar with building but can learn, but need to know if I have to install whisper.cpp first please. Thanks!

BTW, I don't know what GCC toolchain means...

Windows 10
CUDA 12.5
Intel CPU

@abdeladim-s
Owner

@BBC-Esq, I am not quite sure why this is happening on Windows!
Usually, you don't need to build whisper.cpp if you pip installed the package; it should work out of the box. However, if the pre-built wheel for your system is not working, then you might need to build the project from source.

But, before going through this path, here are some suggestions:

  1. If you are using Python 3.12, try to downgrade to 3.10 or 3.11. This might solve the issue.
  2. Otherwise, if you are unfamiliar with building, use WSL instead; I assume this will work without issues.

@BBC-Esq

BBC-Esq commented May 30, 2024

I'm using Python 3.11. Not familiar with WSL...any other ideas? Have you actually tested it on Windows?

@abdeladim-s
Owner

Apart from the GitHub Action, which ran successfully, I unfortunately didn't run any tests on Windows; I only tested the project on Linux.
I just double-checked the pre-built Windows wheel for Python 3.11,
pywhispercpp-1.2.0-cp311-cp311-win_amd64.whl, and I can see that the .pyd file (the extension-module DLL) is there.
Could you please double-check on your end?

@BBC-Esq

BBC-Esq commented May 30, 2024

Actually, it wasn't that hard to test and it gave me this error:

Traceback (most recent call last):
  File "D:\Scripts\benchmark_whisper\bench_whisper_cpp.py", line 1, in <module>
    from pywhispercpp.model import Model
  File "D:\Scripts\benchmark_whisper\Lib\site-packages\pywhispercpp\model.py", line 13, in <module>
    import _pywhispercpp as pw
ImportError: DLL load failed while importing _pywhispercpp: The specified module could not be found.

I did pip install and then the link to the wheel you gave me.

Here is the directory structure that another script of mine culled...Everything from the "pywhispercpp" directory downwards:

pywhispercpp/
    constants.py
    model.py
    utils.py
    _logger.py
    __init__.py
    examples/
        assistant.py
        livestream.py
        main.py
        recording.py
        __init__.py
        __pycache__/
            assistant.cpython-311.pyc
            livestream.cpython-311.pyc
            main.cpython-311.pyc
            recording.cpython-311.pyc
            __init__.cpython-311.pyc
    __pycache__/
        constants.cpython-311.pyc
        model.cpython-311.pyc
        utils.cpython-311.pyc
        _logger.cpython-311.pyc
        __init__.cpython-311.pyc

I noticed that the .pyd file is one level up within the "site-packages" folder though.
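
For what it's worth, model.py imports _pywhispercpp as a top-level module (see the traceback above), so site-packages is where Python would normally look for the .pyd. Below is a minimal sketch for checking which file Python resolves for a module name without actually loading it. The helper name locate_module is hypothetical and not part of pywhispercpp.

```python
import importlib.util


def locate_module(name):
    """Return the file Python would load for a top-level module, or None.

    find_spec only searches sys.path for a top-level name; it does not
    execute the module, so it will not trigger the DLL load error itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Prints the resolved .pyd path, or None if the module is not on sys.path.
print(locate_module("_pywhispercpp"))
```

If this prints a path to the .pyd but the import still fails with "DLL load failed", the extension file itself was found, and the problem is more likely one of the DLLs it depends on.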

@BBC-Esq

BBC-Esq commented May 30, 2024

Correction in case you didn't see my edit to the above message...The .pyd file is a directory higher in "site-packages".

@abdeladim-s
Owner

@BBC-Esq, so it's there at least. Not sure why Windows cannot find it!
In that case, can you put the .pyd file inside the pywhispercpp directory?
Or, if that doesn't work, try putting it in your current working directory D:\Scripts\benchmark_whisper\ as well?

@BBC-Esq

BBC-Esq commented May 30, 2024

Sure, I'll put it in both at the same time.

@BBC-Esq

BBC-Esq commented May 30, 2024

Same exact error as before. Might it be that you're importing "_pywhispercpp" with an underscore at the beginning instead of simply "pywhispercpp"?

@BBC-Esq

BBC-Esq commented May 30, 2024

The script I'm using is very simple. I even modified it to add appending the system path but am still getting the same error...

import sys
import os
from pywhispercpp.model import Model

# Add the directory containing the .pyd file to the sys.path
sys.path.append(os.path.dirname(os.path.abspath(r"D:\Scripts\benchmark_whisper\Lib\site-packages\_pywhispercpp.cp311-win_amd64.pyd")))

model = Model('base.en', n_threads=6)
segments = model.transcribe(r"D:\Scripts\benchmark_whisper\test_audio_flac_converted.wav", speed_up=True)
for segment in segments:
    print(segment.text)

@abdeladim-s
Owner

No, the _pywhispercpp is the extension module :)

Can you try os.add_dll_directory instead of sys.path.append?

@BBC-Esq

BBC-Esq commented May 30, 2024

Hmm...same as before...

image

@abdeladim-s
Owner

Hmm...same as before...

You should call os.add_dll_directory before the import!
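
The ordering point can be sketched roughly as follows. This is only an illustration: register_dll_dir is a hypothetical helper, and the directory is the site-packages path mentioned elsewhere in this thread, so substitute your own. os.add_dll_directory exists only on Windows (Python 3.8+), so the sketch guards for that.

```python
import os
import sys


def register_dll_dir(path):
    """Add `path` to the DLL search path on Windows.

    Returns the handle from os.add_dll_directory, or None when the call
    does not apply (non-Windows platform, or the directory is missing).
    """
    if sys.platform != "win32" or not hasattr(os, "add_dll_directory"):
        return None
    if not os.path.isdir(path):
        return None
    return os.add_dll_directory(os.path.abspath(path))


# Register the directory FIRST (path taken from this thread; adjust as needed).
handle = register_dll_dir(r"D:\Scripts\benchmark_whisper\Lib\site-packages")

# Only after that, import the package that loads the extension module:
# from pywhispercpp.model import Model
```

Note that this only helps Windows locate dependent DLLs that actually exist in that directory; it does not help if a required DLL is missing altogether.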

@BBC-Esq

BBC-Esq commented May 30, 2024

Same error as before and here's the modified script:

import os

dll_path = r"D:\Scripts\benchmark_whisper\_pywhispercpp.cp311-win_amd64.pyd"
dll_directory = os.path.dirname(dll_path)

os.add_dll_directory(dll_directory)

from pywhispercpp.model import Model

model = Model('base.en', n_threads=6)
segments = model.transcribe(r"D:\Scripts\benchmark_whisper\test_audio_flac_converted.wav", speed_up=True)

for segment in segments:
    print(segment.text)

I also tried this script:

import os
import ctypes

dll_path = r"D:\Scripts\benchmark_whisper\_pywhispercpp.cp311-win_amd64.pyd"
dll_directory = os.path.dirname(dll_path)

with os.add_dll_directory(dll_directory):
    ctypes.CDLL(dll_path)

    from pywhispercpp.model import Model

    model = Model('base.en', n_threads=6)
    segments = model.transcribe(r"D:\Scripts\benchmark_whisper\test_audio_flac_converted.wav", speed_up=True)

for segment in segments:
    print(segment.text)

It gave me a slightly different error:

Traceback (most recent call last):
  File "D:\Scripts\benchmark_whisper\bench_whisper_cpp.py", line 8, in <module>
    ctypes.CDLL(dll_path)
  File "C:\Users\Airflow\AppData\Local\Programs\Python\Python311\Lib\ctypes\__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: Could not find module 'D:\Scripts\benchmark_whisper\_pywhispercpp.cp311-win_amd64.pyd' (or one of its dependencies). Try using the full path with constructor syntax.

NOTE: It says "or one of its dependencies"

@abdeladim-s
Owner

Windows is weird, to be honest!
If it complains about the dependencies, then maybe you need the Windows C++ redistributables or something! A lot of things can go wrong, and I'm not quite sure what the real issue is.
That's why I suggested using WSL!
In that case, try to build whisper.cpp first and see if it works; you can find the instructions in their repo.

@BBC-Esq

BBC-Esq commented May 30, 2024

I asked jeeves and he told me to try "dependency walker" from Microsoft but it didn't work...https://www.dependencywalker.com/

Then I tried https://github.com/lucasg/Dependencies/releases/tag/v1.11.1

The "file" "open" dialog only lets you select .dll files so you'll have to drag and drop the .pyd file...Anyways, this is what it gave me:

image

It seems that "whisper.dll" is missing? So basically, I need to install openai's whisper?

@abdeladim-s
Owner

Great idea, @BBC-Esq.
Here is the whisper.dll file from the ggerganov/whisper.cpp repo.
Maybe put it in the same directory as well; hopefully this will solve the issue.
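
A quick way to confirm that the DLL actually sits next to the .pyd is sketched below. missing_files is a hypothetical helper, and the file names are the ones mentioned in this thread; other builds may depend on a different set.

```python
from pathlib import Path


def missing_files(directory, names):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# Directory and file names taken from this thread; adjust for your setup.
print(missing_files(
    r"D:\Scripts\benchmark_whisper\Lib\site-packages",
    ["_pywhispercpp.cp311-win_amd64.pyd", "whisper.dll"],
))
```

An empty list means both files are present in that directory; anything printed is still missing from it.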

@BBC-Esq

BBC-Esq commented May 30, 2024

Fucking A...it worked. I put it in the "site-packages" folder, the "benchmark_whisper" folder, and the "pywhispercpp" folder and it worked...now it's just a matter of narrowing it down to which folder hierarchy it should be in.

@BBC-Esq

BBC-Esq commented May 30, 2024

...correction, it didn't actually work. It resolved that error, but now I'm getting "failed to compute log mel spectrogram:"

[2024-05-29 22:58:13,706] {utils.py:38} INFO - No download directory was provided, models will be downloaded to C:\Users\Airflow\AppData\Local\pywhispercpp\pywhispercpp\models
[2024-05-29 22:58:13,707] {utils.py:46} INFO - Model base.en already exists in C:\Users\Airflow\AppData\Local\pywhispercpp\pywhispercpp\models
[2024-05-29 22:58:13,707] {model.py:221} INFO - Initializing the model ...
whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\Airflow\AppData\Local\pywhispercpp\pywhispercpp\models\ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU buffer size =   147.46 MB
whisper_model_load: model size    =  147.37 MB
whisper_init_state: kv self size  =   16.52 MB
whisper_init_state: kv cross size =   18.43 MB
whisper_init_state: compute buffer (conv)   =   14.86 MB
whisper_init_state: compute buffer (encode) =   85.99 MB
whisper_init_state: compute buffer (cross)  =    4.78 MB
whisper_init_state: compute buffer (decode) =   96.48 MB
[2024-05-29 22:58:14,083] {model.py:130} INFO - Transcribing ...
whisper_full_with_state: failed to compute log mel spectrogram
[2024-05-29 22:58:14,083] {model.py:133} INFO - Inference time: 0.000 s

@abdeladim-s
Owner

Finally! So that's what was missing! I should find a way to include it with the wheel!
For the "failed to compute log mel spectrogram" error, just remove speed_up=True or set it to False!

@BBC-Esq

BBC-Esq commented May 30, 2024

I/we fixed it! Mind explaining to me what whisper.dll is and why we need it? Also, I have cuda installed. How can I use gpu acceleration in my script?

import os
import ctypes

dll_directory = r"D:\Scripts\benchmark_whisper\Lib\site-packages"
ctypes.windll.kernel32.SetDllDirectoryW(dll_directory)

ctypes.CDLL(os.path.join(dll_directory, '_pywhispercpp.cp311-win_amd64.pyd'))

from pywhispercpp.model import Model

model = Model(r"D:\Scripts\benchmark_whisper\Models\ggml-large-v2-q5_0.bin", n_threads=6)
segments = model.transcribe(r"D:\Scripts\benchmark_whisper\test_audio_flac_converted.wav", speed_up=False)

for segment in segments:
    print(segment.text)

@BBC-Esq

BBC-Esq commented May 30, 2024

...do tell me you've implemented the gpu acceleration like cuda, etc. in your repo? I don't see anything in the documentation about a cuda let alone any other gpu acceleration parameters... ;-)

@abdeladim-s
Owner

The whisper.dll is very simply the file containing the executable code of whisper.cpp.
Unfortunately GPU is not supported at the moment, I didn't find time to add it!

@BBC-Esq

BBC-Esq commented May 30, 2024

Damn, was looking forward to benchmarking cuda whisper.cpp. Anyways, fun little adventure tonight. Have a good one man. I'll follow new releases for when you include the .dll and/or support GPU. How hard can it be, to support cuda, vulkan, openblas, etc. ;-)

@abdeladim-s
Owner

It shouldn't be so hard I assume, but it'll need time, this is the problem :)
But feel free to take a look at the code as well, PRs are always welcome!

@UsernamesLame
Contributor

Damn, was looking forward to benchmarking cuda whisper.cpp. Anyways, fun little adventure tonight. Have a good one man. I'll follow new releases for when you include the .dll and/or support GPU. How hard can it be, to support cuda, vulkan, openblas, etc. ;-)

Hey, Vulkan support is here and CUDA!

@BBC-Esq

BBC-Esq commented Aug 31, 2024

I just discovered this and plan to benchmark it finally - so excited!

@UsernamesLame
Contributor

I just discovered this and plan to benchmark it finally - so excited!

Please do open an issue to post the benchmarks!

Also look into what I've been up to here: #49 (comment)

@abdeladim-s
Owner

@BBC-Esq, looking forward to it! 👍

@BBC-Esq

BBC-Esq commented Aug 31, 2024

@BBC-Esq, looking forward to it! 👍

If you'll recall, previously I helped troubleshoot the missing .dll issue and you gave me a link: #34 (comment)

Now that it has CUDA and Vulkan support can you help me get this up and running...here's where I'm stuck at.

  1. I created a virtual environment and activated it.

  2. I set the environment variable.

set WHISPER_CUDA=1

  3. Installed

pip install pywhispercpp

I ran the same script that I used previously...remember, the whisper.dll file needed to be within the site-packages directory. I'm on Windows, hence my different command to set the environment variable.

Anyways, it ran but only on CPU. Am I correct in assuming that I need something other than whisper.dll now that I'm installing with the cuda flag? I see these on Whisper.cpp?

image

These are under their Whisper.cpp 1.6.0 release...

[EDIT]

I also tried this instead but it still didn't work:

pip install git+https://github.com/abdeladim-s/pywhispercpp

@abdeladim-s
Owner

@BBC-Esq, of course.

  • To use CUDA you'll need to compile the package from source; the PyPI version is built using GitHub Actions and is CPU-only. The steps are in the readme.

  • The issue with whisper.dll is now fixed I believe, so you don't need to copy anything!

If you follow the steps, it should be working basically, but let me know if you run into any issues.

@BBC-Esq

BBC-Esq commented Aug 31, 2024

@BBC-Esq, of course.

* To use CUDA you'll need to compile the package from source, the PYPI version is built using Github actions and it's CPU only.  The steps are in the [readme](https://github.com/abdeladim-s/pywhispercpp#nvidia-gpu-support).

* The issue with whisper.dll is now fixed I believe,  so you don't need to copy anything!

If you follow the steps, it should be working basically, but let me know if you run into any issues.

I followed the installation instructions and ran my script, but I received this error:

Traceback (most recent call last):
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\bench_whispercpp.py", line 9, in <module>
    from pywhispercpp.model import Model
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\Lib\site-packages\pywhispercpp\__init__.py", line 11, in <module>
    os.add_dll_directory(os.path.join(os.path.dirname(__file__), 'lib'))
  File "<frozen os>", line 1119, in add_dll_directory
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'D:\\Scripts\\bench_whisper_github\\bench_whispercpp\\pywhispercpp\\Lib\\site-packages\\pywhispercpp\\lib'

To clarify, after using cd pywhispercpp and pip install ., I backed out one directory by using cd .. to where my script is located...then ran it:

python bench_whispercpp.py

At that point is when I received the error...

Here is the entire script I'm trying to run:

from pywhispercpp.model import Model

model = Model(r"D:\Scripts\benchmark_whisper\Models\ggml-base.en.bin", n_threads=14)
segments = model.transcribe(r"D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\test_small_audio.wav", speed_up=False)

for segment in segments:
    print(segment.text)

If it helps...I'm on Windows and use python -m venv . to create the virtual environment. Then I use .\Scripts\activate to activate it. After that is when I begin your installation instructions.

Not sure if it's an issue of putting the .dll and other files in the correct folder on a Windows system but...if this helps...beginning from the directory where my benchmarking script is located...you go into the Lib folder...then the site-packages folder...and that's where the pywhispercpp folder is located along with the other dependencies. Your instructions are geared towards Linux I believe, which I have no experience with...so just so you know the structure on Windows in case it's different somehow?

@abdeladim-s
Owner

abdeladim-s commented Aug 31, 2024

So was the compilation successful at least?
Yes, the instructions work on my Linux machine without any issues 😅

Could you please check whether there is a lib folder at the path 'D:\\Scripts\\bench_whisper_github\\bench_whispercpp\\pywhispercpp\\Lib\\site-packages\\pywhispercpp\\lib'?
Also, is there one inside the cloned source, 'pywhispercpp/lib'?

@BBC-Esq

BBC-Esq commented Aug 31, 2024

THERE IS a "Lib" folder within that path...but my script is one directory higher at:

"D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\bench_whispercpp.py"

To clarify...here are the files I believe you're looking for:

"D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp\pywhispercpp\lib\_pywhispercpp.cp311-win_amd64.pyd"

"D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp\pywhispercpp\lib\ggml.dll"

"D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp\pywhispercpp\lib\whisper.dll"

Hold tight...I'm going to give you the entire command prompt output from start to finish...won't "cls" in between.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

Here's the entire command prompt. What's throwing me is the requirement to "cd" to a new directory after cloning. Do I have to move my benchmarking script to the cloned directory...intermixing it with all of this, which I was trying to avoid?
image

Here's the entire print:

PRINT DETAILS
D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp>python -m venv .

D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp>.\Scripts\activate

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp>git clone --recursive https://github.com/abdeladim-s/pywhispercpp
Cloning into 'pywhispercpp'...
remote: Enumerating objects: 753, done.
remote: Counting objects: 100% (181/181), done.
remote: Compressing objects: 100% (104/104), done.
remote: Total 753 (delta 99), reused 139 (delta 73), pack-reused 572 (from 1)
Receiving objects: 100% (753/753), 1.43 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (253/253), done.
Submodule 'whisper.cpp' (https://github.com/ggerganov/whisper.cpp.git) registered for path 'whisper.cpp'
Cloning into 'D:/Scripts/bench_whisper_github/bench_whispercpp/pywhispercpp/pywhispercpp/whisper.cpp'...
remote: Enumerating objects: 10690, done.
remote: Counting objects: 100% (3662/3662), done.
remote: Compressing objects: 100% (526/526), done.
remote: Total 10690 (delta 3294), reused 3215 (delta 3135), pack-reused 7028 (from 1)
Receiving objects: 100% (10690/10690), 14.97 MiB | 28.08 MiB/s, done.
Resolving deltas: 100% (7246/7246), done.
Submodule path 'whisper.cpp': checked out '9e3c5345cd46ea718209db53464e426c3fe7a25e'

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp>cd pywhispercpp

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp>set WHISPER_CUDA=1

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp>pip install .
Processing d:\scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting numpy (from pywhispercpp==1.2.0)
  Using cached numpy-2.1.0-cp311-cp311-win_amd64.whl.metadata (59 kB)
Collecting pydub (from pywhispercpp==1.2.0)
  Using cached pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting requests (from pywhispercpp==1.2.0)
  Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting tqdm (from pywhispercpp==1.2.0)
  Using cached tqdm-4.66.5-py3-none-any.whl.metadata (57 kB)
Collecting platformdirs (from pywhispercpp==1.2.0)
  Using cached platformdirs-4.2.2-py3-none-any.whl.metadata (11 kB)
Collecting charset-normalizer<4,>=2 (from requests->pywhispercpp==1.2.0)
  Using cached charset_normalizer-3.3.2-cp311-cp311-win_amd64.whl.metadata (34 kB)
Collecting idna<4,>=2.5 (from requests->pywhispercpp==1.2.0)
  Using cached idna-3.8-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->pywhispercpp==1.2.0)
  Using cached urllib3-2.2.2-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->pywhispercpp==1.2.0)
  Using cached certifi-2024.8.30-py3-none-any.whl.metadata (2.2 kB)
Collecting colorama (from tqdm->pywhispercpp==1.2.0)
  Using cached colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Using cached numpy-2.1.0-cp311-cp311-win_amd64.whl (12.9 MB)
Using cached platformdirs-4.2.2-py3-none-any.whl (18 kB)
Using cached pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Using cached tqdm-4.66.5-py3-none-any.whl (78 kB)
Using cached certifi-2024.8.30-py3-none-any.whl (167 kB)
Using cached charset_normalizer-3.3.2-cp311-cp311-win_amd64.whl (99 kB)
Using cached idna-3.8-py3-none-any.whl (66 kB)
Using cached urllib3-2.2.2-py3-none-any.whl (121 kB)
Using cached colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Building wheels for collected packages: pywhispercpp
  Building wheel for pywhispercpp (pyproject.toml) ... done
  Created wheel for pywhispercpp: filename=pywhispercpp-1.2.0-cp311-cp311-win_amd64.whl size=121752 sha256=4a6eeffc7c42332227066155eedb215b194e635720f852388d4f7e7ce419ffa2
  Stored in directory: C:\Windows\Temp\pip-ephem-wheel-cache-96sxufgy\wheels\4f\6c\4e\0371a84c983074d706eaa9a6dd11dc195095ae1638625bb5ce
Successfully built pywhispercpp
Installing collected packages: pydub, urllib3, platformdirs, numpy, idna, colorama, charset-normalizer, certifi, tqdm, requests, pywhispercpp
Successfully installed certifi-2024.8.30 charset-normalizer-3.3.2 colorama-0.4.6 idna-3.8 numpy-2.1.0 platformdirs-4.2.2 pydub-0.25.1 pywhispercpp-1.2.0 requests-2.32.3 tqdm-4.66.5 urllib3-2.2.2

[notice] A new release of pip is available: 24.0 -> 24.2
[notice] To update, run: python.exe -m pip install --upgrade pip

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp>python -m pip install --upgrade pip
Requirement already satisfied: pip in d:\scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\lib\site-packages (24.0)
Collecting pip
  Using cached pip-24.2-py3-none-any.whl.metadata (3.6 kB)
Using cached pip-24.2-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.0
    Uninstalling pip-24.0:
      Successfully uninstalled pip-24.0
Successfully installed pip-24.2

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\pywhispercpp>cd ..

(pywhispercpp) D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp>python bench_whispercpp.py
Traceback (most recent call last):
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\bench_whispercpp.py", line 1, in <module>
    from pywhispercpp.model import Model
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp\Lib\site-packages\pywhispercpp\__init__.py", line 11, in <module>
    os.add_dll_directory(os.path.join(os.path.dirname(__file__), 'lib'))
  File "<frozen os>", line 1119, in add_dll_directory
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'D:\\Scripts\\bench_whisper_github\\bench_whispercpp\\pywhispercpp\\Lib\\site-packages\\pywhispercpp\\lib'

I am only creating my virtual environment at "D:\Scripts\bench_whisper_github\bench_whispercpp\pywhispercpp"...but then your instructions lead me to then "cd" to the cloned directory...the build seems to occur successfully, but then the crucial files are not in the folder where the dependencies are for my specific virtual environment. I apologize, trying my best here.

@abdeladim-s
Owner

abdeladim-s commented Sep 1, 2024

Thanks for the detailed answer. At least all the shared libraries seem to be there. But it's weird that it does not see the lib folder.
Even if you are in a different directory it should work, because the line that's causing the issue,

os.add_dll_directory(os.path.join(os.path.dirname(__file__), 'lib'))

looks in the actual directory of the __init__ file, not in the current working directory.
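
To see which directory that line would resolve to for the pywhispercpp copy Python actually picks up, a small diagnostic along these lines may help. expected_lib_dir is a hypothetical helper; it mirrors the path computation in the line above but does not import the package, so it will not raise the FileNotFoundError itself.

```python
import importlib.util
import os


def expected_lib_dir(package):
    """Return (lib_dir, exists) for `<package directory>/lib`.

    Returns (None, False) if the package cannot be found on sys.path.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None, False
    # spec.origin is the package's __init__.py, i.e. what __file__ would be.
    lib_dir = os.path.join(os.path.dirname(spec.origin), "lib")
    return lib_dir, os.path.isdir(lib_dir)


lib_dir, exists = expected_lib_dir("pywhispercpp")
print(lib_dir, exists)
```

If the printed path is not inside the environment you built into, or the second value is False, then the copy being imported is not the one that contains the lib folder.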

To avoid any path confusion, here are my suggestions:

  1. First, create and activate a virtual environment, preferably with a name other than pywhispercpp.
  2. Clone the repo into another folder, not inside the environment folder.
  3. Compile and install the same way you did before.
  4. Now leave that folder and go to another folder (name it "bench_whispercpp", for example) and put your script there.

Please follow these instructions again and let me know how it goes.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

When you say "go to another folder"...what do you mean?

Here is where my virtual environment will be created:

"D:\Scripts\bench_whisper_github\bench_whispercpp"

This is where my bench_whispercpp.py script is by default.

Here is the "new" folder I created:

"D:\Scripts\bench_whisper_github\cloned_pywhispercpp"

After I build, the "new" folder has this folder:

"D:\Scripts\bench_whisper_github\cloned_pywhispercpp\pywhispercpp"

Within this folder there is yet another folder named pywhispercpp:

"D:\Scripts\bench_whisper_github\cloned_pywhispercpp\pywhispercpp\pywhispercpp"

Finally, within this folder are these files:

image

Now where am I supposed to place my bench_whispercpp.py script?

@abdeladim-s
Owner

for example D:\Scripts\bench_whisper_github\my_scripts.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

I created that folder and moved bench_whispercpp.py there...and without creating yet another virtual environment simply ran the script. this is what I received:

D:\Scripts\bench_whisper_github\my_scripts>python bench_whispercpp.py
Traceback (most recent call last):
  File "D:\Scripts\bench_whisper_github\my_scripts\bench_whispercpp.py", line 1, in <module>
    from pywhispercpp.model import Model
  File "C:\Users\Airflow\AppData\Local\Programs\Python\Python311\Lib\site-packages\pywhispercpp\__init__.py", line 11, in <module>
    os.add_dll_directory(os.path.join(os.path.dirname(__file__), 'lib'))
  File "<frozen os>", line 1119, in add_dll_directory
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\\Users\\Airflow\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\pywhispercpp\\lib'

@abdeladim-s
Owner

Still cannot find it !!
And the folder is there 'C:\\Users\\Airflow\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\pywhispercpp\\lib' ?

@BBC-Esq

BBC-Esq commented Sep 1, 2024

I also manually copied and pasted these files:

image

...to the site-packages folder where I created my virtual environment initially. In other words, rather than pip install pywhispercpp from my virtual environment, [EDIT] I took these two folders in their entirety.

image

Specifically, I pasted them to "D:\Scripts\bench_whisper_github\Lib\site-packages", which is within the dependency folder for my virtual environment...

Then when I ran the script, it gave me a different error this time:

Traceback (most recent call last):
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\bench_whispercpp.py", line 1, in <module>
    from pywhispercpp.model import Model
  File "D:\Scripts\bench_whisper_github\bench_whispercpp\Lib\site-packages\pywhispercpp\model.py", line 13, in <module>
    import _pywhispercpp as pw
ModuleNotFoundError: No module named '_pywhispercpp'

@BBC-Esq

BBC-Esq commented Sep 1, 2024

Still cannot find it !! And the folder is there 'C:\\Users\\Airflow\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\pywhispercpp\\lib' ?

I'm trying to avoid installing it system-wide...Are you saying it's mandatory since I'm building it or something?

image

This is what that path currently shows...Maybe from a prior installation, I don't know, but I never meant to install pywhispercpp system-wide...

@abdeladim-s
Owner

No, you don't need any manual copy if the wheel was successfully installed and the lib folder is there.
If the lib folder is there and the system cannot find it then I have no idea to be honest.

@abdeladim-s
Owner

Still cannot find it !! And the folder is there 'C:\\Users\\Airflow\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\pywhispercpp\\lib' ?

I'm trying to avoid installing it system-wide...Are you saying it's mandatory since I'm building it or something?

image

This is what that path currently shows...Maybe from a prior installation, I don't know, but I never meant to install pywhispercpp system-wide...

No, there's no need to install it system-wide; an activated virtual env is better!
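
The traceback from the my_scripts run points at C:\Users\Airflow\AppData\Local\Programs\Python\Python311\Lib\site-packages, which suggests the script was run with the system-wide interpreter rather than a virtual environment. A minimal sketch for checking which interpreter a script is running under (interpreter_report is a hypothetical helper name):

```python
import sys


def interpreter_report():
    """Summarise which Python is running and whether it is a virtual env."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # In a venv, sys.prefix points at the env and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

Running this from the same prompt as the benchmark script shows whether the environment you installed into is the one actually executing the script.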

@BBC-Esq

BBC-Esq commented Sep 1, 2024

How about a screenshare, if anything, just for the sake of saving time? Discord?

@abdeladim-s
Owner

Yeah, that would save us a lot of time, but I am not home at the moment :(

@abdeladim-s
Owner

abdeladim-s commented Sep 1, 2024

BTW, one more suggestion: can you build and share the wheel?

Inside the cloned directory, run

python -m build --wheel 

you'll find the generated wheel inside the dist directory.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

What's your Discord name? Mine is Vic49. Honestly, I'm benching ctranslate2, transformers, tensorrt, etc. and I'd love to finally include a whispercpp implementation, but I have to allocate time to bench all backends. Thus, I'd suggest a Discord screenshare to save time on my end...otherwise, I'll have to wait until a cuda build is pip-installable...

But I saw your message and will try this last thing tonight. ;-)

@abdeladim-s
Owner

not sure when I'll come back to be honest, I will try to contact you.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

You didn't tell me I had to pip install "build". Luckily Claude saved me. ;-) Seems to have successfully built.

removing build\bdist.win-amd64\wheel
Successfully built pywhispercpp-1.2.0-cp311-cp311-win_amd64.whl

The discord wouldn't be tonight...just whenever, you have my info.

Will try this wheel and let you know if that puts it to rest.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

I created the virtual environment and activated it...then simply pip installed the wheel within the virtual environment. It successfully created this in my site-packages directory:

image

However, when I ran my script it only used CPU. Here's the script again for easy reference:

from pywhispercpp.model import Model

model = Model(r"D:\Scripts\benchmark_whisper\Models\ggml-base.en.bin", n_threads=14)
segments = model.transcribe(r"D:\Scripts\bench_whisper_github\bench_whispercpp\test_small_audio.wav")

for segment in segments:
    print(segment.text)

Hit me up on Discord when you want. Thanks.

@abdeladim-s
Owner

abdeladim-s commented Sep 1, 2024

Finally worked 😮 Windows is just weird!

Did you set the flag before building the wheel?
If yes, try with the new flag

GGML_CUDA=1

I think this one will work.
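
One way to make sure the flag is visible to the build itself, instead of relying on a "set" typed earlier in the prompt, is to pass it explicitly in the environment of the build command. The sketch below is only an illustration: build_env is a hypothetical helper, the flag names are the two mentioned in this thread, and the build command is left commented out so nothing is built by accident.

```python
import os


def build_env(flag="GGML_CUDA", value="1", base=None):
    """Return a copy of the environment with the given build flag set."""
    env = dict(os.environ if base is None else base)
    env[flag] = value
    return env


env = build_env()  # or build_env("WHISPER_CUDA") for the older flag name
print(env["GGML_CUDA"])

# Run from inside the cloned pywhispercpp directory (assumption: pip builds
# the wheel in a subprocess that inherits this environment):
# import subprocess, sys
# subprocess.run([sys.executable, "-m", "pip", "install", "."], env=env, check=True)
```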

Anyways, here is my discord abdeladim_s, hit me up if you find any other issues.

@BBC-Esq

BBC-Esq commented Sep 1, 2024

When you have time for a quick Discord screenshare let's try it. I'm in Eastern timezone (u.s.) just FYI.

@Yamakaze-chan

Fucking A...it worked. I put it in the "site-packages" folder, the "benchmark_whisper" folder, and the "pywhispercpp" folder and it worked...now it's just a matter of narrowing it down to which folder hierarchy it should be in.

I put it in "site-packages" and it works.
