-
Can you try running the following command and pasting the full output?
-
The issue still persists, unfortunately. The really annoying part is that upon inspecting the modules I find the import code, which should be working after pip install autoawq-kernels
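One way to check whether the kernels actually resolve in the interpreter you are running (a sketch; `awq_ext` is assumed here to be the extension module name the autoawq-kernels wheel installs, so adjust it if your version differs):

```python
import importlib.util


def kernel_modules_available(names=("awq_ext",)):
    """Return the subset of extension module names Python can find.

    If "awq_ext" (assumed name) is missing, the installed wheel likely
    does not match your torch/CUDA build, or it was installed into a
    different environment than the one running this script.
    """
    return [n for n in names if importlib.util.find_spec(n) is not None]


if __name__ == "__main__":
    print(kernel_modules_available())
```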
-
How long did you let it hang? Did you check whether it loaded onto your GPU? Do you have WSL2 installed?
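To see whether anything loaded onto the GPU while it hangs, you can watch `nvidia-smi`. A small sketch that parses the output of the standard used-memory query (`nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`):

```python
def parse_gpu_memory_used(smi_output: str) -> list[int]:
    """Parse `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    output into a list of per-GPU used-memory values in MiB.

    If every value stays near zero while the script "runs", the model
    never made it onto the GPU.
    """
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]
```

Usage: pipe the command's output into this function (e.g. via `subprocess.check_output`) and poll it every few seconds.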
-
I do not have WSL2, that must be it! I let it hang for over an hour or so before writing here. Roughly how long is it supposed to run?
-
For this model, it should be 7-20 minutes, depending on how performant the setup is.
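As a rough back-of-the-envelope check: AWQ quantizes one decoder layer at a time, so total runtime scales roughly linearly with layer count (Llama 3 8B has 32 decoder layers; the per-layer time below is purely illustrative):

```python
def estimate_quant_minutes(num_layers: int, seconds_per_layer: float) -> float:
    """Rough ETA for layer-by-layer quantization.

    seconds_per_layer varies widely with GPU, calibration set size,
    and group size; time one layer from the progress bar and
    extrapolate from there.
    """
    return num_layers * seconds_per_layer / 60.0
```

For example, 30 seconds per layer over 32 layers gives about 16 minutes, squarely inside the 7-20 minute range above.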
-
Hey Casper, sorry for the noob questions ahead. I have managed to install WSL2 Ubuntu,
and to run python setup.py install, with the results as above. Before that, I installed the CUDA-compatible torch build alongside CUDA 11.5, as shown below.
But when I try to run the quantization code, it doesn't find the awq module:
output
I have been really stuck on this since yesterday and I feel like I am just inches away. Are you maybe aware of what the issue is here?
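A frequent cause of "module not found" right after an apparently successful install is that pip installed into a different interpreter than the one running the script, which is easy to do when switching between Windows and WSL2. A small diagnostic sketch:

```python
import importlib.util
import sys


def diagnose_module(name: str) -> str:
    """Report where the current interpreter would load `name` from,
    or that it cannot find it at all.

    Comparing sys.executable against the python used for `pip install`
    quickly reveals a mismatched-environment problem.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name!r} not found by {sys.executable}"
    return f"{name!r} -> {spec.origin}"


if __name__ == "__main__":
    print(diagnose_module("awq"))
```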
-
Hi @Endote, it looks like you installed the wrong kernels. You need to install https://github.com/casper-hansen/AutoAWQ_kernels
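For reference, one way to make sure the install targets the interpreter you actually run, rather than whichever `pip` is first on PATH (the git URL is the repo linked above; building it requires a matching CUDA toolkit):

```python
import subprocess
import sys


def pip_install_cmd(spec: str) -> list[str]:
    """Build a pip command bound to the current interpreter, so the
    package lands in the same environment the quantization script uses."""
    return [sys.executable, "-m", "pip", "install", spec]


# e.g.:
# subprocess.check_call(
#     pip_install_cmd("git+https://github.com/casper-hansen/AutoAWQ_kernels")
# )
```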
-
I have changed CUDA to 12.1 and installed autoawq-kernels from the repo.
Am I supposed to run the setup install script? If so, some of the files in the script do not have their paths adjusted, and I am now stuck on one file that is missing entirely. Can you please help me understand the next steps?
-
I am running into an issue while quantizing Llama 3 8B Instruct, with both the safetensors and GGUF files.
This is what I am getting:
It loads indefinitely, with no estimate, nothing.
The code I have been using:
For further reference, I am using a Windows 10 machine with an NVIDIA RTX 4080 and 64 GB of RAM.
Please help me find out what is wrong; otherwise I will have to look for other quantization methods :/
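For what it's worth, this is the shape of the quantization script from AutoAWQ's examples, sketched here with lazy imports so it can be read without the package installed; the model and output paths are placeholders:

```python
def build_quant_config(w_bit: int = 4, q_group_size: int = 128) -> dict:
    """The quant_config dict used in AutoAWQ's examples; the "GEMM"
    version is the one that relies on the autoawq-kernels package."""
    return {
        "zero_point": True,
        "q_group_size": q_group_size,
        "w_bit": w_bit,
        "version": "GEMM",
    }


def quantize(model_path: str, quant_path: str) -> None:
    # Imported lazily so this sketch can be inspected without awq installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model.quantize(tokenizer, quant_config=build_quant_config())
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)


if __name__ == "__main__":
    # Placeholder paths; AWQ quantizes the original (e.g. safetensors)
    # checkpoint, not a GGUF file.
    quantize("meta-llama/Meta-Llama-3-8B-Instruct", "llama-3-8b-instruct-awq")
```

Note that AWQ works from the original Hugging Face checkpoint; GGUF is llama.cpp's format and is not an input AutoAWQ consumes.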