Description
Please add instructions for building with CUBLAS and CLBLAST on Windows. I have spent about half a day on this without any success. My current attempt for CUBLAS is the following bat file:
SET CUDAFLAGS="-arch=all -lcublas" && SET LLAMA_CUBLAS=1 && SET CMAKE_ARGS="-DLLAMA_CUBLAS=on" && SET FORCE_CMAKE=1 && pip install llama-cpp-python[server] --force-reinstall --upgrade --no-cache-dir
pause
pip uninstall pydantic
pip install "pydantic==1.*"
And for CLBLAST:
SET LLAMA_CLBLAST=1 && SET CMAKE_ARGS="-DLLAMA_CLBLAST=on" && SET FORCE_CMAKE=1 && pip install llama-cpp-python[server] --force-reinstall --upgrade --no-cache-dir
pause
pip uninstall pydantic
pip install "pydantic==1.*"
For some reason it does not work with pydantic v2.*, so I had to downgrade it.
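One thing that might be worth ruling out in the bat files above: in cmd.exe, `SET NAME=value && ...` keeps the space before `&&` as part of the value, and quotes written after `=` are stored literally, so CMake may not receive the flags as intended. The following is a minimal sketch that avoids batch quoting by setting the same variables from Python. The helper names `build_install` and `run_install` and the `BACKEND_FLAGS` table are hypothetical; the flag values and pip options are copied from the bat files above. It is not confirmed that quoting is the actual cause of the failure.

```python
import os
import subprocess
import sys

# CMake flags copied from the bat files above; the table itself is illustrative.
BACKEND_FLAGS = {
    "cublas": "-DLLAMA_CUBLAS=on",
    "clblast": "-DLLAMA_CLBLAST=on",
}


def build_install(backend):
    """Return (env, argv) for a forced source build of llama-cpp-python."""
    env = dict(os.environ)
    # Plain strings: no literal quotes and no trailing space in the values.
    env["CMAKE_ARGS"] = BACKEND_FLAGS[backend]
    env["FORCE_CMAKE"] = "1"
    argv = [
        sys.executable, "-m", "pip", "install", "llama-cpp-python[server]",
        "--force-reinstall", "--upgrade", "--no-cache-dir",
    ]
    return env, argv


def run_install(backend):
    """Run pip with the environment prepared by build_install()."""
    env, argv = build_install(backend)
    subprocess.run(argv, env=env, check=True)
```

Calling `run_install("cublas")` or `run_install("clblast")` would start the build; the pydantic downgrade would still have to be done afterwards as before.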
Neither of them seems to work. When I run
python -m llama_cpp.server --model c:\ai\llama\Wizard-Vicuna-13B-Uncensored.ggmlv3.q5_K_M.bin --n_gpu_layers 100 --use_mmap 0
all layers are still loaded into RAM instead of being offloaded to the GPU.
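A rough way to tell whether the installed wheel was built with a BLAS backend at all, as opposed to the build being fine and the offload failing later, is to look at the system-info string. This sketch assumes that string contains a `BLAS = 0` or `BLAS = 1` field and that `llama_cpp.llama_print_system_info()` is available and returns bytes; both are assumptions about the installed version. The helper names are hypothetical.

```python
import re


def blas_enabled(system_info):
    """Return True if a llama.cpp system-info string reports BLAS = 1."""
    match = re.search(r"\bBLAS\s*=\s*(\d)", system_info)
    return bool(match) and match.group(1) == "1"


def installed_build_has_blas():
    """Query the installed llama_cpp package (assumed API, see note above)."""
    import llama_cpp  # imported lazily so blas_enabled() works without it

    raw = llama_cpp.llama_print_system_info()
    return blas_enabled(raw.decode("utf-8", errors="replace"))
```

If `installed_build_has_blas()` returns `False`, the pip build most likely fell back to a CPU-only configuration, which would point to the install step and not to the `--n_gpu_layers` option.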