I ported your llama.cpp changes into the most recent llama.cpp.
Then I had to modify LLaMPPL/llamppl/llama_cpp.py to use the new code from llama_cpp_python; you can see the new file here.
The easier change on your end is probably to merge main into your llama_cpp branch and then edit llama_cpp.py, but these are here if needed.
Edit: hmm, I'm hitting this issue with my changes when I offload to the GPU; hold on, let me look into it:
GGML_ASSERT: C:\...\llama-cpp-python\vendor\llama.cpp\ggml.c:15154: tensor->src0->backend == GGML_BACKEND_CPU
Edit 2: never mind, those are just because eval_multi doesn't have GPU support yet.