I ported your llama.cpp changes into the most recent llama.cpp.
Then I had to modify LLaMPPL/llamppl/llama_cpp.py to use the new code from llama_cpp_python; you can see the new file here.
The easier change on your end is probably to merge main into your llama_cpp branch and then edit llama_cpp.py, but these are here if needed.
Edit: hmm, I'm hitting this issue with my changes when I offload to the GPU; hold on, let me look into it:
GGML_ASSERT: C:\...\llama-cpp-python\vendor\llama.cpp\ggml.c:15154: tensor->src0->backend == GGML_BACKEND_CPU
Edit 2: never mind, those are just because eval_multi doesn't have GPU support yet.