Pull requests: ggml-org/llama.cpp
scripts : support arbitrary input file formats in compare-llama-bench.py
#13455, opened May 11, 2025 by CISC. Labels: python (python script changes), script (Script related).
CUDA: faster Deepseek FA, add Turing support
#13435, opened May 10, 2025 by JohannesGaessler. Labels: ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (Issues specific to Nvidia GPUs).
Break down main function in llama-server
#13425, opened May 10, 2025 by ericcurtin. Labels: examples, server.
llama-bench : accept ranges for integer parameters
#13410, opened May 9, 2025 by slaren. Labels: examples.
sycl: enable dpcpp nightly builds with oneMKL and oneDNN
#13406, opened May 9, 2025 by AD2605. Labels: ggml, SYCL (https://en.wikipedia.org/wiki/SYCL, GPU programming language).
Update README.md for using llama.cpp in Microsoft Word locally
#13401, opened May 9, 2025 by GPTLocalhost.
grammar: handle misplaced special regex chars [*+?]
#13391, opened May 8, 2025 by rick-github.
sycl: simplify bin_bcast_kernel
#13383, opened May 8, 2025 by AD2605. Labels: ggml, SYCL.
musa: restore MUSA graph settings in CMakeLists.txt
#13382, opened May 8, 2025 by yeahdongcn (draft). Labels: ggml, Nvidia GPU.
gguf-py: Optimize GGUFReader read-only mode performance
#13378, opened May 8, 2025 by Isotr0py. Labels: python.
CUDA: update build CTK version to 12.8
#13360, opened May 7, 2025 by thevishalagarwal. Labels: devops (improvements to build systems and github actions), ggml, Nvidia GPU.
python : bump transformers version
#13351, opened May 7, 2025 by ngxson. Labels: python.
Add mistral-chat-7b preset for llama-server
#13348, opened May 7, 2025 by vahedshaik. Labels: examples.
llama: move page cache via mbind to prevent cross-NUMA access
#13335, opened May 6, 2025 by vishalc-ibm.
add AMD Genoa
#13334, opened May 6, 2025 by QuPengfei. Labels: ggml.
[Perf] [CPU] eliminate redundant memory access in group query attention
#13319, opened May 5, 2025 by ZelinMa557. Labels: ggml.
Added dynamic context size. This is perfect for servers running llama models as a service.
#13295, opened May 4, 2025 by J4e6eR.