09 Nov 02:54

30b773d

llama.cpp b6992 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6992
Commit: aa3b7a90b407c556778a7e13a4b0d28cf964fd1c

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series
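
The two tables above can be combined into a small lookup. The following is a minimal sketch, not part of the release: the function name builds_for_arch is hypothetical, and the data is copied from the "CUDA Versions" list for this build. Given a compute capability, it prints the CUDA builds whose architecture list includes it.

```shell
#!/bin/sh
# Hypothetical helper: print the CUDA builds (from the "CUDA Versions"
# list above) whose architecture list contains the given compute capability.
builds_for_arch() {
    arch="$1"
    result=""
    # Each data line is: <CUDA version> <architectures...>
    while read -r cuda archs; do
        # Pad with spaces so that only whole entries match (2.0 must not match 12.0).
        case " $archs " in
            *" $arch "*) result="${result:+$result }$cuda" ;;
        esac
    done <<'EOF'
12.4 6.1 7.0 7.5 8.0 8.6 8.9 9.0
12.6 6.1 7.0 7.5 8.0 8.6 8.9 9.0
12.8 6.1 7.0 7.5 8.0 8.6 8.9 9.0 10.0 12.0
12.9 6.1 7.0 7.5 8.0 8.6 8.9 9.0 10.0 12.0
13.0 7.5 8.0 8.6 8.9 9.0 10.0 12.0
EOF
    echo "$result"
}
```

For example, builds_for_arch 6.1 lists every build except CUDA 13.0, and builds_for_arch 12.0 lists only 12.8, 12.9 and 13.0. On recent NVIDIA drivers the compute capability can usually be read with nvidia-smi --query-gpu=compute_cap --format=csv,noheader; treat that as an assumption about your driver version rather than something these release notes state.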

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6992-cuda-12.8.tar.gz
./llama-cli --help
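
The steps above can be wrapped in a short script. This is a sketch under stated assumptions: the helper names tarball_name and extract_build are hypothetical, the file name pattern is inferred from the tar command above, and the tarball is assumed to have been downloaded into the current directory already.

```shell
#!/bin/sh
# Hypothetical helper: build the asset file name from a build tag and a
# CUDA version, following the pattern used in the tar command above.
tarball_name() {
    printf 'llama.cpp-%s-cuda-%s.tar.gz\n' "$1" "$2"
}

# Hypothetical helper: extract a downloaded tarball into its own directory,
# so that several builds can sit side by side without overwriting each other.
extract_build() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

A possible invocation is extract_build "$(tarball_name b6992 12.8)" ./llama-b6992. The directory layout inside the tarball is not described here, so find ./llama-b6992 -name llama-cli may be needed to locate the binary before running it with --help.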

08 Nov 02:48

github-actions

b6980

30b773d

llama.cpp b6980 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6980
Commit: 299f5d782c8ffd7195a1ed6a6d5561f759beb07e

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6980-cuda-12.8.tar.gz
./llama-cli --help

07 Nov 02:48

github-actions

b6970

30b773d

llama.cpp b6970 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6970
Commit: 7f09a680af6e0ef612de81018e1d19c19b8651e8

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6970-cuda-12.8.tar.gz
./llama-cli --help

06 Nov 02:48

github-actions

b6962

30b773d

llama.cpp b6962 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6962
Commit: 230d1169e5bfe04a013b2e20f4662ee56c2454b0

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6962-cuda-12.8.tar.gz
./llama-cli --help

05 Nov 02:52

github-actions

b6949

30b773d

llama.cpp b6949 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6949
Commit: a5c07dcd7b49916c7c770f2da9583e6b82717678

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6949-cuda-12.8.tar.gz
./llama-cli --help

04 Nov 02:51

github-actions

b6940

30b773d

llama.cpp b6940 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6940
Commit: c5023daf607c578d6344c628eb7da18ac3d92d32

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6940-cuda-12.8.tar.gz
./llama-cli --help

03 Nov 03:01

github-actions

b6929

30b773d

llama.cpp b6929 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6929
Commit: a2054e3a8ff0da3978a4acc18c349ff58554d336

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6929-cuda-12.8.tar.gz
./llama-cli --help

02 Nov 02:52

github-actions

b6920

30b773d

llama.cpp b6920 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6920
Commit: d38d9f0877a5872daa3c5f06fb9a86376bf15d50

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6920-cuda-12.8.tar.gz
./llama-cli --help

01 Nov 02:53

github-actions

b6907

30b773d

llama.cpp b6907 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6907
Commit: bea04522ff1a0d8559ccfd353aa018dcfbb608cc

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6907-cuda-12.8.tar.gz
./llama-cli --help

31 Oct 02:44

github-actions

b6891

30b773d

llama.cpp b6891 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b6891
Commit: 16724b5b6836a2d4b8936a5824d2ff27c52b4517

CUDA Versions

CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

6.1: Titan Xp, Tesla P40, GTX 10xx series
7.0: Tesla V100
7.5: Tesla T4, RTX 20xx series, Quadro RTX
8.0: A100
8.6: RTX 30xx series
8.9: RTX 40xx series, L4, L40
9.0: H100, H200
10.0: B200
12.0: RTX PRO series, RTX 50xx series

Usage

Download the appropriate tarball for your CUDA version and extract:

tar -xzf llama.cpp-b6891-cuda-12.8.tar.gz
./llama-cli --help

Releases: ai-dock/llama.cpp-cuda