Releases: ai-dock/llama.cpp-cuda

llama.cpp b7192 with CUDA

29 Nov 02:44

llama.cpp b7192 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7192
Commit: 03914c7ef826caf0b6371a6d1de270cda102b542

CUDA Versions

  • CUDA 12.4 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
  • CUDA 12.6 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0
  • CUDA 12.8 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
  • CUDA 12.9 - Architectures: 6.1, 7.0, 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0
  • CUDA 13.0 - Architectures: 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, 12.0

Architecture Reference

  • 6.1: TITAN Xp, Tesla P40, GTX 10xx series
  • 7.0: Tesla V100
  • 7.5: Tesla T4, RTX 20xx series, Quadro RTX
  • 8.0: A100
  • 8.6: RTX 30xx series
  • 8.9: RTX 40xx series, L4, L40
  • 9.0: H100, H200
  • 10.0: B200
  • 12.0: RTX 50xx series, RTX Pro (Blackwell) series
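Which bundle you need follows from your GPU's compute capability (on a live system, `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` reports it). As a rough sketch, the lists above can be encoded in a small shell helper; the capability value here is hard-coded for illustration:

```shell
# Map a CUDA compute capability to the bundles above that include it.
# On a live system: cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader)
cap="8.6"  # hard-coded example value for illustration
case "$cap" in
  6.1|7.0)             bundle="CUDA 12.4, 12.6, 12.8, or 12.9" ;;
  7.5|8.0|8.6|8.9|9.0) bundle="any listed bundle (CUDA 12.4 through 13.0)" ;;
  10.0|12.0)           bundle="CUDA 12.8, 12.9, or 13.0" ;;
  *)                   bundle="no matching bundle" ;;
esac
echo "$bundle"
```

Note that 6.1 and 7.0 drop out at CUDA 13.0, while 10.0 and 12.0 only appear from CUDA 12.8 onward, so older and newest GPUs need different bundles.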

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7192-cuda-12.8.tar.gz
./llama-cli --help
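After extracting, you can sanity-check that the expected binaries landed in the working directory. The binary names below are assumed from upstream llama.cpp release layouts; adjust them to what your tarball actually contains:

```shell
# List which of the expected binaries are present and executable
# in the current directory after extraction.
found=0
for bin in llama-cli llama-server llama-bench; do
  if [ -x "./$bin" ]; then
    echo "ok: $bin"
    found=$((found + 1))
  else
    echo "missing: $bin"
  fi
done
echo "$found of 3 expected binaries present"
```

Once the binaries check out, a typical GPU run looks something like `./llama-cli -m model.gguf -ngl 99 -p "Hello"`, where `-ngl` sets how many model layers to offload to the GPU and `model.gguf` is a placeholder for your model file.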

llama.cpp b7180 with CUDA

28 Nov 02:45

llama.cpp b7180 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7180
Commit: efaaccdd69cd9db777584c2a062f70c0526a6fb5

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7180-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7170 with CUDA

27 Nov 02:46

llama.cpp b7170 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7170
Commit: e509411cf142807c947b53b340d2d5594ce38120

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7170-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7157 with CUDA

26 Nov 02:50

llama.cpp b7157 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7157
Commit: 583cb83416467e8abf9b37349dcf1f6a0083745a

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7157-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7150 with CUDA

25 Nov 02:49

llama.cpp b7150 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7150
Commit: 3d07caa99bff9213411202b4063aa2f44e919654

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7150-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7137 with CUDA

24 Nov 02:55

llama.cpp b7137 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7137
Commit: fcb013847c2c983967e9d8c9a13b16829fb799e6

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7137-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7130 with CUDA

23 Nov 03:00

llama.cpp b7130 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7130
Commit: 3f3a4fb9c3b907c68598363b204e6f58f4757c8c

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7130-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7129 with CUDA

22 Nov 02:47

llama.cpp b7129 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7129
Commit: 028f93ef9819d1a039f97d74d72380c986cd69aa

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7129-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7122 with CUDA

21 Nov 02:47

llama.cpp b7122 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7122
Commit: 21d31e0810d398f75ddd7d7c4cec9907a5576f26

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7122-cuda-12.8.tar.gz
./llama-cli --help

llama.cpp b7108 with CUDA

20 Nov 02:45

llama.cpp b7108 with CUDA Support

Pre-built binaries of llama.cpp with CUDA support for multiple CUDA versions.

Source: https://github.com/ggml-org/llama.cpp/releases/tag/b7108
Commit: 7d77f07325985c03a91fa371d0a68ef88a91ec7f

Usage

Download the appropriate tarball for your CUDA version and extract it:

tar -xzf llama.cpp-b7108-cuda-12.8.tar.gz
./llama-cli --help