Build linux CUDA releases suitable for Colab & other platforms on 12.2 #11226

Draft
wants to merge 29 commits into base: master

29 commits
90a478b
Package linux cuda releases for various caps
Jan 8, 2025
4232406
Merge remote-tracking branch 'origin/master' into cuda-releases
Jan 14, 2025
41b4b11
Fix container tags + rename
Jan 14, 2025
66ec17e
Update build.yml
Jan 14, 2025
e3cf4dc
Merge remote-tracking branch 'origin/master' into cuda-releases
Jan 14, 2025
60151da
Use ccache in Docker CUDA build
Jan 19, 2025
a4aed1d
Update cuda.Dockerfile
Jan 19, 2025
67075cc
Merge branch 'ggerganov:master' into cuda-releases
ochafik Jan 20, 2025
5eb87e9
cuda builds: add libcurl
Jan 20, 2025
2984d3c
Align artefact names on existing ones
Jan 20, 2025
abd27fc
Update build.yml
ochafik Jan 20, 2025
b71c43c
Update build.yml
ochafik Jan 20, 2025
ac045e3
Update build.yml
ochafik Jan 20, 2025
22ed602
Temporarily upload artefacts in normal CI run to test artefacts
ochafik Jan 20, 2025
4165293
Attempt to fix weird git error by installing deps before clone
ochafik Jan 20, 2025
c92ae47
ci: attempt to fix safe directory issue
Jan 21, 2025
b7b264c
ci: setup ccache
Jan 21, 2025
7a5b18e
ditch ccache action + require cuda in release
Jan 21, 2025
f60e148
shuffle actions back to original order
Jan 21, 2025
3d63db2
Merge remote-tracking branch 'origin/master' into cuda-releases
Jan 21, 2025
614fd07
Merge remote-tracking branch 'origin/master' into cuda-releases
Jan 30, 2025
fa38b8e
Merge remote-tracking branch 'origin/master' into cuda-releases
Jan 31, 2025
1b8f9ca
minimize diff
Jan 31, 2025
89da8df
fix typo
Jan 31, 2025
ae175fe
ci + cuda: checkout w/ history when packaging needed
Jan 31, 2025
167c500
install zip in cuda container
Jan 31, 2025
178ad4e
Merge branch 'master' into cuda-releases
Feb 4, 2025
7341749
Merge remote-tracking branch 'origin/master' into cuda-releases
Feb 23, 2025
94f3218
Update build.yml
Feb 23, 2025

7 changes: 5 additions & 2 deletions .devops/cuda.Dockerfile
@@ -12,13 +12,16 @@ FROM ${BASE_CUDA_DEV_CONTAINER} AS build
 ARG CUDA_DOCKER_ARCH=default

 RUN apt-get update && \
-    apt-get install -y build-essential cmake python3 python3-pip git libcurl4-openssl-dev libgomp1
+    apt-get install -y build-essential cmake python3 python3-pip git libcurl4-openssl-dev libgomp1 ccache

 WORKDIR /app

 COPY . .

-RUN if [ "${CUDA_DOCKER_ARCH}" != "default" ]; then \
+RUN --mount=type=cache,target=/root/.ccache \
+    --mount=type=cache,target=/var/lib/apt/lists \
+    --mount=type=cache,target=/var/cache/apt \
+    if [ "${CUDA_DOCKER_ARCH}" != "default" ]; then \
     export CMAKE_ARGS="-DCMAKE_CUDA_ARCHITECTURES=${CUDA_DOCKER_ARCH}"; \
     fi && \
     cmake -B build -DGGML_NATIVE=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
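
The conditional in this hunk can be read in isolation as the following minimal sketch. It assumes nothing beyond the `CUDA_DOCKER_ARCH` build argument shown above; `cuda_cmake_args` is a hypothetical helper name used only for illustration and is not part of the PR. Note that `RUN --mount=type=cache` is a BuildKit feature, so the image would need to be built with BuildKit enabled.

```shell
# Sketch of the Dockerfile's conditional: an explicit architecture list is
# passed to CMake only when CUDA_DOCKER_ARCH is something other than "default".
# `cuda_cmake_args` is a hypothetical helper, not something the PR defines.
cuda_cmake_args() {
  arch="${1:-default}"
  if [ "${arch}" != "default" ]; then
    printf '%s\n' "-DCMAKE_CUDA_ARCHITECTURES=${arch}"
  fi
}

# With "default" the result is empty, so CMake falls back to the project's
# own architecture selection; otherwise the flag is forwarded unchanged.
echo "default -> '$(cuda_cmake_args default)'"
echo "75-real -> '$(cuda_cmake_args 75-real)'"
```

Keeping the flag empty in the default case means the same `cmake -B build ...` line serves both the generic image and a capability-specific one.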
90 changes: 82 additions & 8 deletions .github/workflows/build.yml
@@ -943,21 +943,65 @@ jobs:

   ubuntu-latest-cmake-cuda:
     runs-on: ubuntu-latest
-    container: nvidia/cuda:12.6.2-devel-ubuntu24.04

+    strategy:
+      matrix:
+        cuda:
+          # Colab and lightning.ai currently use CUDA 12.2 (test w/ `nvidia-smi | grep "CUDA Version: "`)
+          # Capabilities of GPUs are listed on https://developer.nvidia.com/cuda-gpus, can test w/ `nvidia-smi --query-gpu=compute_cap --format=csv`
+          # See available containers at https://hub.docker.com/r/nvidia/cuda/tags
+          - version: 12.2
+            container: nvidia/cuda:12.2.2-devel-ubuntu22.04
+            cap: 7.5
+            arch: 75-real
+            example: 'T4'
+            package: true
+          - version: 12.2
+            container: nvidia/cuda:12.2.2-devel-ubuntu22.04
+            cap: 8.0
+            arch: 80-real
+            example: 'A100'
+            package: true
+          - version: 12.2
+            container: nvidia/cuda:12.2.2-devel-ubuntu22.04
+            cap: 8.6
+            arch: 86-real
+            example: 'A10'
+            package: true
+          - version: 12.2
+            container: nvidia/cuda:12.2.2-devel-ubuntu22.04
+            cap: 8.9
+            arch: 89-real
+            example: 'L4, L40S'
+          - version: 12.2
+            container: nvidia/cuda:12.2.2-devel-ubuntu22.04
+            cap: 9.0
+            arch: 90-real
+            example: 'H100'
+            package: true
+          # Build only, don't package.
+          - version: 12.6
+            container: nvidia/cuda:12.6.2-devel-ubuntu22.04
+            cap: 8.9
+            arch: 89-real
+            package: false
+
+    container: ${{ matrix.cuda.container }}
+
+    name: ubuntu-22-cuda (${{ matrix.cuda.version }} cap ${{ matrix.cuda.cap }}, e.g. ${{ matrix.cuda.example }})

     steps:
-      - name: Clone
-        id: checkout
-        uses: actions/checkout@v4
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0

-      - name: Install dependencies
+      - name: Dependencies
         id: depends
         env:
           DEBIAN_FRONTEND: noninteractive
         run: |
-          apt update
-          apt install -y cmake build-essential ninja-build libgomp1 git
+          apt-get update
+          apt install -y cmake build-essential ninja-build libcurl4-openssl-dev libgomp1 git zip

       - name: ccache
         uses: hendrikmuhs/[email protected]
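
Each matrix entry pairs a compute capability (`cap`) with a CMake architecture string (`arch`). The relationship between the two can be sketched as below; `cap_to_arch` is a hypothetical helper for illustration only, and the `-real` suffix is CMake's notation for generating device code for that architecture without also embedding PTX.

```shell
# Convert a compute capability such as "8.6" (the form printed by
# `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`) into the
# architecture string used in the matrix, e.g. "86-real".
# `cap_to_arch` is a hypothetical helper, not something the workflow defines.
cap_to_arch() {
  printf '%s-real\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Walk the capabilities listed in the matrix and show the matching arch value.
for cap in 7.5 8.0 8.6 8.9 9.0; do
  echo "cap ${cap} -> arch $(cap_to_arch "${cap}")"
done
```

On a machine with an NVIDIA driver installed, the output of the `nvidia-smi` query quoted in the matrix comments could be fed to the helper to pick the matching release archive.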
@@ -969,13 +1013,42 @@ jobs:
         run: |
           cmake -S . -B build -G Ninja \
             -DCMAKE_BUILD_TYPE=Release \
-            -DCMAKE_CUDA_ARCHITECTURES=89-real \
+            -DCMAKE_CUDA_ARCHITECTURES=${{ matrix.cuda.arch }} \
             -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined \
             -DLLAMA_FATAL_WARNINGS=ON \
             -DGGML_NATIVE=OFF \
             -DGGML_CUDA=ON
           cmake --build build

+      - name: Determine tag name
+        if: ${{ matrix.cuda.package }}
+        id: tag
+        shell: bash
+        run: |
+          BUILD_NUMBER="$(git rev-list --count HEAD)"
+          SHORT_HASH="$(git rev-parse --short=7 HEAD)"
+          if [[ "${{ env.BRANCH_NAME }}" == "master" ]]; then
+            echo "name=b${BUILD_NUMBER}" >> $GITHUB_OUTPUT
+          else
+            SAFE_NAME=$(echo "${{ env.BRANCH_NAME }}" | tr '/' '-')
+            echo "name=${SAFE_NAME}-b${BUILD_NUMBER}-${SHORT_HASH}" >> $GITHUB_OUTPUT
+          fi
+          echo "cuda_name=cu${{ matrix.cuda.short_version }}-cap${{ matrix.cuda.cap }}" >> $GITHUB_OUTPUT
+
+      - name: Pack artifacts
+        id: pack_artifacts
+        # if: ${{ matrix.cuda.package && (( github.event_name == 'push' && github.ref == 'refs/heads/master' ) || github.event.inputs.create_release == 'true') }}
+        run: |
+          cp LICENSE ./build/bin/
+          zip -r llama-${{ steps.tag.outputs.name }}-bin-ubuntu-cuda-${{ steps.tag.outputs.cuda_name }}-x64.zip ./build/bin/*
+
+      - name: Upload artifacts
+        # if: ${{ matrix.cuda.package && (( github.event_name == 'push' && github.ref == 'refs/heads/master' ) || github.event.inputs.create_release == 'true') }}
+        uses: actions/upload-artifact@v4
+        with:
+          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-cuda-${{ steps.tag.outputs.cuda_name }}-x64.zip
+          name: llama-bin-ubuntu-cuda-${{ steps.tag.outputs.cuda_name }}-x64.zip
+
   windows-2019-cmake-cuda:
     runs-on: windows-2019
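
The naming logic in the "Determine tag name" and "Pack artifacts" steps can be tried outside CI with a sketch like the one below. The git-derived values are passed in as arguments instead of being read from the repository, and `release_tag` and `artifact_name` are hypothetical helper names. The matrix shown defines `version` rather than `short_version`, so the CUDA label is passed explicitly here as an assumed value such as `12.2`; the build number and hash in the example calls are placeholders.

```shell
# Mirror of the tag logic: "b<count>" on master, otherwise
# "<branch with slashes replaced>-b<count>-<short hash>".
# Both helper names are hypothetical and used only for illustration.
release_tag() {
  branch="$1"; build_number="$2"; short_hash="$3"
  if [ "${branch}" = "master" ]; then
    echo "b${build_number}"
  else
    safe_name=$(echo "${branch}" | tr '/' '-')
    echo "${safe_name}-b${build_number}-${short_hash}"
  fi
}

# $1 = tag, $2 = CUDA version label (assumed form, e.g. "12.2"),
# $3 = compute capability as listed in the matrix (e.g. "7.5").
artifact_name() {
  echo "llama-$1-bin-ubuntu-cuda-cu$2-cap$3-x64.zip"
}

# Placeholder build number and hash, chosen only to show the shape of the names.
artifact_name "$(release_tag master 1234 abc1234)" 12.2 7.5
artifact_name "$(release_tag cuda-releases 1234 abc1234)" 12.2 7.5
```

Replacing `/` with `-` keeps branch names such as `feature/foo` usable as a file name component.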

@@ -1383,6 +1456,7 @@ jobs:

     needs:
       - ubuntu-cpu-cmake
+      - ubuntu-latest-cmake-cuda
       - ubuntu-22-cmake-vulkan
       - windows-latest-cmake
       - windows-2019-cmake-cuda
1 change: 1 addition & 0 deletions ggml/src/kompute
Submodule kompute added at 456519