Repository for building ML images at CoreWeave
See the list of all published images.
Special PyTorch Images:
CoreWeave provides custom builds of PyTorch, torchvision, and torchaudio tuned for our platform in a single container image, ml-containers/torch.
Versions compiled against CUDA 11.8.0, 12.0.1, 12.1.1, and 12.2.2 are available in this repository, with two variants:
- base: Tagged as ml-containers/torch:a1b2c3d-base-...
  - Built from nvidia/cuda:...-base-ubuntu22.04 as a base.
  - Only includes essentials (CUDA, torch, torchvision, torchaudio), so it has a small image size, making it fast to launch.
- nccl: Tagged as ml-containers/torch:a1b2c3d-nccl-...
  - Built from ghcr.io/coreweave/nccl-tests as a base.
  - Ultimately inherits from nvidia/cuda:...-cudnn8-devel-ubuntu22.04.
  - Larger, but includes development libraries and build tools such as nvcc necessary for compiling other PyTorch extensions.
  - These PyTorch builds are built on component libraries optimized for the CoreWeave cloud; see coreweave/nccl-tests.
Note
Most torch images have both a variant built on Ubuntu 22.04 and a variant built on Ubuntu 20.04.
- CUDA 11.8.0 is an exception, and is only available on Ubuntu 20.04.
- Ubuntu 22.04 images use Python 3.10.
- Ubuntu 20.04 images use Python 3.8.
- The base distribution is indicated in the container image tag.
ml-containers/torch-extras extends the ml-containers/torch images with a set of common PyTorch extensions. Each one is compiled specifically against the custom PyTorch builds in ml-containers/torch.
Both base and nccl editions are available for ml-containers/torch-extras, matching those for ml-containers/torch.
The base edition retains a small size, as a multi-stage build is used to avoid including CUDA development libraries in it, despite those libraries being required to build the extensions themselves.
ml-containers/nightly-torch is an experimental, nightly release channel of the PyTorch Base Images in the style of PyTorch's own nightly preview builds, featuring the latest development versions of torch, torchvision, and torchaudio pulled daily from GitHub and compiled from source.
ml-containers/nightly-torch-extras is a version of PyTorch Extras built on top of the ml-containers/nightly-torch container images. These are not nightly versions of the extensions themselves, but rather match the extension versions in the regular PyTorch Extras containers.
⚠ The PyTorch Nightly containers are based on unstable, experimental preview builds of PyTorch, and should be expected to contain bugs and other issues. For more stable containers, use the PyTorch Base Images and PyTorch Extras containers.
This repository contains multiple container image Dockerfiles. Each is expected to be in its own folder, along with any other files needed for the build.
The current CI builds run when changes to files in the respective folders are detected, so that only the changed container images are built. There is one action per image, each using a reusable base action, build.yml. The reusable action accepts several inputs:
- folder - the folder containing the Dockerfile for the image
- image-name - the name to use for the image
- build-args - arguments to pass to the docker build
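As a rough illustration of the inputs listed above, a per-image workflow calling the reusable action might look like the sketch below. The folder and image name (example-image), the build argument (EXAMPLE_ARG), the one-per-line build-args format, and the .github/workflows/build.yml path are assumptions made for this example rather than values taken from this repository.

```yaml
# Hypothetical per-image workflow; names and values are illustrative only.
name: example-image

on:
  push:
    # Run only when files in this image's folder change.
    paths:
      - "example-image/**"

jobs:
  build:
    # Call the reusable workflow (path assumed for this sketch).
    uses: ./.github/workflows/build.yml
    with:
      folder: example-image
      image-name: example-image
      # Assumed format: one KEY=value pair per line.
      build-args: |
        EXAMPLE_ARG=value
```

The paths filter is what keeps unrelated images from being rebuilt when only one folder changes.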
Images built from the same source can share one action, since the main reason for having multiple actions is to build only the images that changed. A build matrix can be helpful in these cases: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs.
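A minimal sketch of that matrix approach follows, assuming a single Dockerfile that takes the CUDA version as a build argument. The CUDA_VERSION argument name, the example-image folder and image name, and the build.yml path are hypothetical and may not match the actual workflows in this repository.

```yaml
# Hypothetical matrix workflow: one source folder, several CUDA versions.
name: example-image-matrix

on:
  push:
    paths:
      - "example-image/**"

jobs:
  build:
    strategy:
      matrix:
        # Each entry produces one call to the reusable workflow.
        cuda: ["12.1.1", "12.2.2"]
    uses: ./.github/workflows/build.yml
    with:
      folder: example-image
      image-name: example-image
      # CUDA_VERSION is an assumed build argument name.
      build-args: |
        CUDA_VERSION=${{ matrix.cuda }}
```

A single workflow then covers every variant built from the same folder, instead of one near-identical workflow per variant.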