Latest release: 20241219
`Dockerfile`s to build CPU and GPU (NVIDIA CUDA) container images with support for TensorFlow, PyTorch and OpenCV (or combinations thereof), based on the `nvidia/cuda` and Ubuntu 24.04 container images.
The tool's purpose is to enable developers, ML and CV enthusiasts to build and test solutions `FROM` a Docker container, allowing fast prototyping and release of code to the community.
The CTPO (CUDA + TensorFlow + PyTorch + OpenCV) project aims to address the following challenges and provide solutions:
- Containerized Development Environment: CTPO offers Docker containers with pre-configured environments containing CUDA, TensorFlow, PyTorch, and OpenCV. This allows developers to work within a consistent and isolated environment.
- Fast Prototyping and Testing: The project facilitates fast prototyping and testing by providing pre-built containers. Developers can quickly iterate on their code within the containerized environment.
- Versioned Frameworks and Dependencies: The project uses versioned Docker containers, making it easier for developers to work with specific versions of TensorFlow, PyTorch, OpenCV, and other components.
- Jupyter Lab Integration: CTPO includes Jupyter Lab builds, allowing developers to use a web-based interface for interactive development, visualization, and documentation.
Building each container independently is made possible by the `Dockerfile` available in the `BuildDetails/<release>/<container>-<tag>` directories.
Building each container takes resources and time (counted in CPU cores, GB of memory, and build hours).
Pre-built containers are available from Infotrend Inc.'s Docker account at https://hub.docker.com/r/infotrend/. Details on the available containers and builds are discussed in this document.
Jupyter Lab (WebUI-enabled) and Unraid versions are also available on our Docker Hub, and can be built from the `Makefile` as well.
Note: this tool was built earlier in 2023, iterations of its Jupyter Lab were made available to Infotrend's data scientists, and we are releasing it to help the developer community.
The base for those container images is pulled from Docker Hub's official `ubuntu:24.04` or `nvidia/cuda:[...]-devel-ubuntu24.04` images.
More details on the NVIDIA base images are available at https://hub.docker.com/r/nvidia/cuda/. In particular, please note that "By downloading these images, you agree to the terms of the license agreements for NVIDIA software included in the images"; further details on those license agreements are available at https://docs.nvidia.com/cuda/eula/index.html#attachment-a.
For GPU-optimized versions, we will need to build the `cuda_` versions on a host with the supported hardware.
When using a GPU and building the container, we need to install the NVIDIA Container Toolkit, found at https://github.com/NVIDIA/nvidia-container-toolkit.
Note that the NVIDIA video driver on the Linux host needs to support the version of CUDA that we are trying to build (we can see the supported CUDA version and driver version information when running the `nvidia-smi` command).
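For example, the following commands report the installed driver version and the highest CUDA version that driver supports (illustrative; the exact values depend on the host):

```
% nvidia-smi --query-gpu=driver_version --format=csv,noheader
% nvidia-smi | grep "CUDA Version"
```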
For CPU builds, simply build the non-`cuda_` versions.
Pre-built images are available for download on Infotrend's DockerHub (at https://hub.docker.com/r/infotrend/).
Those are built using the same method provided by the `Makefile`, and the corresponding `Dockerfile` used for those builds is stored in the matching `BuildDetails/<release>/<container>-<tag>` directory.
The tag naming convention uses `_` to separate components after the `infotrend/ctpo-` base name, followed by the "release" tag (Docker container image names are always lowercase).
`-` is used as a feature separator, in particular for `jupyter` or `unraid` specific builds.
Any `cuda_` build is a GPU build, while all non-`cuda_` ones are CPU only.
For example, `infotrend/ctpo-tensorflow_pytorch_opencv:2.12.0_2.0.1_4.7.0-20231120` follows `"base name"-"component1"_"component2"_"component3":"component1_version"_"component2_version"_"component3_version"-"release"` with:
- base name = `infotrend/ctpo-`
- component1 + component1_version = `tensorflow` 2.12.0
- component2 + component2_version = `pytorch` 2.0.1
- component3 + component3_version = `opencv` 4.7.0
- release = `20231120`

As such, this was "Infotrend's CTPO release 20231120 with TensorFlow 2.12.0, PyTorch 2.0.1, and OpenCV 4.7.0 without any CUDA support." (Since no `cuda_` was part of the name, this is a CPU-only build.)
Similarly, `infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv-unraid:11.8.0_2.12.0_2.0.1_4.7.0-20231120` can be read as:
- base name = `infotrend/ctpo-`
- feature1 = `jupyter`
- component1 + component1_version = `cuda` 11.8.0
- component2 + component2_version = `tensorflow` 2.12.0
- component3 + component3_version = `pytorch` 2.0.1
- component4 + component4_version = `opencv` 4.7.0
- feature2 = `unraid`
- release = `20231120`

That is, "Infotrend's CTPO release 20231120 with Jupyter Lab and Unraid specific components, TensorFlow 2.12.0, PyTorch 2.0.1, OpenCV 4.7.0 and GPU (CUDA) support."
There will be a variable number of components or features in the full container name as shown above. It is left to the end user to follow the naming convention.
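As an illustration (assuming the matching image for the 20241219 release has been published to Docker Hub), the CPU TensorFlow + PyTorch + OpenCV image for that release would be pulled with:

```
% docker pull infotrend/ctpo-tensorflow_pytorch_opencv:2.18.0_2.5.1_4.10.0-20241219
```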
There are two methods to build local containers:
- when building multiple versions of the container, or to use a dedicated Docker `buildx` builder, we recommend using the `Makefile`
- when building a specific version of the container, use the matching `BuildDetails/<RELEASE>/<COMPONENTS>-<COMPONENTS_VERSIONS>-<RELEASE>/Dockerfile`
When using the `Makefile` method, we use the `make` command to generate the content of the `BuildDetails` directory, which will contain all the files required for building the container. For the `20241219` release of `pytorch_opencv`, a `BuildDetails/20241219/pytorch_opencv-2.5.1_4.10.0-20241219` directory (following the naming convention) is created and contains the build artifacts, including a `Dockerfile`.
When using the `BuildDetails/<RELEASE>/<COMPONENTS>-<COMPONENTS_VERSIONS>-<RELEASE>/Dockerfile` method, we can build the container using the files within that directory.
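For example, a minimal sketch of the `Makefile` method (assuming `pytorch_opencv` appears in the list of targets reported by `make`) would be:

```
% make pytorch_opencv
```

which generates the matching `BuildDetails` sub-directory and builds the corresponding container image.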
Building the containers requires the docker build step to have internet access. The process is CPU and Memory intensive and will require storage space for the build and the final image.
For reference, on a system with an AMD 5950X (16 cores, i.e. `NUMPROC=32`), 128GB of memory and NVMe-based storage, the build time of `make build_tpo build_ctpo build_ctpo_tensorrt` (for the `20241219` release), i.e. making use of layer caching when possible, was about 60 minutes for the `tensorflow_opencv` or `pytorch_opencv` containers (about 7GB each), about 80 minutes for `tensorflow_pytorch_opencv` (about 8GB), close to 130 minutes (and a 17GB container) for `cuda_tensorflow_opencv`, over 200 minutes (also 17GB) for `cuda_pytorch_opencv`, and over 220 minutes (for a 19GB container) for `cuda_tensorflow_pytorch_opencv`. The TensorRT variant took 240 minutes to build and its image size is close to 30GB.
When generating all images, there is about 100GB of reclaimable space generated by Docker `buildx`.
This method is a per-release method, as it will generate a build-specific `Dockerfile`.
Type `make` to get the list of targets and some details of the possible builds.
It is possible to adapt the content of the `Makefile` to build custom solutions. For example, the default is not to build OpenCV non-free or FFmpeg with libnpp, as those would make the images unredistributable; any release on our Docker Hub is made with "redistributable" packages only.
Below is the result of this command for the `20241219` release:

```
**** Docker Image tag ending: 20241219
**** Docker Runtime: GPU
To switch between GPU/CPU: add/remove "default-runtime": "nvidia" in /etc/docker/daemon.json then run: sudo systemctl restart docker
*** Available Docker images to be built (make targets):
build_tpo (requires CPU Docker runtime):
tensorflow_opencv OR pytorch_opencv OR tensorflow_pytorch_opencv (aka TPO, for CPU):
tensorflow_opencv-2.18.0_4.10.0
pytorch_opencv-2.5.1_4.10.0
tensorflow_pytorch_opencv-2.18.0_2.5.1_4.10.0
build_ctpo (requires GPU Docker runtime):
cuda_tensorflow_opencv OR cuda_pytorch_opencv OR cuda_tensorflow_pytorch_opencv (aka CTPO, for NVIDIA GPU):
cuda_tensorflow_opencv-12.5.1_2.18.0_4.10.0
cuda_pytorch_opencv-12.5.1_2.5.1_4.10.0
cuda_tensorflow_pytorch_opencv-12.5.1_2.18.0_2.5.1_4.10.0
build_ctpo_tensorrt (requires GPU Docker runtime):
cuda_tensorflow_pytorch_opencv (aka CTPO, for NVIDIA GPU with TensorRT): same as cuda_tensorflow_pytorch_opencv but installing TensorRT libraries
Note:TensorRT is not supported by TensorFlow since 2.18.0
*** Jupyter Labs ready containers (requires the base TPO & CTPO container to either be built locally or docker will attempt to pull otherwise)
jupyter_tpo:
jupyter-tensorflow_pytorch_opencv-2.18.0_2.5.1_4.10.0
jupyter_ctpo:
jupyter-cuda_tensorflow_pytorch_opencv-12.5.1_2.18.0_2.5.1_4.10.0
```
In this output are multiple sections:
- The `Docker Image tag ending` matches the software release tag.
- The `Docker Runtime` line shows the current default runtime. For GPU (CTPO) builds, it is recommended to add `"default-runtime": "nvidia"` to the `/etc/docker/daemon.json` file and restart the docker daemon (see the example after this list). Similarly, for CPU (TPO) builds, it is recommended that the `"default-runtime"` entry be removed (or commented out); but because switching the runtime on a system is not always achievable, we can instead use `NVIDIA_VISIBLE_DEVICES=void` (details). We can check the current status of the docker runtime by running `docker info | grep "Default Runtime"`.
- The `Available Docker images to be built` section lists the possible build targets: for GPU, the `cuda_` variants; for CPU, the non-`cuda_` variants. Naming conventions and tags follow the guidelines specified in the "Tag naming conventions" section.
- The `Jupyter Labs ready containers` are based on the containers built in the "Available Docker images [...]" step, adding a running Jupyter Lab following the specific `Dockerfile` in the `Jupyter_build` directory. The list of built containers is limited to the most complete combination of components per CPU and GPU to simplify distribution.
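As an illustration (a sketch; the `runtimes` entry depends on the NVIDIA Container Toolkit installation, which can generate it with `nvidia-ctk runtime configure --runtime=docker`), an `/etc/docker/daemon.json` with the NVIDIA runtime set as default might look like:

```
% cat /etc/docker/daemon.json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
% sudo systemctl restart docker
```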
Note: local builds will not have the `infotrend/ctpo-` prefix added to their base name, as that prefix is only used for releases to Docker Hub by the maintainers.
Each time we request a specific `make` target, a dedicated `Dockerfile` is built in the `BuildDetails/<release>/<target>` directory.
That `Dockerfile` contains `ARG` and `ENV` values that match the specific build parameters.
For example, in release `20241219`, when building the `tensorflow_opencv` target, the `BuildDetails/20241219/tensorflow_opencv-2.18.0_4.10.0-20241219/Dockerfile` is created and used to build the `tensorflow_opencv:2.18.0_4.10.0-20241219` container image.
In that file, we will see content such as:
```dockerfile
ARG CTPO_FROM=ubuntu:24.04
FROM ${CTPO_FROM}
[...]
## Download & Building TensorFlow from source in same RUN
ENV LATEST_BAZELISK=1.22.1
ENV CTPO_TENSORFLOW_VERSION=2.18.0
ENV CTPO_TF_CONFIG=""
ENV TF_CUDA_COMPUTE_CAPABILITIES=""
[...]
# No Magma (PyTorch GPU only)
# No PyTorch, Torch Audio or Torch Video
[...]
```

This content is specific to the CPU build of TensorFlow and OpenCV (without PyTorch).
This `Dockerfile` can be used directly by developers to integrate their modifications and build a specific feature.
When the maintainers upload this image to Docker Hub, the image name will be prefixed with `infotrend/ctpo-`.
When choosing to build a container image on specific hardware, please be patient, building any of those images might take a long time (counted in hours).
To build it this way, find the corresponding `Dockerfile` and, in the directory where the file is located, run `docker build -f Dockerfile --tag custombuild:local .`
For example, to build the `BuildDetails/20241219/tensorflow_opencv-2.18.0_4.10.0-20241219/Dockerfile` and tag it as `to:test`, run:
```
% cd BuildDetails/20241219/tensorflow_opencv-2.18.0_4.10.0-20241219
% docker build -f Dockerfile --tag to:test .
```
ℹ️ When using an existing `Dockerfile`, please update the `ARG CTPO_NUMPROC=` line with the value obtained by running the `nproc --all` command. The value in the `Dockerfile` reflects the build as it was performed for release to Docker Hub and might not match your build system.
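For example (a sketch, run from the directory containing the `Dockerfile` being built), the line can be updated in place with:

```
% sed -i "s/^ARG CTPO_NUMPROC=.*/ARG CTPO_NUMPROC=$(nproc --all)/" Dockerfile
```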
The `Makefile` contains most of the variables that define the versions of the different frameworks.
The file has many comments that allow developers to tailor the build.
For example, any release on our Docker Hub is made with "redistributable" packages; the `CTPO_ENABLE_NONFREE` variable in the `Makefile` controls that feature. The default is not to build OpenCV non-free or FFmpeg with libnpp, as those would make the images unredistributable; replace "free" by "unredistributable" if you need to use those for a personal build.
The `Dockerfile` used for a Docker Hub pushed build is shared in the `BuildDetails` directory (see the Dockerfile section above).
We publish releases to Infotrend Inc.'s Docker Hub account, where you can also find other releases from Infotrend.
The tag naming reflects the Tag naming conventions section above.
`latest` is used to point to the most recent release for a given container image.
The different base container images that can be found there are:
- CPU builds: `infotrend/ctpo-tensorflow_opencv`, `infotrend/ctpo-pytorch_opencv` and `infotrend/ctpo-tensorflow_pytorch_opencv`
- GPU builds: `infotrend/ctpo-cuda_tensorflow_opencv`, `infotrend/ctpo-cuda_pytorch_opencv` and `infotrend/ctpo-cuda_tensorflow_pytorch_opencv`
- Jupyter Lab CPU & GPU builds: `infotrend/ctpo-jupyter-tensorflow_pytorch_opencv` and `infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv`
- Unraid enabled Jupyter Lab, CPU & GPU builds: `infotrend/ctpo-jupyter-tensorflow_pytorch_opencv-unraid` and `infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv-unraid`
Note that we are not releasing TensorRT-based containers on Docker Hub. We provide a `Dockerfile` in the corresponding `BuildDetails` directory (ending in `-TensorRT`) for local builds; follow the "Building (using Dockerfile)" instructions to build the container.
We can check the TensorRT Python version present in the locally built container by running:
```
% docker run --gpus all --rm -it cuda_tensorflow_pytorch_opencv:12.5.1_2.18.0_2.5.1_4.10.0-20241219-TensorRT
# python3
>>> import tensorrt
>>> print(tensorrt.__version__)
10.7.0
```
The `README-BuildDetails.md` file is built automatically from the content of the `BuildDetails` directory and contains links to the different files stored in each sub-directory.
It reflects each build's detailed information, such as (where relevant) the Docker tag and the versions of CUDA, cuDNN, TensorFlow, PyTorch, OpenCV, FFmpeg and Ubuntu. Most content also links to sub-files that contain further insight into the installed system packages, enabled build parameters, etc.
Jupyter Lab containers are built `FROM` the `tensorflow_pytorch_opencv` or `cuda_tensorflow_pytorch_opencv` containers.
A "user" version (the current user's UID and GID are passed to the internal user) can be built using `make JN_MODE="-user" jupyter_tpo jupyter_ctpo`.
The specific details of such builds are available in the `Jupyter_build` directory, in the `Dockerfile` and `Dockerfile-user` files.
The default Jupyter Lab password (`iti`) is stored in the `Dockerfile` and can be modified by the builder by replacing the `--IdentityProvider.token='iti'` command line option.
When using the Jupyter-specific container, remember to expose the port used by the tool (here: 8888); as such, make sure to add `-p 8888:8888` to the `docker run` command line.
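For example (an illustrative sketch; use whichever Jupyter-enabled tag is available locally or on Docker Hub):

```
% docker run --rm -p 8888:8888 infotrend/ctpo-jupyter-tensorflow_pytorch_opencv:latest
```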
Pre-built containers are available, see the Available builds on DockerHub section above.
Those are specializations of the Jupyter Lab builds: container images with a `sudo`-capable `jupyter` user using Unraid's specific `uid` and `gid`, and the same default `iti` Jupyter Lab password.
The Unraid version can be built using `make JN_MODE="-unraid" jupyter_tpo jupyter_ctpo`.
The build `Dockerfile` is `Jupyter_build/Dockerfile-unraid`.
Pre-built containers are available, see the Available builds on DockerHub section above.
A minimum NVIDIA driver version is needed to run the CUDA builds. NVIDIA's "Table 1: CUDA Toolkit and Compatible Driver Versions" and "Table 2: CUDA Toolkit and Minimum Compatible Driver Versions", as well as the `nvidia-smi` command on the host, will help us determine if a specific version of CUDA will be supported.
Not all GPUs are supported in the Docker Hub builds. The containers are built for a "compute capability (version)" (as listed on the Wikipedia CUDA page) of 6.0 and above (i.e. Pascal and above).
If we need a different GPU compute capability, we can edit the `Makefile` and alter the various `DNN_ARCH_` variables matching the build we need, adding the required architecture. Then type `make` to see the entire list of containers that this release can build, and use the exact target we want in order to build it locally (on Ubuntu, we will need at least `docker` and `build-essential` installed to do this).
Building a container image takes a lot of CPU and can take multiple hours, so we recommend building only the target needed.
Build or obtain the container image required from DockerHub.
We understand the image names are verbose. This is to avoid confusion between the different builds.
It is possible to `docker tag` container images with shorter names for easier `docker run` usage.
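For example (illustrative names; adjust to the image and release actually pulled or built):

```
% docker tag infotrend/ctpo-tensorflow_pytorch_opencv:2.18.0_2.5.1_4.10.0-20241219 tpo:local
```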
The `WORKDIR` for the containers is set to `/iti`; as such, to map the current working directory within the container and test functions, we can mount it with `-v` as `/iti`.
When using a GPU image, make sure to add `--gpus all` to the `docker run` command line.
For example, to run the GPU Jupyter container and expose the WebUI on port 8765, one would:

```
% docker run --rm -v `pwd`:/iti --gpus all -p 8765:8888 infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv:11.8.0_2.12.0_2.0.1_4.7.0-20231120
```
By going to http://localhost:8765 we will be shown the Jupyter login page. As a reminder, the default token is `iti`.
After logging in, we will see the Jupyter Lab interface and the list of files mounted in `/iti`.
From that WebUI, using `File -> Shut Down` will exit the container.
The non-Jupyter containers are set to provide the end user with a `bash` shell.
Mounting a directory containing the developer's test code as the container's `/iti` directory (e.g. with ``-v `pwd`:/iti``) enables the setup of a quick container-based prototyping/testing environment.
For example, to run some of the content of the `test` directory on a CPU, from the directory where this `README.md` is located:

```
% docker run --rm -it -v `pwd`:/iti infotrend/ctpo-tensorflow_opencv:2.12.0_4.7.0-20231120

[this starts the container in interactive mode, and we can type commands in the provided shell]

root@b859b8aced9c:/iti# python3 ./test/tf_test.py
Tensorflow test: CPU only
On CPU:
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images (batch x height x width x channel). Sum of ten runs.
CPU (s): 0.483618629979901
Tensorflow test: Done
```
Note that the base container runs as `root`.
To run it as a non-root user, add `-u $(id -u):$(id -g)` to the `docker` command line and ensure access to the directories we will work in.
The built image is compatible with other GPU-capable container runtimes, such as `podman`.
Follow the instructions to install the NVIDIA Container Toolkit and its Support for Container Device Interface (CDI).
We will need a version of `podman` above 4.1.0 to be able to run:
```
% podman run -it --rm --device nvidia.com/gpu=all infotrend/ctpo-cuda_tensorflow_pytorch_opencv:latest
root@2b8d77a97c5b:/iti# python3 /iti/test/pt_test.py
Tensorflow test: GPU found
On cpu:
[...]
On cuda:
[...]
root@2b8d77a97c5b:/iti# touch file
```
That last `touch` command will create a `file` owned by the person who started the container.
ℹ️ On Ubuntu 22.04, install Homebrew and `brew install podman`, which at the time of this writeup provided version 4.8.2.
It is also possible to run the container with `docker compose`.
Follow the GPU support instructions to match the usage, and adapt the following `compose.yml` example as needed:
```yaml
version: "3.8"

services:
  jupyter_ctpo:
    container_name: jupyter_ctpo
    image: infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv:latest
    restart: unless-stopped
    ports:
      - 8888:8888
    volumes:
      - ./iti:/iti
      - ./home:/home/jupyter
    environment:
      - TZ="America/New_York"
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
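The stack can then be started from the directory containing the `compose.yml` (after adapting the paths and environment values above):

```
% docker compose up -d
```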
When using the Jupyter version of CTPO with other users, it might be better to use a virtual environment for the packages to be installed in.
In the following, we create a `myvenv` virtual environment in the `/iti` directory, which will show up in the list of available kernels.
In a Terminal (preferably running a `bash` shell) in the Jupyter Lab, run:

```
python3 -m venv --system-site-packages myvenv
source myvenv/bin/activate
pip3 install ipykernel
python -m ipykernel install --user --name=myvenv --display-name="Python (myvenv)"
```
Make sure to select the proper kernel in the notebook.
When using this kernel, it is still recommended to run any `pip` command from the terminal with the virtual environment activated, to ensure the packages are installed in the expected location (i.e. not a global installation). As an alternative, for `pip` commands run from a notebook cell to use the proper installation directory, prefix them with `! . ./myvenv/bin/activate;`. For example: `! . ./myvenv/bin/activate; pip install -r requirements.txt`
If "default-runtime": "nvidia"
in set in /etc/docker/daemon.json
and want to hide the GPUs from a running container, add NVIDIA_VISIBLE_DEVICES=void
before the docker run
command.
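For example (an illustrative sketch using one of the CPU-capable images):

```
% docker run --rm -it -e NVIDIA_VISIBLE_DEVICES=void infotrend/ctpo-tensorflow_opencv:latest
```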
- 20241219: First Ubuntu 24.04 based release (with Python 3.12), with support for CUDA 12.5.1, TensorFlow 2.18.0, PyTorch 2.5.1 and OpenCV 4.10.0 (and a Dockerfile to build with TensorRT)
- 20241125: Release with support for CUDA 12.5.1, TensorFlow 2.18.0, PyTorch 2.5.1 and OpenCV 4.10.0 (last Ubuntu 22.04 based version, with Python 3.10)
- 20240421: Release with support for CUDA 12.3.2, TensorFlow 2.16.1, PyTorch 2.2.2 and OpenCV 4.9.0
- 20231201: Release with support for CUDA 11.8.0, TensorFlow 2.14.1, PyTorch 2.1.1 and OpenCV 4.8.0
- 20231120: Initial Release, with support for CUDA 11.8.0, TensorFlow 2.12.0, PyTorch 2.0.1 and OpenCV 4.7.0.
- November 2023: Preparation for public release.
- June 2023: Engineered to support clean `Dockerfile` generation, supporting the same versions as the 20231120 release.