JupyterLab image with VisualStudio Code server integrated, based on the jupyter/docker-stacks scipy image, with additional packages and kernels installed for data science and knowledge graphs.
List of features for the images available running on CPU
This is the base image with useful interfaces and libraries for data science preinstalled:
📋️ VisualStudio Code server is installed, and accessible from the JupyterLab Launcher
🐍 Python 3.8 with notebook kernel supporting autocomplete and suggestions (jupyterlab-lsp)
☕️ Java OpenJDK 11 with IJava notebook kernel
🐍 Conda and mamba are installed. Each conda environment you create adds a new option to the JupyterLab Launcher for creating a notebook with that environment (with nb_conda_kernels). You can create environments using different versions of Python if necessary.
🧑‍💻 ZSH is used by default for the JupyterLab and VisualStudio Code terminals
The following JupyterLab extensions are also installed: jupyterlab-git, jupyterlab-system-monitor, jupyter_bokeh, plotly, jupyterlab-spreadsheet, jupyterlab-drawio.
Extended from ghcr.io/maastrichtu-ids/jupyterlab:latest, this image also contains:
✨️ SPARQL kernel to query RDF knowledge graphs
✨️ Apache Spark and PySpark are installed for distributed data processing
💎 OpenRefine is installed, and accessible from the JupyterLab Launcher
🦀 Oxigraph SPARQL database
⚡️ Blazegraph SPARQL database
☕️ Java .jar programs for knowledge graph processing are pre-downloaded in the /opt folder, such as RDF4J, Apache Jena, OWLAPI, RML mapper.
📈 R kernel
With these Docker images, you can optionally provide the URL of a Git repository to be cloned automatically into the workspace when the container starts, using the environment variable GIT_URL.
The following files will be automatically installed if they are present at the root of the provided Git repository:
- The conda environment described in environment.yml will be installed. Make sure you added ipykernel and nb_conda_kernels to environment.yml to be able to easily start notebooks using this environment from the JupyterLab Launcher page. See this repository as an example.
- The Python packages in requirements.txt will be installed with pip.
- The Debian packages in packages.txt will be installed with apt-get.
- The JupyterLab extensions in extensions.txt will be installed with jupyter labextension.
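As a rough illustration of the layout described above, the following sketch scaffolds a repository with the four files at its root. The directory name my-notebook-repo, the environment name my-env, and the package choices (rdflib, vim, jupyterlab-drawio) are placeholders chosen for the example. The one-entry-per-line layout of packages.txt and extensions.txt is an assumption rather than something this README specifies.

```shell
# Scaffold a repository whose root files are picked up when GIT_URL is set.
# All names below are illustrative placeholders.
repo_dir="${REPO_DIR:-my-notebook-repo}"
mkdir -p "$repo_dir"

# Conda environment; ipykernel and nb_conda_kernels let it appear in the Launcher.
cat > "$repo_dir/environment.yml" <<'EOF'
name: my-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - ipykernel
  - nb_conda_kernels
EOF

# Python packages installed with pip.
printf '%s\n' 'rdflib' > "$repo_dir/requirements.txt"

# Debian packages installed with apt-get (assumed: one package per line).
printf '%s\n' 'vim' > "$repo_dir/packages.txt"

# JupyterLab extensions installed with jupyter labextension (assumed: one per line).
printf '%s\n' 'jupyterlab-drawio' > "$repo_dir/extensions.txt"

# List the generated files.
ls "$repo_dir"
```

Once these files are pushed to a Git host, adding -e GIT_URL=<your repository URL> to the docker run command should make the container clone the repository and install them at startup.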
You can also create a conda environment from a file in a running JupyterLab (we use mamba which is like conda but faster):
mamba env create -f environment.yml
You'll need to wait a minute before the new conda environment becomes available on the JupyterLab Launcher page.
The easiest way to build a custom image is to extend the existing images.
For notebooks running on CPU, we use images from the official jupyter/docker-stacks, which run as a non-root user, so you will need to make sure folder permissions are properly set for the notebook user.
Here is an example Dockerfile to extend ghcr.io/maastrichtu-ids/jupyterlab:latest:
FROM ghcr.io/maastrichtu-ids/jupyterlab:latest
# Change to root user to install packages requiring admin privileges:
USER root
RUN apt-get update && \
apt-get install -y vim
RUN fix-permissions /home/$NB_USER
# Switch back to the notebook user for other packages:
USER ${NB_UID}
RUN mamba install -c defaults -y rstudio
RUN pip install jupyter-rsession-proxy
For Docker images that are not based on jupyter/docker-stacks, such as the GPU images, the root user is used by default. See further down in this README for more information on how to extend GPU images.
For the ghcr.io/maastrichtu-ids/jupyterlab:latest image, volumes should be mounted into the /home/jovyan/work folder.
This command will start JupyterLab as the jovyan user with sudo privileges. Use JUPYTER_TOKEN to define your password:
docker run --rm -it --user root -p 8888:8888 -e GRANT_SUDO=yes -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab
You should now be able to install anything in the JupyterLab container, try:
sudo apt-get update
You can check the docker-compose.yml file to run it easily with Docker Compose.
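For readers who want to see roughly what such a Compose setup looks like, here is a minimal sketch that mirrors the docker run command above. It is not the docker-compose.yml shipped in this repository: the file name docker-compose.example.yml and the service name jupyterlab are made up for the example, and it assumes Docker Compose v2.

```shell
# Write an illustrative Compose file equivalent to the docker run command above.
# This is a sketch, not the repository's own docker-compose.yml.
cat > docker-compose.example.yml <<'EOF'
services:
  jupyterlab:
    image: ghcr.io/maastrichtu-ids/jupyterlab:latest
    user: root
    ports:
      - "8888:8888"
    environment:
      - GRANT_SUDO=yes
      - JUPYTER_TOKEN=password
    volumes:
      - ./data:/home/jovyan/work
EOF

# Show the generated file; start it with:
#   docker compose -f docker-compose.example.yml up
cat docker-compose.example.yml
```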
Run with a restricted jovyan user, without sudo privileges:
docker run --rm -it --user $(id -u) -p 8888:8888 -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS='-R' -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab:latest
⚠️ Potential permission issue when running locally. The official jupyter/docker-stacks images use the jovyan user by default, which does not have admin rights (sudo). This can cause issues when writing to the shared volumes. To fix it, you can change the owner of the folder, or start JupyterLab as the root user.
To create the folder with the right permissions, replace 1000:100 with your user:group if necessary and run:
mkdir -p data/
sudo chown -R 1000:100 data/
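Before resorting to sudo chown, it can help to check whether the folder already has the expected owner. The helper below is a small sketch of that check. The function name owner_matches is invented for the example, and it assumes GNU stat (as found on most Linux distributions), since the -c flag is not available in the BSD/macOS stat.

```shell
# owner_matches DIR UID:GID
# Succeeds when DIR is owned by the given numeric user and group.
# Assumes GNU coreutils stat (the -c format flag).
owner_matches() {
  [ "$(stat -c '%u:%g' "$1")" = "$2" ]
}

mkdir -p data/
if owner_matches data/ 1000:100; then
  echo "data/ is already owned by 1000:100"
else
  echo "data/ has a different owner; run: sudo chown -R 1000:100 data/"
fi
```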
Instructions to build the various images that run on CPU.
This repository contains multiple folders, each with a Dockerfile, to build various flavors of JupyterLab for data science.
With Python 3.8, conda integration, VisualStudio Code, Java and SPARQL kernels
Build:
docker build -t ghcr.io/maastrichtu-ids/jupyterlab .
Run:
docker run --rm -it --user root -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab
Push:
docker push ghcr.io/maastrichtu-ids/jupyterlab
With Oxigraph and Blazegraph SPARQL databases, and additional Python/Java libraries for RDF processing:
docker build -f knowledge-graph/Dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph .
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph
With a python2.7 kernel only (python3 not installed). Build and run (workdir is /root):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:python2.7 ./python2.7
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:python2.7
Based on https://github.com/bruggerk/ricopili_docker. Build and run (workdir is /root):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:ricopili ./ricopili
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:ricopili
Built with https://github.com/ReproNim/neurodocker. Build and run (workdir is /root):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl ./fsl
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl
To deploy JupyterLab on GPU, we use the official Nvidia images. We use the same gpu.dockerfile to install additional dependencies, such as JupyterLab and VisualStudio Code, on top of different images from Nvidia:
🗜️ TensorFlow with nvcr.io/nvidia/tensorflow:
ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
🔥 PyTorch with nvcr.io/nvidia/pytorch:
ghcr.io/maastrichtu-ids/jupyterlab:pytorch
👁️ CUDA with nvcr.io/nvidia/cuda:
ghcr.io/maastrichtu-ids/jupyterlab:cuda
Volumes should be mounted into the /workspace/persistent or /workspace folder.
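To give the container access to the host GPUs, one common option is Docker's --gpus flag, which assumes Docker 19.03 or later with the NVIDIA Container Toolkit installed on the host. The wrapper below is a sketch along those lines. The function name run_gpu_jupyterlab is made up for the example, and whether your deployment needs --gpus all or a different mechanism depends on your setup.

```shell
# run_gpu_jupyterlab TAG [DATA_DIR]
# TAG:      image tag to start (tensorflow, pytorch or cuda)
# DATA_DIR: host folder mounted at /workspace/persistent (default: ./data)
# Assumption: the host exposes its GPUs through the --gpus flag.
run_gpu_jupyterlab() {
  docker run --rm -it --gpus all -p 8888:8888 \
    -e JUPYTER_TOKEN="${JUPYTER_TOKEN:-password}" \
    -v "${2:-$(pwd)/data}:/workspace/persistent" \
    "ghcr.io/maastrichtu-ids/jupyterlab:$1"
}
```

For instance, run_gpu_jupyterlab pytorch would start the PyTorch image with ./data mounted at /workspace/persistent and JupyterLab served on http://localhost:8888.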
The easiest way to build a custom image is to extend the existing images.
Here is an example Dockerfile to extend ghcr.io/maastrichtu-ids/jupyterlab:tensorflow based on nvcr.io/nvidia/tensorflow:
FROM ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
RUN apt-get update && \
apt-get install -y vim
RUN pip install jupyter-tensorboard
Below are the commands to build our different GPU Docker images. Most of them use gpu.dockerfile.
Change the build-arg and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/cuda:11.4.2-devel-ubuntu20.04 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:cuda .
Run an image on http://localhost:8888:
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:cuda
Change the build-arg and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/pytorch:23.03-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:pytorch .
Run an image on http://localhost:8888:
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:pytorch
Change the build-arg and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/tensorflow:21.11-tf2-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:tensorflow .
Run an image on http://localhost:8888:
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
This build uses a different image. Go to the fsl-gpu folder and check the README.md for more details.
Build:
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu ./fsl-gpu
Run (workdir is /workspace):
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu
This image is compatible with OpenShift and OKD security constraints to run as a non-root user.
We recommend using this Helm chart to deploy these JupyterLab images on Kubernetes or OpenShift: https://artifacthub.io/packages/helm/dsri-helm-charts/jupyterlab
If you are working or studying at Maastricht University, you can easily deploy this notebook on the Data Science Research Infrastructure (DSRI) 🌉
Choose which image fits your need: latest, tensorflow, cuda, pytorch, freesurfer, python2.7...
- Fork this repository.
- Clone the forked repository.
- Edit the Dockerfile for the image you want to improve. Preferably use mamba or conda to install new packages. You can also install with apt-get (which needs to run as root or with sudo) and pip.
- Go to the folder and rebuild the Dockerfile:
  docker build -t jupyterlab -f Dockerfile .
- Run the built Docker image on http://localhost:8888 to test it:
  docker run -it --rm -p 8888:8888 -e JUPYTER_TOKEN=yourpassword jupyterlab
If the built Docker image works well, feel free to send a pull request to get your changes merged into the main repository and integrated in the corresponding published Docker image.
You can check the size of the image built in MB:
expr $(docker image inspect ghcr.io/maastrichtu-ids/jupyterlab:latest --format='{{.Size}}') / 1000000
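The same integer division can be wrapped in a small helper, which makes it easier to reuse for several tags. This is only a sketch: the function name bytes_to_mb is an assumption, and it uses shell arithmetic instead of expr.

```shell
# Convert a size in bytes to whole megabytes (1 MB = 1,000,000 bytes), rounding down.
bytes_to_mb() {
  echo $(( $1 / 1000000 ))
}

# Example with a literal value, so it runs without Docker:
bytes_to_mb 1234567890   # prints 1234
```

With Docker available, the output of docker image inspect <image> --format='{{.Size}}' can be passed as the argument in place of the literal value.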