JupyterHub + nbgrader + docker-compose

Status

PLEASE REFER TO illumidesk/illumidesk IF YOU ARE INTERESTED IN A MAINTAINED VERSION OF THIS SETUP (PLUS MORE GOODIES 😄 )

Proof of Concept (POC)

Prerequisites

On remote host:

Ubuntu 18.04

On machine running ansible-playbook:

Ansible >= 2.8

Quick Start

Create an ansible/hosts file from the provided ansible/hosts.example:

cp ansible/hosts.example ansible/hosts
Update private_key_file in the hosts file with the full path to the PEM key to access your instance.
Update ansible/ansible.cfg with your IPv4 address.
Run ansible playbook:

In its basic form:

ansible-playbook \
  provisioning.yml \
  --extra-vars \
  "private_key_file=/path/to/my/private/pem remote_user=ubuntu"

With custom variables and with verbose output:

ansible-playbook \
  provisioning.yml \
  --extra-vars \
  "private_key_file=/path/to/my/private/pem remote_user=ubuntu org_name=myedu" \
  -v

All global variable names are listed in ansible/group_vars/all.yml.

Artifacts

JupyterHub: Runs JupyterHub within a Docker container running as root.
Authenticator: Authentication service. This setup uses a customized version of the FirstUseAuthenticator.
Spawner: Spawning service to manage user notebooks. This setup uses a customized version of the DockerSpawner.
Data Directories: Data directories for configuration files, databases, etc. either live within the containers themselves or are mounted from the host.
Databases: This setup relies on the default SQLite databases for both JupyterHub and nbgrader.
Network: An external bridge network named jupyter-network is used by default.

Customization

Configuration Files

The configuration changes depending on how you decide to update this setup. Essentially customizations boil down to:

JupyterHub configuration using jupyterhub_config.py:
- Authenticators
- Spawners
- Services

Note: By default, jupyterhub_config.py is located at /etc/jupyter/jupyterhub_config.py within the running JupyterHub container. If you change this location (which requires an update to JupyterHub's Dockerfile), make sure JupyterHub loads the correct file by passing the jupyterhub -f /path/to/jupyterhub_config.py option.

Whenever possible we try to adhere to JupyterHub's recommended paths:

/srv/jupyterhub for all security and runtime files
/etc/jupyterhub for all configuration files
/var/log for log files

Nbgrader configurations using nbgrader_config.py.

Three nbgrader_config.py files should exist: two for the shared grader account and one for the instructor/learner accounts:

Grader Account

Grader's home: /home/grader-{course_id}/.jupyter/nbgrader_config.py: defines how nbgrader authenticates with a third party service, such as JupyterHub using the JupyterHubAuthPlugin, the log file's location, and the course_id the grader account manages.
Grader's course: /home/grader-{course_id}/{course_id}/nbgrader_config.py: configurations related to how the course files themselves are managed, such as solution delimiters, code stubs, etc.

Instructor/Learner Account

Instructor/Learner settings /etc/jupyterhub/nbgrader_config.py: defines how nbgrader authenticates with a third party service, such as JupyterHub using the JupyterHubAuthPlugin, the log file's location, etc. Instructor and learner accounts do NOT contain the course_id identifier in their nbgrader configuration files.

Note: nbgrader uses the directory structure within the exchange root (for example /srv/nbgrader/exchange) to list the courses that instructors and learners have access to. If a particular course is missing from the instructor's course list extension, make sure the course directory exists and that it has the right permissions.
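The check described in the note above can be sketched in a few lines of Python. This is a minimal illustration rather than part of this setup: the function name check_course_directory is hypothetical, and it only verifies that the course directory exists and is listable by the current user.

```python
import os
from pathlib import Path


def check_course_directory(exchange_root, course_id):
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    course_dir = Path(exchange_root) / course_id
    if not course_dir.is_dir():
        problems.append(f"missing course directory: {course_dir}")
        return problems
    # Listing a directory requires both read and execute (search) permission.
    if not os.access(course_dir, os.R_OK | os.X_OK):
        problems.append(f"current user cannot read: {course_dir}")
    return problems
```

Run it as the user that owns the notebook server, for example check_course_directory('/srv/nbgrader/exchange', 'intro101'), since permissions are evaluated for the calling user.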

Jupyter Notebook configuration using jupyter_notebook_config.py. This configuration contains some customizations to create user directories and assign user roles using Authenticator hooks; otherwise it's standard fare.
For this setup, the deployment configuration is defined primarily with docker-compose.yml.

Build the Stack

The following docker images are created with the playbook:

Jupyter Notebook Student image
Jupyter Notebook Instructor image
Jupyter Notebook shared Grader image
JupyterHub image
JupyterHub configurable-http-proxy image (pulled)

When building the images the configuration files are copied to the image from the host using the COPY command. Environment variables are stored in env.* files. You can either customize the environment variables within the env.* files or add new ones as needed. The env.* files are used by docker-compose to reduce the file's verbosity.
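Since the services depend on values in the env.* files, it can help to confirm that the variables you need are set before building. The sketch below is an assumption-laden illustration: parse_env_file and missing_variables are hypothetical helpers, and the parser handles only plain KEY=VALUE lines and '#' comments, which is simpler than docker-compose's own parsing rules.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_variables(values, required):
    """Return the required names that are absent or empty, in order."""
    return [name for name in required if not values.get(name)]
```

For example, read env.jhub with open('env.jhub').read(), pass the text to parse_env_file, and call missing_variables with the names you rely on, such as JUPYTERHUB_API_TOKEN and JUPYTERHUB_CRYPT_KEY.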

Authenticator

To change the authenticator used by JupyterHub, edit the JupyterHub.authenticator_class setting in jupyterhub_config.py to use your desired authenticator. For example, to change authenticator from DummyAuthenticator to FirstUseAuthenticator:

Before:

c.JupyterHub.authenticator_class = 'dummyauthenticator.DummyAuthenticator'

After:

c.JupyterHub.authenticator_class = 'firstuseauthenticator.FirstUseAuthenticator'

For local dev testing, we recommend the DummyAuthenticator since it provides a simple way to log in without having to worry about setting password policies and the like.

Consult the authenticator's official documentation to ensure that all configurations are set before launching JupyterHub.

Spawner

By default this setup includes the CustomDockerSpawner class, which extends the DockerSpawner class. This implementation calls the authenticator to check the user's group membership and uses the pre_spawn_hook to set the user's image based on the user's role.
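The role-to-image lookup can be sketched as below. This is not the CustomDockerSpawner code itself: image_for_role is a hypothetical helper, the role labels used as dictionary keys are assumptions, and only the environment variable names come from env.jhub.

```python
import os

# Environment variable names are taken from env.jhub; the role labels
# used as keys are an assumption made for this example.
ROLE_TO_IMAGE_VARIABLE = {
    "Learner": "DOCKER_LEARNER_IMAGE",
    "Instructor": "DOCKER_INSTRUCTOR_IMAGE",
    "Grader": "DOCKER_GRADER_IMAGE",
}


def image_for_role(role, environ=None):
    """Return the image configured for a role, or the standard image."""
    environ = os.environ if environ is None else environ
    variable = ROLE_TO_IMAGE_VARIABLE.get(role, "DOCKER_STANDARD_IMAGE")
    # Fall back to the standard image when the role's variable is unset.
    return environ.get(variable) or environ.get("DOCKER_STANDARD_IMAGE")
```

A hook could then assign the result to the spawner's image attribute before the container starts. How the role is obtained from the authenticator is left out here because it depends on the authenticator in use.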

Edit the JupyterHub.spawner_class to update the spawner used by JupyterHub when launching user containers. For example, if you are changing the spawner from DockerSpawner to KubeSpawner:

Before:

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

After:

c.JupyterHub.spawner_class = 'kubespawner.KubeSpawner'

As mentioned in the authenticator section, make sure you refer to the spawner's documentation to consider all settings before launching JupyterHub.

Proxies

Users connect to the proxy, not directly to JupyterHub. JupyterHub updates the proxies' routing tables using an internal facing port (8001 by default). Users connect to the proxy using an external facing port (8000 by default).

This setup uses JupyterHub's configurable-http-proxy running in a separate container, which enables JupyterHub restarts without interrupting active sessions between end users and their Jupyter Notebooks. For the sake of simplicity, this setup does not include TLS termination at the proxy.

Jupyter Notebook Images

Requirements

The Jupyter Notebook image needs to have JupyterHub installed, and its version must match the version of the JupyterHub that is spawning the Jupyter Notebook. By default the jupyter/docker-stacks images have JupyterHub installed.
Use one of the images provided by jupyter/docker-stacks as the base image.
Make sure the image is on the host used by the spawner to launch the user's Jupyter Notebook.

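There are four notebook images: one each for the Learner, Instructor, and Grader roles, plus a standard image for users with no role.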

The nbgrader extensions are enabled within the images like so:

	Students	Instructors	Formgraders
Create Assignment	no	no	yes
Assignment List	yes	yes	no
Formgrader	no	no	yes
Course List	no	yes	no

Refer to this section of the nbgrader docs for more information on how you can enable and disable specific extensions.

Grading with Multiple Instructors

As of nbgrader 0.6.0, nbgrader supports the JupyterHubAuthPlugin to determine the user's membership within a course. The section of the nbgrader documentation that describes how to run nbgrader with JupyterHub is well written. However, for the sake of clarity, some of the key points and examples are repeated below.

The following rules are defined to determine access to nbgrader features:

Users with the student role are members of the nbgrader-{course_id} group(s). Students are shown assignments only for course(s) with {course_id}.
Users with the instructor role are members of the formgrade-{course_id} group(s). Instructors are shown links to the course(s) identified by {course_id}. To access the formgrader, instructors open the {course_id} service (essentially a shared notebook) and authenticate to it using JupyterHub as an OAuth2 server.
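The two group-naming rules above can be illustrated with a small helper that splits a user's group names into courses by role. This is only a sketch of the naming convention, not nbgrader's own lookup: courses_by_role is a hypothetical function and algo201 in the usage note is an invented course id.

```python
STUDENT_PREFIX = "nbgrader-"
INSTRUCTOR_PREFIX = "formgrade-"


def courses_by_role(groups):
    """Split JupyterHub group names into student and instructor course ids."""
    courses = {"student": [], "instructor": []}
    for group in groups:
        if group.startswith(STUDENT_PREFIX):
            courses["student"].append(group[len(STUDENT_PREFIX):])
        elif group.startswith(INSTRUCTOR_PREFIX):
            courses["instructor"].append(group[len(INSTRUCTOR_PREFIX):])
        # Groups that match neither prefix are ignored.
    return courses
```

For a user in the groups nbgrader-intro101 and formgrade-algo201, the helper reports intro101 as a student course and algo201 as an instructor course.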

NOTE It's important to emphasize that with this setup instructors do not grade assignments with their own notebook server but with a shared notebook which runs as a JupyterHub service and which is owned by the shared grader-{course_id} account.

The configuration for this setup requires one jupyterhub_config.py and three nbgrader_config.py's.

The jupyterhub_config.py file defines the following for the shared grader service:

Name
Access by group
Ownership
API token
URL
Command

For example for a course named intro101:

c.JupyterHub.services = [
    {
        'name': 'intro101',
        'url': 'http://intro101:8888',
        'oauth_no_confirm': True,
        'admin': True,
        'api_token': 'my_secure_token',
    },
]

The global nbgrader_config.py used by all roles, located at /etc/jupyterhub/nbgrader_config.py, defines:

Authenticator plugin class
Exchange directory location

For example:

from nbgrader.auth import JupyterHubAuthPlugin

c = get_config()
c.Exchange.path_includes_course = True
c.Exchange.root = '/srv/nbgrader/exchange'
c.Authenticator.plugin_class = JupyterHubAuthPlugin

The nbgrader_config.py located within the shared grader account's home directory (/home/grader-{course_id}/.jupyter/nbgrader_config.py) defines:

Course root path
Course name

For example:

c.CourseDirectory.root = '/home/grader-intro101/intro101'
c.CourseDirectory.course_id = 'intro101'

The nbgrader_config.py located within the course directory (/home/grader-{course_id}/{course_id}/nbgrader_config.py) defines:

The course_id
Nbgrader application options

For example:

c.CourseDirectory.course_id = 'intro101'
c.ClearSolutions.text_stub = 'ADD YOUR ANSWER HERE'

Some Notes on Authentication, User Directories, and Local System Users

Whether or not the Authenticator and Spawner require a local system user can be a source of confusion. JupyterHub's default authenticators, LocalAuthenticator and PAMAuthenticator, require local system users.

This setup assumes that users are not local system users. Custom authentication classes should extend the base Authenticator class and override the authenticate method. Further customizations can be provided by overriding the normalize_username and check_whitelist methods.

Since users require their own directories to manage their files and folders, additional steps need to take place to create these directories without having to create local system users. In our opinion the best approach is to define a method to create user directories and assign this method to the Spawner.pre_spawn_hook located within jupyterhub_config.py to accomplish this task.
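A minimal sketch of such a hook follows. It assumes the hook receives the spawner instance, which is how JupyterHub calls pre_spawn_hook; the factory name make_create_user_directory and the root path in the usage note are hypothetical, and the sketch does not sanitize usernames or set ownership.

```python
from pathlib import Path


def make_create_user_directory(root):
    """Build a hook that creates <root>/<username> before a spawn."""
    root = Path(root)

    def create_user_directory(spawner):
        # JupyterHub passes the spawner instance to pre_spawn_hook.
        user_dir = root / spawner.user.name
        # exist_ok keeps the hook safe to run on every spawn.
        user_dir.mkdir(parents=True, exist_ok=True)
        return user_dir

    return create_user_directory
```

In jupyterhub_config.py this could be wired up with c.Spawner.pre_spawn_hook = make_create_user_directory('/home'), where /home stands in for whichever host directory is mounted into the user containers.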

Gotchas

Permissions: students (i.e. users with the Learner role) should not have access to the grader's directories, as that would give them access to the sources. Grader directories are assigned a different NB_UID (10001 by default) but rely on the same standard NB_GID.
Order matters for docker volume mounts: you need to create the grader's home directory before mounting the docker volume. Otherwise, the configs in the docker container's volume take precedence over the host directory configs. In addition, if the docker mount creates the directory, that directory is owned by root:root, which results in errors.
JupyterHub pre-flight API token(s): setting a pre-flight token JupyterHub.service_tokens removes the need for launching the stack, obtaining a token for a JupyterHub user, adding it to jupyterhub_config.py and restarting the JupyterHub service.
Notebook images tagged by user role: this setup creates an image for each user role instead of using a script to enable/disable the nbgrader extensions by appending or replacing the singleuser-notebook.sh script. The pros are better start times and it avoids possible race conditions. Cons are that setting up the authenticator(s) to obtain the USER_ROLE requires some extra setup.
Creating user directories: system users are not created with the DockerSpawner, however all users need their own directories. Therefore this setup uses the Spawner.pre_spawn_hook within the customized DockerSpawner to create user directories. As the hook name implies, this task is accomplished before spawning the end-user's container.
Assigning end-user notebook images by role: this setup uses the Authenticator.pre_spawn_start method within the customized FirstUseAuthenticator to assign the user's notebook image. As the hook name implies, this task is accomplished before starting the end-user's container.
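The volume-mount gotcha above (create the grader's home directory with the right owner before docker mounts it) can be sketched as follows. This is an illustration under stated assumptions, not the playbook's actual task: prepare_grader_home is a hypothetical helper, and the default uid and gid mirror the NB_UID and NB_GID defaults listed for the grader service.

```python
import os
from pathlib import Path


def prepare_grader_home(base_dir, course_id, uid=10001, gid=100):
    """Create <base_dir>/grader-<course_id> and set its owner."""
    home = Path(base_dir) / f"grader-{course_id}"
    home.mkdir(parents=True, exist_ok=True)
    # Changing ownership to another user normally requires root.
    os.chown(home, uid, gid)
    return home
```

Run it on the host before starting the stack, for example prepare_grader_home('/home', 'intro101'), so that the mounted directory is not created by docker as root:root.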

Environment Variables

The services included with this setup rely on environment variables to work properly. Although the Ansible playbook adds sensible defaults for these environment variables, you can override them either by setting the corresponding Ansible variable when running the playbook or by manually modifying the environment variable files on the remote host after the playbook has run.

Environment Variables pertaining to JupyterHub, located in `env.jhub`

Variable	Type	Description	Default Value
CONFIGURABLE_HTTP_PROXY	`string`	Random string used to authenticate the proxy with JupyterHub and vice versa.	`<random_string_value>`
DOCKER_LEARNER_IMAGE	`string`	Docker image used by users with the Learner role.	`myedu/notebook:learner`
DOCKER_GRADER_IMAGE	`string`	Docker image used by users with the Grader role.	`myedu/notebook:grader`
DOCKER_INSTRUCTOR_IMAGE	`string`	Docker image used by users with the Instructor role.	`custom/notebook:instructor`
DOCKER_STANDARD_IMAGE	`string`	Docker image used by users with no role.	`myedu/notebook:standard`
DOCKER_NETWORK_NAME	`string`	Docker network name for docker-compose and dockerspawner	`jupyter-network`
DOCKER_NOTEBOOK_DIR	`string`	Working directory for Jupyter Notebooks	`/home/jovyan`
EXCHANGE_DIR	`string`	Exchange directory path	`myedu.example.com`
JUPYTERHUB_CRYPT_KEY	`string`	Cryptographic key used to encrypt cookies.	`<random_value>`
JUPYTERHUB_API_TOKEN	`string`	API token used to authenticate grader service with JupyterHub.	`<random_value>`
JUPYTERHUB_API_TOKEN_USER	`string`	Grader service user which owns JUPYTERHUB_API_TOKEN.	`grader-{course_id}`
JUPYTERHUB_API_URL	`string`	Internal API URL corresponding to JupyterHub.	`http://jupyterhub:8081`
ORGANIZATION_NAME	`string`	Organization name.	`test`
LTI_CONSUMER_KEY	`string`	LTI 1.1 consumer key	`""`
LTI_SHARED_SECRET	`string`	LTI 1.1 shared secret	`""`

Environment Variables pertaining to grader service, located in `env.service`

Variable	Type	Description	Default Value
JUPYTERHUB_API_TOKEN	`string`	API token used to authenticate grader service with JupyterHub.	`<random_value>`
JUPYTERHUB_BASE_URL	`string`	JupyterHub base URL	`https://<org_name>.example.com/`
JUPYTERHUB_SERVICE_URL	`string`	Grader service internal URL	`http://<org_name>:8888`
JUPYTERHUB_SERVICE_NAME	`string`	JupyterHub internal service name	`jupyterhub`
JUPYTERHUB_API_URL	`string`	JupyterHub API URL	`http://jupyterhub:8081/hub/api`
JUPYTERHUB_CLIENT_ID	`string`	JupyterHub Client ID	`service-<course_id>`
JUPYTERHUB_SERVICE_PREFIX	`string`	JupyterHub service prefix	`/services/<course_id>`
JUPYTERHUB_USER	`string`	JupyterHub user	`grader-<course_id>`
NB_USER	`string`	Jupyter Notebook user	`grader-<course_id>`
NB_UID	`string`	Jupyter Notebook Grader UID	`10001`
NB_GID	`string`	Jupyter Notebook Grader GID	`100`
