Language and Voice Laboratory Computing Resources

Introduction

The Language and Voice Laboratory (LVL) runs a tiny computing “cluster” called Terra. This cluster consists of two physical nodes, terra and torpaq.

Access is granted on request by an LVL sysadmin. Once you have a user account you can log into the main node:
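# A sketch: the exact host name is assumed here to be terra.hir.is, the same
# host the JupyterHub below runs on; your-username is a placeholder.
ssh your-username@terra.hir.is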

Any additional questions can be asked on the Compute channel on Teams.

A short slideshow with examples and explanations is available here.

Scheduler

The LVL cluster uses Slurm to handle compute job scheduling and resource allocation. All resource-intensive tasks must go through the scheduler, and please refrain from requesting more resources than you need.

The command sbatch is used to submit batch jobs to the scheduler. This is the most common way to run tasks on the cluster. A batch job is described by a batch script and the command-line arguments to sbatch.

A batch script is a bash script with some special preprocessor directives, as seen in the example below.

#!/bin/bash
#SBATCH --gres=gpu:titanx:2
#SBATCH --mem=12G
#SBATCH --output=test-sbatch.log
echo "I have these GPUs:" $CUDA_VISIBLE_DEVICES
echo "On this machine" $(hostname)
exit 0

We send this job to the scheduler with

sbatch example-job.sbatch

This defines a job that requests two NVIDIA Titan X GPUs and 12 GB of memory, and writes stdout/stderr to the file test-sbatch.log in the current directory. Once the scheduler is able to allocate the necessary resources it will execute the job, writing the IDs of the allocated GPUs and the hostname of the allocated node to test-sbatch.log.
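The same directives can also be given, or overridden, as command-line arguments to sbatch. For example (the values below are only illustrative):

sbatch --mem=24G --output=bigger-run.log example-job.sbatch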

We can use sacct to see the job history and squeue to see queued and running jobs.
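For example (the job ID is a placeholder):

squeue -u $USER                                        # your queued and running jobs
sacct -j <jobid> --format=JobID,JobName,State,Elapsed  # accounting details for one job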

Storage

There are a few file systems available on Terra. None of them are backed up. All of them, except /scratch, use RAID for fault tolerance.

Because of the nature of /home, heavy reading and writing there slows the Terra cluster down for everyone. Do your modelling work and other I/O-intensive reading/writing on /scratch or /work. /home should only hold your code repositories, configuration files, and the like; your model and data directories should always be on /scratch or /work.

| Mount path   | Purpose                                                     | Size    | Speed                        | Local node |
|--------------|-------------------------------------------------------------|---------|------------------------------|------------|
| /data        | Shared datasets, models and archives. Read-only for users.  | 2.7 TiB | Fast reads & slow writes     | terra      |
| /scratch     | “Unimportant” temporary files with many writes and reads.   | 2 TiB   | Fastest                      | terra      |
| /mnt/scratch | Links to /scratch for legacy reasons                        |         |                              |            |
| /work        | More important temporary files                              | 3.4 TiB | Fastest reads & fast writes  | torpaq     |
| /home        | Code, configuration files, etc.                             | 5.4 TiB | Slow                         | terra      |
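As a sketch of this layout (all paths below are just examples), you can keep your code repository in /home and point its outputs at a directory on /scratch:

mkdir -p /scratch/$USER/my-experiment                      # large temporary outputs live here
ln -s /scratch/$USER/my-experiment ~/my-experiment/output  # convenient link from the repo in /home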

Useful places

Users have access to a few read-only folders on Terra. These places are meant to store frequently used corpora, models and tools.

| Path        | Purpose                                       |
|-------------|-----------------------------------------------|
| /data       | Datasets and data used by and created by LVL  |
| /models     | Pretrained models from LVL or other sources   |
| /data/tools | Shared tools and libraries                    |

If you want to add your own or additional data, models or libraries, contact the admins.

Containers

Singularity (FAQ) is a container solution for scientific computing that allows unprivileged use of containers. Singularity can build its own images from scratch as well as from ready-made Docker images.

A user can build a containerized application/project on their own machine and then run it on Terra in a Slurm batch job.
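As a sketch (the image name, Docker source and script are hypothetical), you could build an image on your own machine from a ready-made Docker image:

singularity build my-tool.sif docker://python:3.10-slim

copy it to Terra, and run it from a batch script like the one in the Scheduler section, where --nv exposes the allocated NVIDIA GPUs inside the container:

#!/bin/bash
#SBATCH --gres=gpu:titanx:1
#SBATCH --mem=8G
#SBATCH --output=singularity-job.log
singularity exec --nv my-tool.sif python3 my_script.py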

Jupyter Notebooks (JupyterHub)

Jupyter notebooks have become a popular way of doing scientific computing and interactive machine learning.

LVL runs a JupyterHub accessible at https://terra.hir.is (RU intranet, you’ll have to accept the self-signed cert) which allows users to spin up notebook servers through Slurm.

The notebook server runs in a container using an image with a Python 3.7 Conda base environment. The Conda tab allows you to create new environments, and new packages can be added to environments through the UI or from a notebook that uses a specific environment.
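For example (a sketch; the package is only an example), a notebook cell can install into the environment backing that notebook with the IPython conda magic:

%conda install -c conda-forge librosa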

Installing software

An easy way to install the tools and libraries you need, short of compiling things yourself, is to use the Conda package manager.

To use it you first have to add it to your environment:

source /data/tools/anaconda/etc/profile.d/conda.sh

Then, to always have conda available you can add it to your bash profile with:

conda init

Let’s say that for some reason you need to use pdftotext from poppler-utils; you can then create an environment specifically for that:

conda create -n pdf-stuff poppler-utils

This will create an environment named pdf-stuff with the package poppler-utils and all of its dependencies installed. To activate it you run:

conda activate pdf-stuff

To verify that it has been loaded:

whereis pdftotext
pdftotext: /home/staff/rkjaran/.conda/envs/pdf-stuff/bin/pdftotext
