- What is a cluster?
- How do I connect to the cluster?
- When should I use a cluster?
- Synchronous parallelism
- Slurm
- Managing research data
- Amii: What is a cluster? - A brief description of the components of an HPC cluster.
- Wikipedia - An introduction to HPC.
- Nvidia: MIG user guide documentation - Nvidia's user guide for Multi-Instance GPU (MIG).
- Current State of Advanced Research Computing in Canada, May 2021 - Compute, storage, and communication data for Compute Canada HPC systems.
- Ubuntu HPC - A YouTube video introducing cluster components.
- Top 500 list - Data on the 500 top-ranked supercomputers, giving a sense of how fast and how large these systems are, along with descriptions and comparisons of them.
- Iowa State University HPC Guides - A collection of introductory guides for HPC including SLURM, Globus, Unix, Python, Julia, and much more.
- An HPC User Guide - A collection of introductory guides for HPC including connection nodes, SLURM, job examples and applications.
- Compute Canada: an introduction to HPC - A YouTube video introducing high-performance computing with the Compute Canada network, first providing an overview of use cases for HPC and then a hands-on tutorial.
- Compute Canada: Cedar - An introduction to Cedar, a heterogeneous cluster located at Simon Fraser University.
- Compute Canada: Graham - An introduction to Graham, a heterogeneous cluster located at the University of Waterloo.
- Compute Canada: Narval - An introduction to Narval, a general purpose cluster located at the École de technologie supérieure in Montreal.
- Compute Canada: Béluga - An introduction to Béluga, a general purpose cluster located at the École de technologie supérieure in Montreal.
- Compute Canada: Niagara - An introduction to Niagara, a homogeneous cluster owned by the University of Toronto and operated by SciNet.
- Compute Canada: Systems available - A brief introduction to Compute Canada's five systems and their cluster types.
- Compute Canada: Allocations and compute scheduling - A brief tutorial on choosing GPU models for your project.
- SSH quickstart - Step-by-step ssh and scp guide.
- Compute Canada: SSH - An introduction to Secure Shell (SSH), used to connect to remote machines securely.
- Compute Canada: Transferring data - Overview of options for transferring data to and between clusters (Globus, rsync, scp, etc.).
- Compute Canada: Globus - Setup and tutorial for Globus with Compute Canada.
- Globus docs - Globus tutorials and documentation.
- Globus video tutorial - A YouTube video introducing Globus for researchers and new users.
- VS Code remote development - Visual Studio Code remote development guide.
- Rsync quickstart - Step-by-step rsync guide.
- Compute Canada: Interactive jobs - Compute Canada documentation for running interactive jobs.
- GitHub SSH authentication - Connecting a GitHub account to your computer and the cluster.
- git - the simple guide - A short introduction and step-by-step guide to git.
- Compute Canada: Storage and file management - An introduction to the wide range of storage options that cover the needs of Compute Canada's diverse users.
- Job scheduling policies - Time limits and job-count limits for each cluster, which help identify what types of jobs can run on a cluster.
- The necessity of checkpointing - When to use checkpointing.
- Dealing with heavy computing - An example of when and how to use Compute Canada.
- When to use HPC cluster - Typical cases of when it may be beneficial to request access to an HPC cluster.
- TensorFlow: Distributed training - An introduction to and tutorial on multi-GPU and distributed training.
- Wikipedia: GNU parallel - An introduction to and user's guide for GNU parallel.
- Compute Canada: GNU parallel - An introductory tutorial on using GNU parallel.
- GNU parallel tutorial documentation - A tutorial on GNU parallel, including its functionality, options, and syntax.
- Open-MPI documentation - Documentation for the various versions of Open MPI.
- Open MPI v5.0.x - Documentation for the current Open MPI release.
- SLURM: MPI users guide - An introductory tutorial on various MPI implementations.
- GitHub: Open-MPI tutorial - A tutorial on using Open MPI.
- Compute Canada: MPI-IO - A description of MPI-IO and a tutorial on using it.
- jax.pmap - Step-by-step jax.pmap guide.
- Compute Canada: Flax - An introduction to and tutorial on Flax, a neural network library and ecosystem for JAX designed for flexibility, including guidance on using jax.pmap.
- Compute Canada: Checkpoints - A tutorial on creating and loading a checkpoint.
- Compute Canada: Machine Learning tutorial - A tutorial on checkpointing a long-running job.
- TensorFlow: Checkpoint guide - Guidance on training checkpoints in TensorFlow.
- Compute Canada: Lustre - An introduction to and tutorial on the Lustre filesystem.