Written by Yuanhe Guo ([email protected])
A beginner's guide to getting started with running Python on NYU Greene HPC.
Each time I want to set up a new environment on NYU Greene HPC, I have to go through the official documentation and search for the commands I need. Meanwhile, quite a few common issues have solutions that the official documentation does not cover. So I decided to write a cheatsheet for myself and others who may need it.
I packaged some complex commands into bash scripts, so that getting Python to run flawlessly on NYU Greene HPC is as simple as ordering a burger at a fast food restaurant.
Check the wiki page to get started.
- The official documentation: HPC Home
- I started my journey with HPC by following HPC Notes by Hammond Liu
- [2025.8.1] Added custom CUDA version support. The Singularity image will be built from scratch from Docker Hub. Note that this image ships without cuDNN, so you will need to install cuDNN yourself. The easiest way is to go to the NVIDIA cuDNN Archive, download the version you want, `scp` it to Greene, and install it.
- [2025.5.29] A [y]/n option is now shown when the user is prompted to create a new environment. A new folder will only be created if the user enters "y" or "yes". This prevents creating unnecessary folders when the user makes a typo while activating an existing environment.
- [2024.6.13] The file structure has undergone a significant change. Now you can clone the entire repo. If you have used this cheatsheet before, please first remove `chsdevice.sh` and `chslauncher.sh` on your HPC, then follow the updated wiki instructions to set up your environment. New features are as follows:
  - The new file structure is more organized and easier to maintain.
  - A new `run_setup.sh` script is added to help you set up your `~/.bashrc` file automatically.
  - Added options for the H100 GPU and CUDA 12.1.
- [2024.4.7] Bug fix for handling conda installation failure in Lazy Launcher. Check [[ this part of troubleshooting|Trouble Shooting#conda-environment-installation-failed ]] for details.
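The cuDNN installation mentioned in the 2025.8.1 entry above can be sketched roughly as below. This is a hedged outline, not part of this repo's scripts: the tarball filename, the NetID placeholder, and the `/path/to/cuda` destination are all assumptions — substitute the archive you actually downloaded from the NVIDIA cuDNN Archive and the CUDA location inside your overlay or container.

```shell
# Assumed filename; replace with the Linux tarball you downloaded from the
# NVIDIA cuDNN Archive for your CUDA version.
CUDNN_TAR=cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz

# From your local machine: copy the tarball to Greene (<NetID> is a placeholder).
scp "${CUDNN_TAR}" <NetID>@greene.hpc.nyu.edu:~/

# On Greene: unpack, then copy the headers and libraries into your CUDA
# install (the destination path is an assumption — adjust to your setup).
tar -xvf "${CUDNN_TAR}"
cp cudnn-*-archive/include/cudnn*.h /path/to/cuda/include/
cp cudnn-*-archive/lib/libcudnn*    /path/to/cuda/lib64/
chmod a+r /path/to/cuda/include/cudnn*.h /path/to/cuda/lib64/libcudnn*
```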
Please feel free to open an issue if you have any questions or suggestions.
- Prereq
- Apply for NYU Greene HPC access
- Basic Linux commands
- Vim
- VS Code
- Quick Starting Pack
- Connect to HPC
- Request CPU/GPU Sessions
- Interactive sessions for conda
- Jupyter Notebook
- Batch jobs
- Manual Setup
- Official Guide Index
- Trouble Shooting
- How can I quit python/singularity/runtime?
- How can I jump back when kicked off by accident?
- Disk quota exceeded
- Could not log in to the server through VS Code
- Out of Memory Error
- Could not open singularity environment
- Some Linux commands could not be executed
- Advanced Topics (Useful Tricks)
- Setup bashrc
- Setup ssh key pairs
- Collection of useful linux commands
- Sharing files with Other HPC Users
- Sending files to/from HPC
- SSH Tunneling on GPU Nodes
- AWS S3 Connection
- Access through iPad
- Using in-node memory for faster training
- Multi-node distributed training using `RDZV`, `srun -W`, and `torchrun`
- Submitting Topology-aware GPU jobs for NCCL-heavy training
- Use SLURM job arrays to sweep hyperparameters and random seeds
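As a minimal sketch of the job-array sweep topic in the list above: SLURM runs one copy of the script per array index, and the script maps `SLURM_ARRAY_TASK_ID` onto a (learning rate, seed) grid. The hyperparameter values and the `train.py` entry point are assumptions for illustration only, not files in this repo.

```shell
#!/bin/bash
# Hypothetical sweep script: 3 learning rates x 2 seeds = 6 jobs.
# Submit with: sbatch --array=0-5 sweep.sbatch
#SBATCH --job-name=sweep
#SBATCH --time=01:00:00

LRS=(0.1 0.01 0.001)
SEEDS=(0 42)

IDX=${SLURM_ARRAY_TASK_ID:-0}               # SLURM sets this per array task
LR=${LRS[$(( IDX / ${#SEEDS[@]} ))]}        # row of the grid
SEED=${SEEDS[$(( IDX % ${#SEEDS[@]} ))]}    # column of the grid

echo "task ${IDX}: lr=${LR} seed=${SEED}"
# python train.py --lr "${LR}" --seed "${SEED}"   # hypothetical training entry point
```

Each array task gets its own job ID and log file, so a full sweep is one `sbatch` call instead of six hand-edited scripts.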