Common commands and utils to take note of when working in the lab.
Feel free to add to the README or add scripts that make life easier here.
./pytorch_training: contains common PyTorch training tips, tricks, and mistakes (dataloading, transforms, modes, etc.)
Environment.yml file for "Improved Techniques for Training Score-Based Generative Models." https://github.com/ermongroup/ncsnv2
Environment.yml file for "Segment Anything" https://github.com/ermongroup/ncsnv2
Environment.yml file for HuggingFace
Environment.yml file for "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation" UVCGANv2
# create a new conda environment
conda create --name <my-env>
# create environment from yml file
conda env create -f environment.yml
# export current environment to yml file
conda env export > environment.yml
# check list of currently available environments
conda info --envs
- double check you are using anaconda by activating conda and running:
which python3
- backup all your environments (including base)
activate the environment you want to back up and run the command: 'conda env export > environment.yml' (to back up every environment at once, see the loop sketch after this list)
- delete the conda folder in your home directory ./anaconda
- install miniconda - during installation, allow it to set miniconda as the default
- double check your .condarc and .bashrc file to see if bash will use miniconda
- make sure the .condarc in your miniconda folder doesn't use https://repo.anaconda.com/pkgs/main or https://repo.anaconda.com/pkgs/r in its channels - instead use the free channel https://repo.anaconda.com/pkgs/free and conda-forge
- double check the channels in your miniconda environment with:
conda config --show channels
- if defaults is still in your channels, use the command
conda config --remove channels defaults
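To back up every environment at once, a minimal bash sketch (writes one yml file per environment into the current directory):
# export each environment listed by conda to <name>.yml
for env in $(conda env list | awk 'NF && !/^#/ {print $1}'); do
    conda env export -n "$env" > "$env.yml"
done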
First set up an SSH tunnel to the server (with your credentials).
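A minimal sketch of the tunnel command; the port 8888, username, and server address are placeholders, substitute your own:
# run on your local machine: forward local port 8888 to port 8888 on the server
ssh -N -f -L localhost:8888:localhost:8888 username@server_address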
Then, with the tunnel running, open a bash session on the server and run the following command:
jupyter notebook --no-browser
# in the activated environment first install ipykernel:
conda install -c anaconda ipykernel
# then install the environment as a usable kernel:
python -m ipykernel install --user --name=env_name
# list the kernels available:
jupyter kernelspec list
# if you want to remove a kernel:
jupyter kernelspec uninstall kernel_name
# generate a jupyter config file if it's not already there (~/.jupyter/jupyter_notebook_config.py)
jupyter notebook --generate-config
# find the config file (.../.jupyter/jupyter_notebook_config.py) and modify the default notebook directory
c.NotebookApp.notebook_dir = 'path_to_new_dir'
# uncomment the notebook_dir line, set the path, and save.
For further details see the following link: How to change the Jupyter start-up folder
# create a new session with a session name (easier to figure out which session is which)
tmux new -s session_name
# detaching from a tmux session:
Ctrl+b d
# listing current tmux sessions:
tmux ls
# attaching to tmux sessions:
tmux attach-session -t named_session
# killing sessions:
tmux kill-session -t 3
# scrolling through errors in copy mode:
Ctrl+b,[
# renaming sessions:
tmux rename-session -t current_name new_name
OR: Ctrl+b,$ (within the session to rename it)
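# check the size of a directory: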
du -sh directory_name
# to check hidden directories use:
du -sh .[^.]*
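# remove a directory and its contents: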
rm -r directory_name
Sometimes the cache might be full and you might need to do some cleaning in the cache folder.
# if ./.cache/pip/ is quite full you can do a purge:
python -m pip cache purge
And check out the pip cache documentation for more information.
If you have many conda environments and lots of packages, your home directory might get large. To reduce disk usage in your home (or any) directory, change conda's default directories to ones on a larger storage partition (like DatacenterStorage). Refer to this guide for specific directions: guide. Otherwise, a quick how-to is shown in the steps below:
- Change the default conda environment's pkgs_dirs and envs_dirs
# change Conda packages directory
conda config --add pkgs_dirs /big_partition/users/user/.conda/pkgs
# change Conda environments directory
conda config --add envs_dirs /big_partition/users/user/.conda/envs
- if starting from scratch, this is enough - you can start creating environments and they will be saved to the new default directories
- if you want to move existing conda environments, there's no direct way, so you have to do the following steps:
- Archive environments.
conda env export -n foo > foo.yaml # One per environment.
- Move the package cache, e.g. copy the contents of the old package cache (/home/users/user_name/.conda/envs/.pkgs/) to the new package cache (see the rsync sketch after these steps). This is mainly if you want to be thorough about transferring and avoid redownloading packages for environments you already created.
- Recreate environments.
conda env create -n foo -f foo.yaml
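For the package-cache copy in the steps above, a minimal sketch (both paths are examples - use your old and new pkgs locations):
# copy the old package cache into the new pkgs_dirs location
rsync -a /home/users/user_name/.conda/pkgs/ /big_partition/users/user/.conda/pkgs/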
Use rsync (run it from one of the two servers - rsync can't copy directly between two remote hosts):
rsync -r username@server1_IP:source_dir destination_dir # run on server2 to pull from server1
Use scp:
scp -r username@serverIP:/server_dir/ local_dir # server to local
scp -r local_dir username@serverIP:/server_dir/ # local to server
- open a bash shell (to use gcloud)
- Auth login first using:
gcloud auth login --no-launch-browser
- Do all the login stuff as necessary and use the authentication code.
- Use gsutil to copy from one dir to another in the server:
gsutil -m cp -r "gs://GCP_location" /server_location/
If you train a model and it runs for a few loops (batches or epochs) but then suddenly runs out of memory, the issue is probably that some variable is compounding and its memory is not being released. A few things you can do to debug:
- check memory usage after each loop using torch.cuda.memory_allocated()
# this code will print device memory usage in MB, i being batch or epoch number
print('batch {}: {:.2f}MB'.format(i, float(torch.cuda.memory_allocated(device=DEV) / (1024 * 1024))))
- collect garbage and release cache (https://docs.python.org/3/library/gc.html)
import gc
import torch
gc.collect()
torch.cuda.empty_cache()
- zero grad the optimizers - PyTorch accumulates gradients on subsequent backward passes, and if you don't zero them, the gradient will be a combination of the old gradient (which you have already used to update your model parameters) and the newly computed gradient (https://stackoverflow.com/questions/48001598/why-do-we-need-to-call-zero-grad-in-pytorch)
optimizer.zero_grad(set_to_none=True)
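For context, a minimal sketch of where zero_grad fits in a training loop (model, loader, loss_fn, and optimizer are assumed to be defined elsewhere):
for x, y in loader:
    optimizer.zero_grad(set_to_none=True)  # clear old gradients before the new backward pass
    output = model(x)
    loss = loss_fn(output, y)
    loss.backward()   # compute fresh gradients
    optimizer.step()  # update parameters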
- make sure to add .item() values, not tensors, to history or anything that will be evaluated at the end of the loop
# if loss is a tensor and used in gradient calculations then loss_sum will accumulate memory
loss = loss_fn(x, y)
loss_sum += loss
# print(loss) will give something like: "tensor(0.3652, device='cuda:0', grad_fn=<MulBackward0>)"
# the .item() of the tensor will just give the value and remove any gradient
loss = loss_fn(x, y)
loss_sum += loss.item()
# print(loss.item()) will give something like: "0.3651849031448364"
This might occur if you have a new conda environment and are trying to install a separate pip and packages on it. If so, try conda clean (removes unused packages and caches): https://docs.conda.io/projects/conda/en/latest/commands/clean.html
conda clean -a
Sometimes when running optuna training, this error will occur; for me, it was because of matplotlib and plotting losses within the trials. See this issue for more context. You can either remove the plotting during optuna training or use the following lines to switch to a non-interactive backend:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
Most PyTorch transforms and image manipulations use PIL as the base, so make sure the modes are correct (e.g. 8-bit, 16-bit, etc.) so that there are no clipping issues. See the PIL documentation for more details.
quick lookup: RGB (3x8-bit pixels, true color), L (8-bit pixels, grayscale), I (32-bit signed integer pixels)
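A quick sketch for checking and converting a mode (the file path is a placeholder):
from PIL import Image

img = Image.open('example.png')
print(img.mode)         # e.g. 'RGB', 'L', or 'I'
img = img.convert('I')  # convert to 32-bit signed integer pixels if needed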
print('\u00B1') # will give you the ±
GitHub won't let you push large files to your repo, and if you somehow reach the limit and want to push something small that would put it over, it will cause issues. Since GitHub keeps the history of your commits along with the files, the solution is not as simple as removing the large files from the current tree - you'll need to delete them from the repo history with BFG Repo-Cleaner. (There are other ways to rewrite history, but for me this was the easiest and most straightforward method.)
Use BFG Repo-Cleaner: https://rtyley.github.io/bfg-repo-cleaner/
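A typical invocation, following the BFG docs (the 100M threshold and repo name are examples):
# work on a fresh mirror clone of the repo
git clone --mirror git://example.com/my-repo.git
# strip all blobs bigger than 100MB from the history
java -jar bfg.jar --strip-blobs-bigger-than 100M my-repo.git
# expire old refs, garbage-collect, then push the rewritten history
cd my-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push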