scROCK

scROCK (single-cell Refinement Of Cluster Knitting) is an algorithm for correcting cluster labels in scRNA-seq data. It is based on: Xinchuan Zeng and Tony R. Martinez. 2001. An algorithm for correcting mislabeled data. Intell. Data Anal. 5, 6 (December 2001), 491–502.

Installation

pip install https://github.com/dos257/scROCK/tarball/master

For a private repository, use: pip install git+https://{token}@github.com/dos257/scROCK.git

Add the flags --upgrade --no-deps --force-reinstall to force an update from the git repository.

Usage

If X is a log1p-preprocessed numpy.array of shape (n_samples, n_genes) and y is an array of integer cluster labels (for example, from the Leiden algorithm), then:

from scrock import scrock
y_fixed = scrock(X, y)
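To see what the refinement changed, it can help to compare the original and corrected labels. The sketch below is not part of scROCK: summarize_label_changes is a hypothetical helper, and it assumes that y_fixed has one label per cell in the same order as y. Toy label lists stand in for real output.

```python
from collections import Counter


def summarize_label_changes(y, y_fixed):
    """Count relabeled cells, grouped by (old_label, new_label) pair.

    Hypothetical helper, not part of scROCK. Assumes y and y_fixed are
    sequences of integer labels with one entry per cell, in the same order.
    """
    if len(y) != len(y_fixed):
        raise ValueError("y and y_fixed must have the same length")
    moves = Counter(
        (int(old), int(new)) for old, new in zip(y, y_fixed) if old != new
    )
    return sum(moves.values()), dict(moves)


# Toy labels standing in for real data; with scROCK installed,
# y_fixed would come from scrock(X, y) instead.
y = [0, 0, 1, 1, 2, 2]
y_fixed = [0, 0, 1, 0, 2, 2]

n_changed, moves = summarize_label_changes(y, y_fixed)
print(n_changed, moves)  # 1 {(1, 0): 1}
```

A large number of moves between one pair of clusters may be worth inspecting before accepting the corrected labels.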

Command line and Docker

For convenience, scrock supports a simplified command line:

python3 -m scrock refine_clusters data.h5ad

Use find_doublets instead of refine_clusters to run the doublet detection task.

For the refine_clusters task, scrock tries to read the cluster labels from the input file (here data.h5ad), in this order: .obs["seurat_clusters"], .obs["leiden"], .obs["cell_line"].

This command line can also be run inside Docker.
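The lookup order described above can be sketched as a first-match search over candidate column names. This is an illustration only, not scROCK's actual code: pick_label_column and LABEL_COLUMNS are hypothetical names, and raising an error when no column matches is an assumption.

```python
# Candidate .obs columns, in the order given in this README.
LABEL_COLUMNS = ("seurat_clusters", "leiden", "cell_line")


def pick_label_column(obs_columns, candidates=LABEL_COLUMNS):
    """Return the first candidate name present in obs_columns.

    Hypothetical helper illustrating the documented order. The error
    raised when nothing matches is an assumption, not scROCK behavior.
    """
    available = set(obs_columns)
    for name in candidates:
        if name in available:
            return name
    raise KeyError(f"none of {candidates} found in .obs")


# "seurat_clusters" is absent, so "leiden" wins over "cell_line".
print(pick_label_column(["n_genes", "cell_line", "leiden"]))  # leiden
```

With an AnnData object named adata, the column list would be list(adata.obs.columns).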

Build the Docker image:

docker build --tag scrock-image .

Run the Docker image, mounting the host directory that contains the input file:

docker run --name scrock --volume /host-path-to-input/data:/data scrock-image refine_clusters /data/sce_sc_10x_5cl_qc.h5ad

Output will be written to stdout.
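Because the result goes to stdout, one way to keep it is to redirect stdout to a file. The following is a minimal sketch, assuming the scrock-image built above. build_docker_command and run_to_file are hypothetical helpers, and --rm (remove the container after it exits) replaces the --name scrock option used above so that repeated runs do not collide on the container name.

```python
import subprocess


def build_docker_command(host_dir, input_name, task="refine_clusters",
                         image="scrock-image"):
    """Build the docker argv; host_dir is mounted at /data in the container.

    Hypothetical helper. --rm is a choice made here, not taken from the
    README's own command.
    """
    return [
        "docker", "run", "--rm",
        "--volume", f"{host_dir}:/data",
        image, task, f"/data/{input_name}",
    ]


def run_to_file(command, output_path):
    """Run the command and write its stdout to output_path."""
    with open(output_path, "w") as out:
        subprocess.run(command, stdout=out, check=True)


cmd = build_docker_command("/host-path-to-input/data", "sce_sc_10x_5cl_qc.h5ad")
print(" ".join(cmd))
# To actually run it (requires Docker and the built image):
# run_to_file(cmd, "refined_clusters.txt")
```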

Known issues

If the code uses a high percentage of CPU but still runs slowly, try:

torch.set_num_threads(1)

Torch's imperfect CPU parallelization can spend most of its time in thread synchronization, which slows down the whole process.

Acknowledgements

scROCK was developed under the supervision of Dr. Vikas Bansal, Head of the Biomedical Data Science Group at the German Center for Neurodegenerative Diseases (DZNE), Tübingen.
