TCfinder

TCfinder is a tool that distinguishes tumor cells from normal cells in single-cell data by quantifying gene pathway expression. Because a pathway usually contains multiple genes, pathway-level quantification is less affected by the data sparsity that limits traditional single-gene methods, which makes TCfinder more broadly applicable. The successful construction of TCfinder also suggests that gene pathway expression quantification could be applied to the annotation of other cell types in scRNA-seq.

Workflow

[Figure: TCfinder workflow]

Installation and use of the TCfinder package

TCfinder is an R package that can be installed from GitHub. It depends on the following:

R (>= 3.5.0);

dplyr (>= 1.1.0);

reticulate (>= 1.2.6);

Matrix;

fs;
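Before installing, it can help to check which of these dependencies are already available. The following is a minimal sketch in base R; the variable names (`deps`, `installed`, `missing_deps`, `r_ok`) are illustrative and not part of TCfinder, and package versions are not checked here.

```r
# Dependencies listed above (R itself is checked separately below).
deps <- c("dplyr", "reticulate", "Matrix", "fs")

# requireNamespace() returns TRUE if a package can be loaded, without attaching it.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs install.packages().
missing_deps <- deps[!installed]
if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
}

# The dependency list asks for R >= 3.5.0.
r_ok <- getRversion() >= "3.5.0"
```

Any package reported as missing can be installed from CRAN with `install.packages()` before installing TCfinder itself.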

Install

devtools::install_github("XSLiuLab/TCfinder")

TCfinder contains three functions, which respectively normalize the raw single-cell counts, score pathways, and predict whether each cell is a tumor cell or a normal cell.

Data normalization

The input must be a sparse matrix or a data.frame whose row names are gene names and whose column names are sample names.

If the data were generated with Smart-seq2, set method = "smart-seq2" and select genome = "hg19" or "hg38". For other single-cell sequencing methods, this parameter does not need to be set.

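library(TCfinder)
# Smart-seq2 data: set method = "smart-seq2" and choose genome = "hg19" or "hg38".
# For other sequencing methods, the method argument can be left out.
result1 <- data_normalized(expr_data = expr_data, method = "smart-seq2", genome = "hg38")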

Example:

Row names are gene symbols, and column names are the cell barcodes of the sample.

         AAACCTGCACATCCGG  ...  AAACGGGGTTGAACTC  AAACGGGGTTGTCGCG
FAM138A  0                 ...  0                 1
OR4F5    8                 ...  20                15
...      ...               ...  ...               ...
FAM87B   1                 ...  0                 1
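A toy input in this shape can be built as follows. This is only a sketch: it uses the three genes and three barcodes visible in the example (the elided rows and columns are dropped), and the object names `counts`, `expr_df`, and `expr_sparse` are illustrative rather than part of TCfinder.

```r
# Toy count matrix: genes in rows, cell barcodes in columns
# (values copied from the visible cells of the example table).
counts <- matrix(
  c(0,  0,  1,
    8, 20, 15,
    1,  0,  1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("FAM138A", "OR4F5", "FAM87B"),
    c("AAACCTGCACATCCGG", "AAACGGGGTTGAACTC", "AAACGGGGTTGTCGCG")
  )
)

# One accepted form: a data.frame with gene names as row names ...
expr_df <- as.data.frame(counts)

# ... or a sparse matrix (built only if the Matrix package is installed).
if (requireNamespace("Matrix", quietly = TRUE)) {
  expr_sparse <- Matrix::Matrix(counts, sparse = TRUE)
}

# Basic shape checks before normalization: row names present and unique.
stopifnot(!is.null(rownames(expr_df)), !anyDuplicated(rownames(expr_df)))
```

Either `expr_df` or `expr_sparse` would then take the place of `expr_data` in the `data_normalized()` call.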

Pathway score

The pathway score is calculated for the 213 built-in pathways according to the formula shown in the workflow figure.

The output of data_normalized() can be used directly as the input of pathway_score(). If the input matrix has not been normalized, set normalized = FALSE.

result2 <- pathway_score(expr_data = result1, normalized = TRUE)

result2: pathway score

                  hsa00010   hsa00190   ...  hsa00270
AAACCTGCACATCCGG  0.3401667  0.9679245  ...  0.2091803
AAACGGGGTTGAACTC  0.5657879  1.6702925  ...  0.4492787
...               ...        ...        ...  ...
AAACGGGGTTGTCGCG  0.3202879  1.4834434  ...  0.4590984

Prediction of cell type (tumor cell or normal cell)

The prediction model is a deep learning model implemented in Python, so a Python environment with the required modules must be set up before running the prediction.

Python environment and module installation

# Create a new environment
conda create -n new_env python=3.8
# Activate the new environment
conda activate new_env
# Install required modules
conda install tensorflow==2.3.0
conda install pandas==1.0.5
conda install numpy==1.18.5
# List conda environments
conda env list # Copy the path of the new conda environment, which will be used later

Predict cell

The prediction step calls a Python script, so the R package 'reticulate' is required. The input is the pathway score matrix returned by pathway_score().

install.packages("reticulate")
library(reticulate)
# Use the use_python() function to specify the version, here we use the python just created and configured above
reticulate::use_python("XXX/XXX/XXX/anaconda3/envs/new_env/bin/python")
# View specified environment information
reticulate::py_config()
# Predict
predict_result <- predict_cell(path_score = result2)

predict_result

     value         cell_type  barcode
1    0.9996183     normal     AAACCTGCACATCCGG
2    0.9989167     normal     AAACGGGGTTGAACTC
3    0.0001887589  tumor      AAACGGGGTTGTCGCG
...  ...           ...        ...
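Once the predictions are available, a common next step is to count the cells in each class and collect the tumor barcodes. The sketch below works on a toy data.frame that copies the three example rows above; `toy_result`, `type_counts`, and `tumor_barcodes` are illustrative names, and the real `predict_result` is assumed to have the same three columns. In the example rows, values near 1 coincide with `normal` and values near 0 with `tumor`; the decision threshold itself is not stated here, so the code relies only on the `cell_type` column.

```r
# Toy data.frame mirroring the three example rows of predict_result.
toy_result <- data.frame(
  value     = c(0.9996183, 0.9989167, 0.0001887589),
  cell_type = c("normal", "normal", "tumor"),
  barcode   = c("AAACCTGCACATCCGG", "AAACGGGGTTGAACTC", "AAACGGGGTTGTCGCG"),
  stringsAsFactors = FALSE
)

# Number of cells per predicted class.
type_counts <- table(toy_result$cell_type)

# Barcodes predicted as tumor, e.g. for subsetting the expression matrix later.
tumor_barcodes <- toy_result$barcode[toy_result$cell_type == "tumor"]

print(type_counts)
print(tumor_barcodes)
```

The same two lines applied to the real `predict_result` give a quick summary of the prediction before any downstream analysis.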

Citation

Chenxu Wu, Wei Ning, Tao Wu, Jing Chen, Huizi Yao, Ziyu Tao, Xiangyu Zhao, Kaixuan Diao, Jinyu Wang, Weiliang Wang, Xinxing Li, Qianqian Song, Xue-Song Liu. 2024. TCfinder: Robust tumor cell discrimination in scRNA-seq based on gene pathway activity. iMetaOmics 1: e22. https://doi.org/10.1002/imo2.22

Contributors

TCfinder was developed by Chenxu Wu. Please contact Chenxu Wu: [email protected] for any questions or suggestions. Thank you for your use and feedback.


Cancer Biology Group @ShanghaiTech

Research group led by Xue-Song Liu in ShanghaiTech University
