scReader: Prompting Large Language Models to Interpret scRNA-seq Data
Method:
scReader uses large language models (LLMs) to interpret single-cell RNA sequencing (scRNA-seq) data. It builds gene embeddings from NCBI functional descriptions (e.g., gene type, organism, and expression location). The pipeline begins by ranking genes and selecting the top 2048 highly variable genes (HVGs). Each selected gene is assigned an embedding, obtained by passing its NCBI functional description to GPT-3.5. These gene embeddings are concatenated to form a cell embedding. The cell embedding, combined with an instruction embedding generated from task-specific prompts, is fed into a transformer-based LLM. The model processes this input, and the resulting class token is used for downstream tasks such as cell type classification.
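The gene-selection and concatenation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that genes are ranked by their variance across cells, which this summary does not specify, and it uses small hand-written vectors as placeholders for the GPT-3.5 embeddings of NCBI descriptions. The function names `select_top_hvg` and `build_cell_embedding` are hypothetical, and the toy example keeps 2 genes rather than 2048. How per-cell expression values and the instruction embedding enter the model input is not stated here, so the sketch leaves both out.

```python
from statistics import pvariance


def select_top_hvg(expr, k):
    """Return the indices of the k genes with the highest variance across cells.

    expr is a list of cells, each a list of per-gene expression values.
    Ranking by variance is an assumption made for this illustration.
    """
    n_genes = len(expr[0])
    variances = [pvariance([cell[g] for cell in expr]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return ranked[:k]


def build_cell_embedding(gene_idx, gene_embeddings):
    """Concatenate the embeddings of the selected genes, in ranked order."""
    cell_embedding = []
    for g in gene_idx:
        cell_embedding.extend(gene_embeddings[g])
    return cell_embedding


# Toy expression matrix: 3 cells x 4 genes.
expr = [
    [0, 5, 1, 2],
    [0, 1, 1, 8],
    [0, 9, 1, 2],
]

# Placeholder 2-dimensional gene embeddings (stand-ins for LLM-derived vectors).
gene_embeddings = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]

top = select_top_hvg(expr, k=2)
cell_embedding = build_cell_embedding(top, gene_embeddings)

print(top)             # [1, 3]
print(cell_embedding)  # [0.3, 0.4, 0.7, 0.8]
```

With k selected genes and d-dimensional gene embeddings, the concatenated vector has length k × d, so the input grows linearly with the number of genes kept.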
Findings:
Experimental results demonstrate that integrating LLMs into single-cell omics analysis pipelines significantly improves the interpretation and classification of cell types. Furthermore, this method shows potential for applications in multi-omics integration and rare cell type identification, offering valuable insights for precision medicine and developmental biology.
Dataset:
The experiments were conducted on two in-house datasets.
Reference:
@Article{li2024screader,
title={scReader: Prompting Large Language Models to Interpret scRNA-seq Data},
author={Li, Cong and Long, Qingqing and Zhou, Yuanchun and Xiao, Meng},
journal={arXiv preprint arXiv:2412.18156},
year={2024}
}
https://arxiv.org/abs/2412.18156