Link: https://doi.org/10.1101/2025.11.06.686964
Cell type annotation remains a critical bottleneck, with current methods often inaccurate and requiring extensive manual validation, particularly in disease contexts. While large language models (LLMs) show promise, they can be unreliable due to hallucinations. We developed CyteType, a multi-agent framework that generates competing hypotheses grounded in full expression data and study context, validates against external databases, and iteratively self-evaluates. Comprehensive benchmarking demonstrates that CyteType substantially outperforms reference-based and LLM-based methods, with self-generated confidence scores reliably identifying trustworthy annotations. CyteType transforms cell type annotation from label assignment into evidence-grounded biological discovery.
CyteType packages:
- Python (AnnData compatible): https://github.com/NygenAnalytics/CyteType
- R (Seurat compatible): https://github.com/NygenAnalytics/CyteTypeR
This repository contains the code and data supporting the manuscript, organized into two main sections:
- 01-benchmark-and-supp: Contains benchmarking datasets, supplementary materials, and notebooks for data retrieval and initial analysis.
- 02-cyteonto-and-results: Contains the CyteOnto analysis, results, and code for generating the manuscript figures.
bioRxiv: Gautam Ahuja, Alex Antill, Yi Su, Giovanni Marco Dall’Olio, Sukhitha Basnayake, Göran Karlsson, Parashar Dhapola. "Multi-agent AI enables evidence-based cell annotation in single-cell transcriptomics." bioRxiv 2025.11.06.686964; doi: https://doi.org/10.1101/2025.11.06.686964