Releases: cellgeni/STARsolo
Major fixes of processing logic and support of extra platforms
During intensive use of these scripts over the past year, several bugs related to automatic parameter setting in STARsolo were discovered. Release 3.1 aims to fix all of them, namely:
- problems that arise when read 1 is sequenced to a length different from the one expected for a given BC + UMI. Samples processed with an incorrect UMI length can show serious batch effects. The new logic avoids this by resetting the UMI parameter whenever necessary.
- problems in detection of read strand-specificity (e.g. 3' vs. 5' experiments) in `starsolo_10x_auto.sh`. Datasets with low mapping rates caused many issues here; thus, a more conservative (and, hopefully, robust) approach was chosen.
- change in logic of how 200k test reads are selected from the fastq files. Current release takes top 1M reads from each fastq file, and then subsamples 200k out of those using `seqtk`. This should be faster than previous approaches, and also more robust to some corner cases (e.g. subsampling from a particularly bad fastq file).
- numerous minor fixes and updates.
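
The UMI reset and the test-read selection described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from `starsolo_10x_auto.sh`: the helper names `adjust_umi_len`, `top_reads` and `subsample_reads`, their argument order, and the file names in the comments are all hypothetical. It assumes that "resetting the UMI parameter" means shrinking the UMI to whatever is left of read 1 after the barcode, so a 26 bp read 1 with a 16 bp barcode and an expected 12 bp UMI would end up with a 10 bp UMI.

```shell
# Sketch only: helper names and argument order are hypothetical,
# not taken from starsolo_10x_auto.sh.

# Shrink the UMI length when read 1 cannot hold barcode + UMI.
# Args: read-1 length, barcode length, expected UMI length.
adjust_umi_len() (
  r1len=$1; cblen=$2; umilen=$3
  if [ "$r1len" -le "$cblen" ]; then
    echo "read 1 (${r1len} bp) leaves no room for a UMI after a ${cblen} bp barcode" >&2
    exit 1
  fi
  if [ $((cblen + umilen)) -gt "$r1len" ]; then
    umilen=$((r1len - cblen))
  fi
  echo "$umilen"
)

# Print the first N reads of a FASTQ file (plain or gzipped); 4 lines per read.
top_reads() (
  fq=$1; n=$2
  case "$fq" in
    *.gz) gzip -dc "$fq" ;;
    *)    cat "$fq" ;;
  esac | head -n "$((n * 4))"
)

# Draw K reads at random from a pooled FASTQ with seqtk (must be on PATH).
# Using the same seed for the R1 and R2 pools keeps read pairs in sync.
subsample_reads() (
  pooled=$1; k=$2; seed=${3:-100}
  seqtk sample -s "$seed" "$pooled" "$k"
)

# Intended use (commented out because it needs real FASTQ files and seqtk):
# for fq in sample_*_R1_001.fastq.gz; do top_reads "$fq" 1000000; done > pool.R1.fastq
# subsample_reads pool.R1.fastq 200000 > test.R1.fastq
```

Taking only the head of each file avoids decompressing whole fastq files, which is presumably where the speed-up comes from; pooling the heads of all files before sampling means a single poor file contributes only part of the test set.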
STAR v2.7.10a overhaul
STARsolo wrapper scripts v3.0 - made uniform, fixed bugs, updated README. Multimapper counting (EM) is now on by default, but normal Cell Ranger output is still produced; EM matrix is available as an extra file in `/raw` directories.
`seqtk` and `samtools` are now required for the 10X-auto script by default.
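
Since the script now depends on extra tools, a quick pre-flight check can save a failed run. The snippet below is a small sketch of such a check; `missing_tools` is a hypothetical helper and is not part of the released scripts.

```shell
# Hypothetical helper: print the names of the given tools that are not on PATH,
# separated by spaces (prints an empty line when nothing is missing).
missing_tools() (
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"
)

# Intended use before launching the 10X-auto script:
# m=$(missing_tools STAR seqtk samtools)
# [ -z "$m" ] || { echo "missing tools: $m" >&2; exit 1; }
```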
Docker and Singularity images will be added soon.
Feb 2022 onwards
Automatic inference of 10x chemistry, single/paired end reads, strand specificity, etc. Separate scripts for processing of STRT-seq, Smart-seq2, Drop-seq, and inDrops.
version 1.0 used for most processing in Sep 2021 - Jan 2022
Version 1.0 was used for most processing in Sep 2021 - Jan 2022. Requires knowledge of the 10x chemistry.