Hello to anyone still minding this hub! I appreciate any help, and I hope the developers of this tool find more financial support in the future as well, because this is an awesome project.
My lab does tumor modeling in zebrafish. I have several different tumors from different fish and I am trying to infer CNVs across them. I have performed Seurat integration separately (though I am using the raw counts for this analysis). I designated all cells with at least 1 transcript of GFP as my observation group, and everything else as control. I've tested running one tumor separately and that does work, but I was hoping to pool them. The combined set isn't that large at ~25k cells.
I keep crashing at step 18. I can see my system resources maxing out (64 GB of RAM), followed by the swap memory, and then it crashes. Is there any way around this? Will it work if I run on an HPC node with more memory? Despite dropping the leiden_resolution parameter quite a bit, it looks like it's still trying to make a ton of subclusters... I'm just looking for broad and obvious changes.
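For reference, the annotations passed below as cell.idents were built along these lines. This is a rough sketch rather than my exact code: the object name seu, the "GFP" feature name, and the orig.ident column are stand-ins for whatever your own Seurat object uses.

library(Seurat)

# Pull raw GFP counts per cell (Seurat v5 uses layer =; older versions use slot =)
gfp_counts <- GetAssayData(seu, assay = "RNA", layer = "counts")["GFP", ]

# One row per cell: any cell with >= 1 GFP transcript goes into the observation
# group, everything else becomes the per-sample GFP-negative reference
cell.idents <- data.frame(
    group = ifelse(gfp_counts >= 1,
                   paste0(seu$orig.ident, "_GFP_POS"),
                   paste0(seu$orig.ident, "_GFP_NEG")),
    row.names = colnames(seu)
)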
library(infercnv)

# GFP-negative cells from each pooled sample serve as the per-sample reference
infercnv_obj = CreateInfercnvObject(
    raw_counts_matrix = all.counts,
    annotations_file = cell.idents,
    delim = "\t",
    gene_order_file = geneorderfile,
    ref_group_names = c("bard1_count_GFP_NEG",
                        "brca2july2024_count_GFP_NEG",
                        "brca2older_count_GFP_NEG",
                        "ddr_wt_count_GFP_NEG",
                        "palb2_count_GFP_NEG"))

infercnv_obj_default = infercnv::run(
    infercnv_obj,
    cutoff = 0.1,    # cutoff=1 works well for Smart-seq2; cutoff=0.1 works well for 10x Genomics
    out_dir = outdir,
    cluster_by_groups = TRUE,
    plot_steps = FALSE,
    denoise = TRUE,
    HMM = TRUE,
    no_prelim_plot = TRUE,
    leiden_resolution = 0.01,
    num_threads = 12,
    png_res = 180,
    debug = TRUE,
    BayesMaxPNormal = 0.2,
    per_chr_hmm_subclusters = FALSE
)
For anyone running into similar issues: I naively didn't realize that memory usage scales with the number of threads dedicated to parallel processing. Reducing num_threads (in my case leaving it at the default instead of trying 12) seems to allow the run to finish, at least for individual samples.
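In case it's useful, this is roughly the call that ran for individual samples: the same as above, but with num_threads = 12 dropped so it falls back to infercnv's default, since memory use seems to scale with the number of parallel workers.

infercnv_obj_default = infercnv::run(
    infercnv_obj,
    cutoff = 0.1,
    out_dir = outdir,
    cluster_by_groups = TRUE,
    denoise = TRUE,
    HMM = TRUE,
    no_prelim_plot = TRUE,
    leiden_resolution = 0.01,
    png_res = 180,
    BayesMaxPNormal = 0.2,
    per_chr_hmm_subclusters = FALSE   # num_threads intentionally left at its default
)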
I can run this on an HPC and just keep bumping up the memory allocation if it crashes (maybe try 250 GB for 25k cells?). I downloaded the Singularity image and ran the R script exactly as provided on the installation page, and it gave me no issues. Hope that helps!