
Maxing out memory, crashing RStudio and Computer at Step 18 #682

Open
alexwskh opened this issue Dec 31, 2024 · 3 comments

Comments

@alexwskh

Hello to anyone still minding this hub! I appreciate any help. I hope the developers of this tool find some more financial support in the future as well, as this is an awesome project.

My lab does tumor modeling in zebrafish. I have several different tumors from different fish and I am trying to infer CNVs across them. I performed Seurat integration separately (though I am using the raw counts here). I designated all cells with at least one GFP transcript as my observation group and everything else as the control (reference) group. I've tested running one tumor on its own and that does work, but I was hoping to pool them. The combined set isn't that large, at about 25k cells.
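
The annotation table is essentially this logic (a minimal sketch, not the exact code I ran; the Seurat object name seu, the feature name "GFP", and the orig.ident sample labels are placeholders):

# Sketch: build the two-column cell annotation table from a Seurat object.
# Cells with >= 1 GFP transcript go into a per-sample observation group,
# everything else into the matching per-sample reference (GFP_NEG) group.
library(Seurat)

gfp_counts <- GetAssayData(seu, assay = "RNA", layer = "counts")["GFP", ]  # use slot = "counts" on Seurat v4

cell.idents <- data.frame(
  group = ifelse(gfp_counts >= 1,
                 paste0(seu$orig.ident, "_GFP_POS"),
                 paste0(seu$orig.ident, "_GFP_NEG")),
  row.names = colnames(seu)
)

# Optionally write it out as a tab-delimited annotations file
# (matches delim="\t" in CreateInfercnvObject below):
write.table(cell.idents, "cell_idents.tsv", sep = "\t",
            col.names = FALSE, quote = FALSE)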

The run keeps crashing at step 18. I can see my system resources max out (64 GB of RAM), then the swap memory fills, and then it crashes. Is there any way around this? Will it work if I run it on an HPC node with more memory? Despite dropping the leiden_resolution parameter quite a bit, it looks like it's still trying to make a ton of subclusters... I'm just looking for broad, obvious changes.

[Attached image: infercnv_subclusters plot]

# GFP reference groups, pooled by sample
infercnv_obj = CreateInfercnvObject(
  raw_counts_matrix=all.counts,
  annotations_file=cell.idents,
  delim="\t",
  gene_order_file=geneorderfile,
  ref_group_names=c("bard1_count_GFP_NEG", "brca2july2024_count_GFP_NEG",
                    "brca2older_count_GFP_NEG", "ddr_wt_count_GFP_NEG",
                    "palb2_count_GFP_NEG"))

infercnv_obj_default = infercnv::run(
  infercnv_obj,
  cutoff=0.1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
  out_dir=outdir,
  cluster_by_groups=TRUE, 
  plot_steps=FALSE,
  denoise=TRUE,
  HMM=TRUE,
  no_prelim_plot=TRUE,
  leiden_resolution = 0.01,
  num_threads = 12,
  png_res=180,
  debug = TRUE,
  BayesMaxPNormal = 0.2,
  per_chr_hmm_subclusters = FALSE
)
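
To see how many subclusters it is actually trying to make before the crash, one option is to reload the intermediate objects inferCNV writes into out_dir (a rough sketch; the @tumor_subclusters$subclusters slot layout is assumed from recent inferCNV versions, and it is only populated once the subclustering step has run):

# Sketch: count subclusters per annotation group from the latest
# intermediate object saved so far (files ending in .infercnv_obj).
saved <- sort(list.files(outdir, pattern = "\\.infercnv_obj$", full.names = TRUE))
obj <- readRDS(tail(saved, 1))                     # latest step object written
sapply(obj@tumor_subclusters$subclusters, length)  # subclusters per group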

alexwskh commented Jan 1, 2025

For anyone running into similar issues: I naively didn't realize that memory usage scales with the number of threads dedicated to parallel processing. Reducing num_threads (in my case leaving it at the default instead of trying 12) seems to let it run, at least for individual samples.
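
Concretely, that just means the call from the original post with num_threads left unset so the inferCNV default is used (everything else unchanged):

infercnv_obj_default = infercnv::run(
  infercnv_obj,
  cutoff=0.1,
  out_dir=outdir,
  cluster_by_groups=TRUE,
  plot_steps=FALSE,
  denoise=TRUE,
  HMM=TRUE,
  no_prelim_plot=TRUE,
  leiden_resolution = 0.01,
  # num_threads left at the package default to keep peak memory down
  png_res=180,
  debug = TRUE,
  BayesMaxPNormal = 0.2,
  per_chr_hmm_subclusters = FALSE)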

@withermatt

Hi @alexwskh,

I can run this on an HPC and just keep bumping up the memory allocation if it crashes (maybe try 250 GB for 25k cells?). I downloaded the Singularity image and ran the R script exactly as provided on the installation page, and it gave me no issues. Hope that helps!

@alexwskh

@withermatt Thanks for the reply.

OK, good to know that it can work given enough memory. I may revisit this in the future, then...
