Error when performing statistical test for DML with smoothing #13

Open
xiuru opened this issue Jul 14, 2020 · 9 comments

@xiuru

xiuru commented Jul 14, 2020

Hello,
I am using DSS to detect DML in WGBS data and get an error when performing the statistical test for DML with smoothing.
My code:
dat1.1 = read.table("chr1-ZmBS-BS1-1-CpG.bismark.cov.tsv", header=TRUE)
dat1.2 = read.table("chr1-ZmBS-BS2-1-CpG.bismark.cov.tsv", header=TRUE)
dat2.1 = read.table("chr1-ZmMC-BS1-1-CpG.bismark.cov.tsv", header=TRUE)
dat2.2 = read.table("chr1-ZmMC-BS2-1-CpG.bismark.cov.tsv", header=TRUE)
BSobj = makeBSseqData( list(dat1.1, dat1.2, dat2.1, dat2.2),c("BS1","BS2", "MC1", "MC2") )
dmlTest.sm = DMLtest(BSobj, group1=c("BS1", "BS2"), group2=c("MC1", "MC2"),smoothing=TRUE)

But got an error like:
Smoothing ...
Estimating dispersion for each CpG site, this will take a while ...
|======================================================================| 100%

  |                                                                      |   0%
Error in result[[njob]] <- value :
  attempt to select less than one element in OneIndex
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
1 parallel job did not deliver a result

When I test only the first 20000 lines of those 4 files, DMLtest runs with no error, so it seems something is wrong in my original files. Do you have any ideas on how to avoid this?

Thank you in advance!
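
Since a subset of the files works, one way to rule out the input itself is to run basic checks on each table before building the BSseq object. The following is a minimal sketch, assuming each table uses the chr, pos, N, X column layout that makeBSseqData expects; check_dss_input is a hypothetical helper (not part of DSS) and the toy data frames stand in for the real files.

```r
# Hypothetical helper (not part of DSS): basic sanity checks on one
# per-sample table before it is passed to makeBSseqData().
# Returns a character vector of problems; empty means nothing was found.
check_dss_input <- function(df) {
  required <- c("chr", "pos", "N", "X")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    return(paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  problems <- character(0)
  if (any(is.na(df[required]))) {
    problems <- c(problems, "NA values present")
  }
  ok <- !is.na(df$N) & !is.na(df$X)
  if (any(df$X[ok] > df$N[ok])) {
    problems <- c(problems, "X exceeds N")
  }
  if (any(df$N[ok] < 0 | df$X[ok] < 0)) {
    problems <- c(problems, "negative counts")
  }
  if (any(duplicated(df[c("chr", "pos")]))) {
    problems <- c(problems, "duplicated chr/pos")
  }
  problems
}

# Toy tables standing in for the chr1-*.tsv files (assumed layout).
good <- data.frame(chr = "chr1", pos = c(100, 200, 300),
                   N = c(10, 8, 12), X = c(3, 8, 0))
bad <- data.frame(chr = "chr1", pos = c(100, 100, 300),
                  N = c(10, 8, NA), X = c(3, 9, 0))

good_problems <- check_dss_input(good)
bad_problems <- check_dss_input(bad)
print(good_problems)
print(bad_problems)
```

An empty result does not prove the files are fine; it only excludes these particular problems.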

@haowulab
Owner

I can't tell from this, but there is a warning message from the parallel computing part. Can you try using a single core? Do:

single = MulticoreParam(workers=1, progressbar=TRUE)
dmlTest.sm = DMLtest(BSobj, group1=c("BS1", "BS2"), group2=c("MC1", "MC2"),smoothing=TRUE, BPPARAM=single)

@xiuru
Author

xiuru commented Jul 15, 2020

@haowulab Thanks for your suggestion. A single core works well for my data, so maybe something is wrong with my BiocParallel installation. I will reinstall BiocParallel and try DMLtest with multiple cores.

Thanks!

@dlabuz

dlabuz commented Jun 6, 2021

I'm having trouble running DMLtest on more than one core: the progress bar stays at 0% for over an hour whenever I use more than a single core. I'm working with the human genome, and a single core takes several hours to compare just 2 samples, while I actually want to compare several more. I've tried reinstalling BiocParallel, to no avail. I am running R v4.1.0 on Ubuntu 20.04.2. Is this a problem specific to parallelization on Ubuntu? I saw this issue: Bioconductor/BiocParallel#106. Unfortunately, I cannot figure out how to troubleshoot this for DSS. Any thoughts?

@haowulab
Owner

haowulab commented Jun 7, 2021

I don't know. Can you run other BiocParallel code on Ubuntu?

@realzhang

realzhang commented Sep 7, 2021

It seems that I have the same problem. The progress bar stays at 0% for hours, even with 50 or 80 threads running. By the way, I am using CentOS 7.

@haowulab
Owner

haowulab commented Sep 7, 2021

I really can't tell. Are you using a desktop computer running Ubuntu? There might be problems running BiocParallel on an HPC cluster with a scheduler such as SGE. Can you run other code that uses BiocParallel?
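
A quick way to answer that question is to run a tiny job through BiocParallel outside of DSS and compare it with a serial run. This is a minimal sketch, not a DSS diagnostic: the workload and worker count are arbitrary, and if BiocParallel is not installed the code falls back to base R's parallel package as a stand-in.

```r
# Small workload: square a few numbers in serial and in parallel, then compare.
square <- function(i) i^2
inputs <- 1:8

serial_result <- lapply(inputs, square)

if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # MulticoreParam forks worker processes (forking is not available on Windows).
  param <- BiocParallel::MulticoreParam(workers = 2)
  elapsed <- system.time(
    parallel_result <- BiocParallel::bplapply(inputs, square, BPPARAM = param)
  )[["elapsed"]]
} else {
  # BiocParallel is not installed: use base R's parallel package as a stand-in.
  n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
  elapsed <- system.time(
    parallel_result <- parallel::mclapply(inputs, square, mc.cores = n_cores)
  )[["elapsed"]]
}

results_match <- isTRUE(all.equal(unlist(serial_result), unlist(parallel_result)))
cat("results match:", results_match, "\n")
cat("parallel run took", elapsed, "seconds\n")
```

If this small job also hangs or takes far longer than expected, the problem likely sits in the parallel backend or the environment rather than in DSS.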

@adRn-s

adRn-s commented Mar 30, 2022

I don't know whether this comes from upstream (BiocParallel) or not, but I'm experiencing an issue that seems related. Using the example code from the DMLtest help page on RStudio Server, running on multiple cores is much slower than running on a single core.

> mParam = MulticoreParam(workers=128, progressbar=TRUE)
> timestamp(); dmlTest1 <- DMLtest(BSobj, group1=c("C1", "C2"), group2=c("N1", "N2"), BPPARAM=mParam); timestamp()
##------ Wed Mar 30 10:13:40 2022 ------##
Estimating dispersion for each CpG site, this will take a while ...
  |=======================================================| 100%

  |===========================================| 100%

##------ Wed Mar 30 10:46:05 2022 ------##
> timestamp(); dmlTest1 <- DMLtest(BSobj, group1=c("C1", "C2"), group2=c("N1", "N2"), BPPARAM=single); timestamp()
##------ Wed Mar 30 10:51:37 2022 ------##
Estimating dispersion for each CpG site, this will take a while ...
  |===========================================| 100%
  |===========================================| 100%
##------ Wed Mar 30 10:51:46 2022 ------##

@haowulab
Owner

haowulab commented Mar 30, 2022

To all users experiencing problems with parallel computing:

DSS used to use BiocParallel for parallel computing. However, some recent changes in BiocParallel make it very slow. I asked on the Bioconductor support site, but nobody replied. You can see my post at https://support.bioconductor.org/p/9140528/ and try the code there.

I modified DSS to use another package. You can see some description at http://www.bioconductor.org/packages/devel/bioc/vignettes/DSS/inst/doc/DSS.html#331_Parallel_computing_for_DMLDMR_detection_from_two-group_comparison.

The new package is available as “development” version at http://www.bioconductor.org/packages/devel/bioc/html/DSS.html. Bioc has only two releases every year, so the changes won’t appear in the “official” package maybe until summer. Anyway, you can install the devel version and try.

Hao
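
To see whether an installed DSS already has the reworked parallel interface, one option is to inspect the arguments of DMLtest. The sketch below assumes the new interface is exposed as an argument named ncores, which is an assumption based on the vignette section linked above and should be confirmed there; dmltest_has_ncores is a hypothetical helper, and the install commands are left commented out because they need network access.

```r
# Installing the development version (commented out; needs network access
# and an R version that matches Bioconductor devel):
# BiocManager::install(version = "devel")
# BiocManager::install("DSS")

# Hypothetical helper: TRUE/FALSE if DSS is installed, NA if it is not.
# The argument name "ncores" is an assumption; check the DSS vignette.
dmltest_has_ncores <- function() {
  if (!requireNamespace("DSS", quietly = TRUE)) {
    return(NA)
  }
  "ncores" %in% names(formals(DSS::DMLtest))
}

has_ncores <- dmltest_has_ncores()
if (isTRUE(has_ncores)) {
  message("DMLtest() has an ncores argument; the BPPARAM-based calls in this thread can be adapted to it.")
} else if (identical(has_ncores, FALSE)) {
  message("DMLtest() has no ncores argument; this looks like an older DSS.")
} else {
  message("DSS is not installed.")
}
```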

@llrs

llrs commented Dec 22, 2022

I commented in the support thread, which led to opening an issue in BiocParallel: Bioconductor/BiocParallel#238
The behaviour might change, but the solution with BiocParallel seems to be using force.GC=FALSE inside bplapply. Hopefully this will get fixed before the next release, as the current parallel solution doesn't work on Windows.
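
As a rough illustration of that workaround, the sketch below builds a BiocParallel parameter object with force.GC = FALSE and runs a small job with it. It assumes the option is exposed as a constructor argument of MulticoreParam in the installed BiocParallel version, which the code checks before using it; make_param is a hypothetical helper, and the job is a toy workload rather than a DSS call.

```r
# Hypothetical helper: build a MulticoreParam, turning off forced garbage
# collection only if the installed BiocParallel exposes a force.GC argument.
# Returns NULL when BiocParallel is not installed.
make_param <- function(workers = 2L) {
  if (!requireNamespace("BiocParallel", quietly = TRUE)) {
    return(NULL)
  }
  ctor <- BiocParallel::MulticoreParam
  if ("force.GC" %in% names(formals(ctor))) {
    ctor(workers = workers, force.GC = FALSE)
  } else {
    ctor(workers = workers)
  }
}

param <- make_param()
if (is.null(param)) {
  message("BiocParallel is not installed; nothing to run.")
  roots <- NULL
} else {
  # Toy workload to confirm the parameter object is usable.
  roots <- BiocParallel::bplapply(1:4, sqrt, BPPARAM = param)
  print(unlist(roots))
}
```

Whether this restores the earlier speed depends on the BiocParallel version in use.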
