Stringency in the NanoMethPhase paper #17

Closed
weishwu opened this issue May 22, 2023 · 2 comments

weishwu commented May 22, 2023

Hi @vahidAK. I followed the pipeline described in this repo and got 30k DMRs (26 million bp in total) from my sample. The Nanopore sequencing depth for my sample is 27x, the read length N50 is 39 kb, and the mean quality score is 14.7.

I noticed that in your paper (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02283-5) you got ~2k DMRs. I understand that some parameters at the dma step control stringency, especially the delta cutoff. Still, I would appreciate it if you could let me know which parameters you used in the paper.

I checked the CpG-level differential methylation between the alleles in some known DMRs and imprinted genes, and the signals in my data seem to make a lot of sense. It is the region calling, which binarizes signals into segments, that always bothers me. I don't know how to set a level of stringency that strikes a good balance between sensitivity and specificity. The sparse nature of CpG methylation data makes this "segmentation" even harder than for other data types that have base-wise signal values, such as ChIP-Seq.

Thanks for any insights.

Owner

vahidAK commented May 23, 2023

Hi @weishwu ,

I think this is because of the version of DSS you are using, not the parameters. The parameters in the paper were the defaults of the dma module, which should give a similar number of DMRs with the latest release and default options. Some versions of DSS tend to give many more DMRs than others (it seems to happen when smoothing is true). For example, v2.46.0 tends to give many more DMRs than v2.36.0; see issue #7 for more information. Moreover, some samples, such as tumour samples, generally have more allelic DMRs.
You can also refine your DMR list afterward based on the "diff.methy" column (the difference in average methylation at the DMR between the two groups being compared) and/or the areaStat column.

Best,
Vahid
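
For readers who want to try the post-hoc refinement suggested above, here is a minimal Python sketch. It is not part of NanoMethPhase or DSS. It assumes the DMR table is tab-separated with a header containing `length`, `areaStat`, and a methylation-difference column spelled `diff.Methy` (the comment above writes it as "diff.methy", so check the header of your own file). The function names, the thresholds (0.3 and 50), and the toy rows are all hypothetical and only illustrate the idea of keeping DMRs whose absolute difference and absolute areaStat both pass a cutoff.

```python
import csv
import io

# Toy table with made-up values; the column layout is an assumption,
# so adjust the names to match the header of your own DMR file.
TOY_DMRS = (
    "chr\tstart\tend\tlength\tnCG\tmeanMethy1\tmeanMethy2\tdiff.Methy\tareaStat\n"
    "chr1\t1000\t1499\t500\t20\t0.85\t0.15\t0.70\t180.5\n"
    "chr1\t5000\t5099\t100\t5\t0.55\t0.45\t0.10\t12.3\n"
    "chr2\t2000\t2799\t800\t35\t0.10\t0.80\t-0.70\t-240.1\n"
    "chr3\t100\t249\t150\t6\t0.60\t0.25\t0.35\t20.0\n"
)


def load_dmrs(handle):
    """Read a tab-separated DMR table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_dmrs(rows, min_abs_diff=0.3, min_abs_area_stat=50.0,
                diff_col="diff.Methy", area_col="areaStat"):
    """Keep DMRs whose |methylation difference| and |areaStat| both reach the cutoffs.

    Absolute values are used because either allele can be the more methylated one,
    which gives the difference and areaStat a negative sign.
    """
    kept = []
    for row in rows:
        big_diff = abs(float(row[diff_col])) >= min_abs_diff
        big_area = abs(float(row[area_col])) >= min_abs_area_stat
        if big_diff and big_area:
            kept.append(row)
    return kept


def total_bp(rows, length_col="length"):
    """Sum the reported DMR lengths, to compare total bp before and after filtering."""
    return sum(int(row[length_col]) for row in rows)


dmrs = load_dmrs(io.StringIO(TOY_DMRS))
kept = filter_dmrs(dmrs)
print(f"{len(dmrs)} DMRs ({total_bp(dmrs)} bp) before filtering")
print(f"{len(kept)} DMRs ({total_bp(kept)} bp) after filtering")
```

To run it on a real file, replace `io.StringIO(TOY_DMRS)` with an opened file handle. Sweeping the two cutoffs and checking how many known imprinted DMRs survive is one way to pick a stringency level, though suitable values will depend on the sample and coverage.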

Author

weishwu commented May 24, 2023

Thanks! I am trying DSS 2.36.0 right now. It seems to be much slower than 2.46.0: it has been stuck at "Estimating dispersion for each CpG site, this will take a while ... 0%" for half a day, whereas DSS 2.46.0 finished within an hour.

Never mind. There was a glitch in my Docker image. I fixed it and it worked fine. Using DSS 2.36.0 reduced the number of DMRs by half.

weishwu closed this as completed May 25, 2023