I've run DMLtest to test differential methylation between two samples. Since there are no replicates I used smoothing=TRUE. Some of the results don't make sense to me. Below is an example.
Below is a region that was called as a DMR from callDMR. There are 16 CpGs in this region (the 16 bars in the first track). Only the CpG at 30111445 has coverage in both samples (as shown in the top two tracks). And this CpG is fully unmethylated in sample 1 (all 7 reads are unmethylated) and fully methylated in sample 2 (all 5 reads are methylated).
Below is the output from DMLtest. All 16 CpGs have identical mu1 and mu2 values. The CpG at 30111445 (the only CpG that is covered in both samples) has a p-value = 2.252324e-03, and all the other CpGs got very low p-values, even though they don't have read coverage in the 2nd sample at all. So by using the default p-value cutoff, the true CpG that is differentially methylated is discarded, while all the remaining 15 CpGs that do not have coverage in one sample are kept as DMLs.
I'm really confused by these results:
Why does DSS test CpGs that have no coverage at all in one sample, and how does it derive values for them from nothing?
The only CpG covered in both samples in this region has 0% methylation in sample 1 and 100% methylation in sample 2. Why does DSS change these to 0.03564888 versus 0.7295291?
I guess both actions are a result of smoothing, but I think this twists the real data too much. Is there a way to avoid this behavior? Should I first filter out the CpGs that have coverage only from one sample and then run DSS? But I've read that coverage filtering is not recommended because it's taken care of in DSS.
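To make my guess about smoothing concrete, here is a toy sketch rather than DSS itself. It assumes the smoother simply pools read counts from all CpGs within a window around each position (a moving average); I have not verified this against the DSS source. The 500 bp span, the function names, and every count except those at 30111445 are hypothetical. It also shows the filter I am asking about, which keeps only CpGs covered in both samples.

```python
from typing import List, Tuple

# (position, methylated_reads, total_reads). Only the counts at 30111445
# come from my data; the neighbouring sites are made up for illustration.
Site = Tuple[int, int, int]

def smoothed_level(sites: List[Site], pos: int, span: int = 500) -> float:
    """Pool counts of all CpGs within span/2 bp of pos (assumed smoother)."""
    half = span / 2
    nearby = [(x, n) for p, x, n in sites if abs(p - pos) <= half]
    meth = sum(x for x, _ in nearby)
    total = sum(n for _, n in nearby)
    return meth / total if total else float("nan")

def covered_in_both(s1: List[Site], s2: List[Site]) -> List[int]:
    """Positions with at least one read in both samples."""
    pos1 = {p for p, _, n in s1 if n > 0}
    pos2 = {p for p, _, n in s2 if n > 0}
    return sorted(pos1 & pos2)

sample1 = [(30111400, 0, 6), (30111445, 0, 7), (30111500, 1, 8)]
sample2 = [(30111445, 5, 5), (30111600, 1, 3)]

# Sample 2 has no reads at 30111400, yet pooling still yields an estimate
# borrowed entirely from its neighbours.
print(smoothed_level(sample2, 30111400))   # 0.75
# The fully methylated site (5/5) is pulled below 1.0 by its neighbour.
print(smoothed_level(sample2, 30111445))   # 0.75
# Filtering on shared coverage keeps only the one CpG seen in both samples.
print(covered_in_both(sample1, sample2))   # [30111445]
```

If smoothing works roughly like this, it would explain both why uncovered CpGs receive estimates and why the 0% versus 100% site is reported with intermediate values.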
Thanks in advance for any explanation & advice.
Code: