cnvEQTL: test for linearity / correlation #23

lgeistlinger · 2020-09-10T14:18:12Z

@DarioS moving over from email to issue tracker:

If there are more than two copy number levels for a gene, CNVRanger does an ANOVA-like test, which is a test for differences in means. Is there a way to treat the copy number as a numeric measurement and test for linearity of response? I have copy number estimates from whole genome sequencing and a gene may have copy numbers such as two, four and nine in different samples and I want to know if increasing copy number is associated with increased abundance, rather than just different abundance. I am not interested in analysing distant copy numbers, but the copy number of the gene itself.

lgeistlinger · 2020-09-10T14:25:44Z

I think you mean that you would like to test for correlation between copy number and expression as eg here or here.

That is currently not supported but is a nice feature request, which I'll add.
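To illustrate the difference between "different abundance" and "increasing abundance", here is a minimal sketch in plain Python. It is not CNVRanger code, and the copy numbers and expression values are made up. It treats copy number as numeric and computes a Pearson correlation plus the usual t-statistic for it, which would be compared against a t distribution with n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def trend_t_statistic(x, y):
    """t-statistic for H0: no linear association (n - 2 degrees of freedom)."""
    r = pearson_r(x, y)
    n = len(x)
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical data: one gene, six samples with copy numbers 2, 4 and 9.
copy_number = [2, 2, 4, 4, 9, 9]
# Expression rises with copy number.
monotone = [5.0, 5.2, 6.1, 5.9, 8.0, 8.3]
# Same values, but the copy-number-4 group is highest: group means still
# differ, yet there is no increasing trend.
non_monotone = [5.0, 5.2, 8.0, 8.3, 6.1, 5.9]

r_monotone = pearson_r(copy_number, monotone)
r_non_monotone = pearson_r(copy_number, non_monotone)
```

Both expression vectors have clearly different group means, so a test for differences in means could flag either one. Only the first shows a strong positive correlation with copy number.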

CNVRanger carries out CNV-centric eQTL testing. Given that you are interested in local (= gene dosage) effects, the approach would be to use e.g. GenomicRanges::subsetByOverlaps to subset to CNVs that directly overlap with genes, and then use CNVRanger::cnvEQTL with argument window = 0.
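Conceptually, the overlap filter keeps only CNV regions that share at least one base with a gene. The sketch below shows that logic in plain Python with hypothetical coordinates. The `Region` type and both helper functions are illustrative names only and are not part of GenomicRanges or CNVRanger.

```python
from typing import NamedTuple

class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive

def overlaps(a, b):
    """True if two closed intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

def cnvs_overlapping_genes(cnvs, genes):
    """Keep CNV regions that directly overlap at least one gene."""
    return [c for c in cnvs if any(overlaps(c, g) for g in genes)]

# Hypothetical coordinates for illustration.
genes = [Region("chr1", 1000, 2000)]
cnvs = [
    Region("chr1", 1500, 3000),  # overlaps the gene
    Region("chr1", 5000, 6000),  # same chromosome, no overlap
    Region("chr2", 1500, 1800),  # different chromosome
]

local_cnvs = cnvs_overlapping_genes(cnvs, genes)
```

Only the first CNV is retained, which corresponds to restricting the analysis to the copy number of the gene itself rather than distant regions.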

DarioS · 2020-09-11T00:00:04Z

A challenge will be that some samples will be mostly diploid, others mostly triploid and others mostly tetraploid. The standard median RNA-seq scaling will scale each sample so that genes in its most common copy number state are equal to the reference sample (the median-expression gene might be copy number 2 in sample X but copy number 4 in sample Y). Perhaps the comparison needs to be between scaled RNA-seq values and relative gene copy numbers (each gene's copy number - the sample's ploidy).
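A minimal sketch of the relative copy number idea, assuming per-gene absolute copy numbers and a per-sample ploidy estimate are already available. The gene names, values, and the function name are hypothetical.

```python
def relative_copy_number(gene_cn, ploidy):
    """Subtract the sample's ploidy from each gene's absolute copy number."""
    return {gene: cn - ploidy for gene, cn in gene_cn.items()}

# Hypothetical samples: X is mostly diploid, Y is mostly tetraploid.
sample_x = relative_copy_number({"GENE_A": 4, "GENE_B": 2}, ploidy=2.0)
sample_y = relative_copy_number({"GENE_A": 4, "GENE_B": 4}, ploidy=4.0)
```

GENE_A has an absolute copy number of 4 in both samples, but relative to ploidy it is a gain of 2 in sample X and at baseline (0) in sample Y.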

lgeistlinger · 2020-09-11T00:07:43Z

That sounds to me like needing a purity/ploidy-aware copy number caller such as PureCN to produce absolute integer copy numbers that could be correlated with gene expression values (those could be simply logCPMs I think) across samples.

DarioS · 2020-09-11T02:00:07Z

Yes, I use PURPLE (not yet published) for absolute gene copy number and sample ploidy estimation. I'm hesitant to use CPM because it is relative to all other genes in the sample and can change merely because of other genes in the same sample changing.
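The concern about CPM can be shown with a small sketch using made-up counts: the raw count of one gene stays fixed, yet its CPM halves because another gene in the same sample increases.

```python
def cpm(counts):
    """Counts per million: each count scaled by the sample's total count."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}

# Hypothetical counts for one sample under two scenarios.
# Only GENE_C changes; GENE_A's raw count is identical in both.
before = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 800}
after = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 1800}

cpm_before = cpm(before)
cpm_after = cpm(after)
```

Here GENE_A goes from 100,000 CPM to 50,000 CPM although its own count did not change.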

lgeistlinger added the enhancement label Sep 10, 2020

lgeistlinger self-assigned this Apr 18, 2021
