cnvEQTL: test for linearity / correlation #23

Open
lgeistlinger opened this issue Sep 10, 2020 · 4 comments

@lgeistlinger
Collaborator

@DarioS moving over from email to issue tracker:

If there are more than two copy number levels for a gene, CNVRanger does an ANOVA-like test, which is a test for differences in means. Is there a way to treat the copy number as a numeric measurement and test for linearity of response? I have copy number estimates from whole genome sequencing and a gene may have copy numbers such as two, four and nine in different samples and I want to know if increasing copy number is associated with increased abundance, rather than just different abundance. I am not interested in analysing distant copy numbers, but the copy number of the gene itself.
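To make the distinction in the question concrete, here is a minimal sketch contrasting a test for differences in means (one-way ANOVA F statistic) with a measure of linear association (Pearson correlation) when copy number is treated as numeric. The expression values are made up for illustration, and this is not how CNVRanger implements its test.

```python
from statistics import mean

def group_by_copy_number(copy_number, expression):
    """Collect expression values per copy number level."""
    groups = {}
    for cn, expr in zip(copy_number, expression):
        groups.setdefault(cn, []).append(expr)
    return list(groups.values())

def one_way_anova_f(groups):
    """F statistic for differences in group means; ignores the ordering of levels."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def pearson_r(x, y):
    """Pearson correlation; sensitive to a linear trend across numeric copy numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: three samples each at copy number 2, 4 and 9.
copy_number = [2, 2, 2, 4, 4, 4, 9, 9, 9]
# Expression increases with copy number.
increasing = [5.0, 5.2, 4.9, 6.1, 6.0, 6.3, 8.0, 8.2, 7.9]
# Same values, but the copy number 4 and 9 groups are swapped (no monotone trend).
swapped = [5.0, 5.2, 4.9, 8.0, 8.2, 7.9, 6.1, 6.0, 6.3]

f_increasing = one_way_anova_f(group_by_copy_number(copy_number, increasing))
f_swapped = one_way_anova_f(group_by_copy_number(copy_number, swapped))
r_increasing = pearson_r(copy_number, increasing)
r_swapped = pearson_r(copy_number, swapped)
```

Both data sets give the same F statistic because the group means differ equally in both cases, whereas the correlation is high only when expression rises with copy number.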

@lgeistlinger
Collaborator Author

I think you mean that you would like to test for correlation between copy number and expression, e.g. as done here or here.

That is currently not supported but is a nice feature request, which I'll add.

CNVRanger carries out CNV-centric eQTL testing. Given that you are interested in local (= gene dosage) effects, the approach would be to use e.g. GenomicRanges::subsetByOverlaps to subset to CNVs that directly overlap with genes, and then use CNVRanger::cnvEQTL with argument window = 0.
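As a rough illustration of the subsetting step only, the overlap logic can be sketched in plain Python. This is not the GenomicRanges implementation; `Region`, `overlaps` and `subset_by_overlaps` are hypothetical names, the coordinates are made up, and intervals are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple

class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # inclusive

def overlaps(a, b):
    """True if two regions lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

def subset_by_overlaps(cnvs, genes):
    """Keep only the CNV regions that overlap at least one gene."""
    return [cnv for cnv in cnvs if any(overlaps(cnv, gene) for gene in genes)]

# Hypothetical CNV regions and gene coordinates.
cnvs = [
    Region("chr1", 100, 500),
    Region("chr1", 2000, 2500),
    Region("chr2", 100, 500),
]
genes = [Region("chr1", 450, 900), Region("chr2", 600, 800)]

local_cnvs = subset_by_overlaps(cnvs, genes)
```

Only the first CNV region is retained here, since it is the only one that shares bases with a gene on the same chromosome.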

@DarioS

DarioS commented Sep 11, 2020

A challenge will be that some samples will be mostly diploid, others mostly triploid and others mostly tetraploid. The standard median-based RNA-seq scaling will scale each sample so that expression at its most common copy number state matches the reference sample (the gene with median expression might have copy number 2 in sample X but copy number 4 in sample Y). Perhaps the comparison needs to be between scaled RNA-seq values and relative gene copy numbers (each gene's copy number minus the sample's ploidy).
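The relative copy number proposed above could be computed along these lines. This is a minimal sketch with made-up values; the sample names and the `relative_copy_number` helper are hypothetical.

```python
def relative_copy_number(gene_copy_number, sample_ploidy):
    """Gene copy number minus the ploidy of the sample it was measured in."""
    return gene_copy_number - sample_ploidy

# Hypothetical per-sample values for a single gene.
gene_copy_number = {"X": 2.0, "Y": 4.0, "Z": 6.0}
sample_ploidy = {"X": 2.0, "Y": 4.0, "Z": 3.0}

relative = {
    sample: relative_copy_number(gene_copy_number[sample], sample_ploidy[sample])
    for sample in gene_copy_number
}
```

Samples X and Y both get a relative copy number of 0 despite absolute copy numbers of 2 and 4, while sample Z shows a gain of 3 copies over its ploidy.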

@lgeistlinger
Collaborator Author

lgeistlinger commented Sep 11, 2020

That sounds to me like needing a purity/ploidy-aware copy number caller such as PureCN to produce absolute integer copy numbers that could be correlated with gene expression values (those could be simply logCPMs I think) across samples.

@DarioS

DarioS commented Sep 11, 2020

Yes, I use PURPLE (not yet published) for absolute gene copy number and sample ploidy estimation. I'm hesitant to use CPM because it is relative to all other genes in the sample and can change merely because of other genes in the same sample changing.
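The concern about CPM can be shown with a small sketch on made-up counts: the raw count of `geneA` stays the same, but its CPM halves because another gene in the same sample changes. The gene names and the `cpm` helper are hypothetical; this is plain counts-per-million without any further normalisation.

```python
def cpm(counts):
    """Counts per million: each gene's count scaled by the sample's total count."""
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}

# Hypothetical read counts for one sample under two conditions.
# Only geneC changes between them.
before = {"geneA": 100, "geneB": 400, "geneC": 500}
after = {"geneA": 100, "geneB": 400, "geneC": 1500}

cpm_before = cpm(before)
cpm_after = cpm(after)
```

Here `cpm_before["geneA"]` is 100000.0 and `cpm_after["geneA"]` is 50000.0, although the count for `geneA` is 100 in both cases.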

lgeistlinger self-assigned this Apr 18, 2021