cnvEQTL: test for linearity / correlation #23
Comments
I think you mean that you would like to test for correlation between copy number and expression, as e.g. here or here. That is currently not supported, but it is a nice feature request, which I'll add. CNVRanger carries out CNV-centric eQTL testing. Given that you are interested in local (= gene dosage) effects, the approach would be to use e.g.
A challenge will be that some samples will be mostly diploid, others mostly triploid, and others mostly tetraploid. Standard median RNA-seq scaling will scale each sample so that genes at its most common copy number state match the reference sample (the median-expression gene might be copy number 2 in sample X but copy number 4 in sample Y). Perhaps the comparison needs to be between scaled RNA-seq values and relative gene copy numbers (each gene's copy number minus the sample's ploidy).
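The relative copy number idea can be sketched as follows. This is only an illustration with made-up values; the function name `relative_copy_number`, the gene names, and the ploidy values are hypothetical and not part of CNVRanger.

```python
def relative_copy_number(gene_copy_number, sample_ploidy):
    """Subtract the sample's ploidy from each gene's absolute copy number."""
    return {gene: cn - sample_ploidy for gene, cn in gene_copy_number.items()}

# Hypothetical samples: X is mostly diploid, Y is mostly tetraploid.
sample_x = {"geneA": 2, "geneB": 4}
sample_y = {"geneA": 4, "geneB": 8}

rel_x = relative_copy_number(sample_x, sample_ploidy=2.0)
rel_y = relative_copy_number(sample_y, sample_ploidy=4.0)

# geneA sits at the baseline state in both samples, so its relative
# copy number is 0 in each, although the absolute values differ.
print(rel_x["geneA"], rel_y["geneA"])
# geneB is gained in both samples, by 2 copies in X and 4 copies in Y.
print(rel_x["geneB"], rel_y["geneB"])
```

Whether a difference or a ratio to ploidy is the better scale is left open here; the sketch uses the difference because that is what the comment proposes.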
That sounds to me like it needs a purity/ploidy-aware copy number caller such as PureCN to produce absolute integer copy numbers that could be correlated with gene expression values (those could simply be logCPMs, I think) across samples.
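A minimal sketch of that correlation for a single gene, assuming absolute copy numbers are already available from a caller. The counts, library sizes, and copy numbers are invented, and `log_cpm` is a simple pseudocount variant chosen for illustration, not necessarily the formula any particular package uses.

```python
import math

def log_cpm(count, library_size, pseudocount=0.5):
    """log2 counts per million with a small pseudocount (illustrative variant)."""
    return math.log2((count + pseudocount) / (library_size + 1.0) * 1e6)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for one gene across six samples.
absolute_cn = [2, 2, 3, 4, 4, 6]
gene_counts = [110, 95, 160, 210, 190, 330]
library_sizes = [1.0e7, 0.9e7, 1.1e7, 1.0e7, 0.95e7, 1.05e7]

expression = [log_cpm(c, n) for c, n in zip(gene_counts, library_sizes)]
r = pearson(absolute_cn, expression)
print(f"Pearson r between copy number and logCPM: {r:.3f}")
```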
Yes, I use PURPLE (not yet published) for absolute gene copy number and sample ploidy estimation. I'm hesitant to use CPM because it is relative to all other genes in the sample and can change merely because other genes in the same sample change.
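The concern about CPM can be shown with a toy calculation. The three-gene samples and counts below are invented purely to illustrate the arithmetic.

```python
def cpm(counts):
    """Counts per million: each count divided by the sample total, times 1e6."""
    total = sum(counts.values())
    return {gene: count / total * 1_000_000 for gene, count in counts.items()}

# Same sample before and after only geneC changes (hypothetical counts).
before = {"geneA": 100, "geneB": 400, "geneC": 500}
after = {"geneA": 100, "geneB": 400, "geneC": 1500}

cpm_before = cpm(before)["geneA"]
cpm_after = cpm(after)["geneA"]

# geneA has 100 reads in both cases, yet its CPM halves because the
# sample total doubled from 1000 to 2000 reads.
print(cpm_before, cpm_after)
```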
@DarioS moving over from email to issue tracker:
If there are more than two copy number levels for a gene, CNVRanger does an ANOVA-like test, which is a test for differences in means. Is there a way to treat the copy number as a numeric measurement and test for linearity of response? I have copy number estimates from whole genome sequencing and a gene may have copy numbers such as two, four and nine in different samples and I want to know if increasing copy number is associated with increased abundance, rather than just different abundance. I am not interested in analysing distant copy numbers, but the copy number of the gene itself.
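To make the request concrete, here is a minimal sketch of a trend test that treats copy number as numeric: a least-squares slope with a one-sided permutation p-value. This is a swapped-in illustration, not how CNVRanger tests; the function names, the data, and the choice of a permutation test are all assumptions.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def trend_test(copy_number, expression, n_perm=10000, seed=1):
    """One-sided permutation test for a positive linear trend."""
    rng = random.Random(seed)
    observed = ols_slope(copy_number, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        # Shuffling expression breaks any link to copy number.
        rng.shuffle(shuffled)
        if ols_slope(copy_number, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value

# Made-up values for one gene across eight samples.
copy_number = [2, 2, 2, 4, 4, 4, 9, 9]
expression = [5.1, 4.8, 5.3, 6.0, 6.4, 5.9, 8.2, 7.9]

slope, p_value = trend_test(copy_number, expression)
print(f"slope per extra copy: {slope:.3f}, permutation p-value: {p_value:.4f}")
```

Unlike a test for differences in means, the slope has a direction, so a gene whose expression is merely different at copy number four but not higher at nine would not score well here.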