Background
There are eQTL scan results for some of the genes and variants in the TOPMed set. Starting with the Whole Blood tissue sample, there are SuSiE fine-mapping data (488k variants) and conditional regression data (108k variants), which yield ~50k variants when collated.
Cis-eQTL Scan Data
SuSiE data columns:
Column descriptions:
phenotype_id: gene identifier
variant_id: genetic variant, in format {chr}_{pos}_{ref}_{alt}
pip: SuSiE PIP (essentially, the probability the variant is a causal one for this eQTL signal)
af: frequency of the alt allele
cs_id: Credible set ID. cs_id + phenotype_id together uniquely identify a credible set. A credible set containing more than one genetic variant will span more than one line.
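As a rough sketch of how the SuSiE columns above might be consumed, the following Python parses a variant_id of the form {chr}_{pos}_{ref}_{alt} and groups rows into credible sets keyed by (phenotype_id, cs_id). The function names, the row-as-dict layout, and the example rows are assumptions made for illustration; they are not taken from the real data or from any existing code in this project.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split a {chr}_{pos}_{ref}_{alt} identifier into its parts.

    rsplit keeps any underscores inside the chromosome name intact.
    Assumes there is no trailing suffix after the alt allele.
    """
    chrom, pos, ref, alt = variant_id.rsplit("_", 3)
    return Variant(chrom, int(pos), ref, alt)


def group_credible_sets(rows):
    """Group SuSiE rows by (phenotype_id, cs_id), the pair that identifies a credible set."""
    sets = defaultdict(list)
    for row in rows:
        sets[(row["phenotype_id"], row["cs_id"])].append(row)
    return dict(sets)


# Made-up rows for illustration only.
example_rows = [
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1000_A_G", "pip": 0.6, "af": 0.12, "cs_id": 1},
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1050_C_T", "pip": 0.4, "af": 0.11, "cs_id": 1},
    {"phenotype_id": "GENE_B", "variant_id": "chr1_2000_G_GA", "pip": 0.99, "af": 0.30, "cs_id": 1},
]
credible_sets = group_credible_sets(example_rows)
```

Keying on the pair rather than on cs_id alone matters because cs_id values repeat across genes, as the GENE_A and GENE_B rows in the example show.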
Conditional regression data columns:
Significant independent eQTL signals for each gene (generated using forward-backward linear regression)
Column descriptions:
phenotype_id: gene identifier
num_var: number of genetic variants tested for association with this gene's expression
beta_shape1: First beta distribution parameter used when computing beta-approximated p-value (see FastQTL publication [1]). When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
beta_shape2: Second beta distribution parameter used when computing beta-approximated p-value (see FastQTL publication [1]). When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
true_df: estimated true degrees of freedom (used when computing beta-approximated p-value; see FastQTL publication [1]). When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
pval_true_df: p-value calculated using true_df (used when computing beta-approximated p-value; see FastQTL publication [1]). When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
variant_id: genetic variant, in format {variant_chromosome}_{variant_position}_{variant_ref_allele}_{variant_alt_allele}
tss_distance: (signed) distance between the gene TSS and the genetic variant
ma_samples: number of samples having the minor allele
ma_count: minor allele count
af: frequency of the alt allele
pval_nominal: nominal p-value for association between the gene expression and genetic variant allele dosage. Note that due to underflow, some p-values may be equal to 0. When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
slope: linear regression estimated slope for the allele dosage term when modeling association between gene expression and genetic variant. The effect allele is always the alt allele (which can be inferred from the variant_id as described above), such that in the case of a significant association between gene expression and genetic variant, slope greater than 0 indicates that the alt allele favors higher expression of the gene. When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
slope_se: standard error of the estimated slope
pval_perm: empirical p-value for association between the gene expression and genetic variant, adjusted for multiple testing at the gene level (i.e. testing many variants against this one gene; NOT genome-wide corrected) using permutations (see FastQTL publication [1]). When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
pval_beta: p-value for association between the gene expression and genetic variant, adjusted for multiple testing at the gene level (i.e. testing many variants against this one gene; NOT genome-wide corrected) using the fitted beta distribution (see FastQTL publication [1]). Note that due to underflow, some p-values may be equal to 0. When there are multiple independent eQTL signals for a gene, this is computed during the backward step, i.e. controlling for each of the gene's other independent eQTL signals.
rank: rank of the association, based on the order in which the association was discovered during the forward step
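The collation of the two data sets could look something like the sketch below. It assumes that "collated" means an inner join on (phenotype_id, variant_id); the issue does not state the join key, so treat that as a guess. The function name and example rows are hypothetical.

```python
def collate(susie_rows, conditional_rows):
    """Inner-join SuSiE and conditional regression rows on (phenotype_id, variant_id).

    The join key is an assumption; the source only says the data are "collated".
    """
    conditional_by_key = {
        (row["phenotype_id"], row["variant_id"]): row for row in conditional_rows
    }
    merged = []
    for susie_row in susie_rows:
        key = (susie_row["phenotype_id"], susie_row["variant_id"])
        conditional_row = conditional_by_key.get(key)
        if conditional_row is None:
            continue  # variant has fine-mapping data but no conditional signal
        # `af` appears in both files; the SuSiE value wins on a name clash.
        merged.append({**conditional_row, **susie_row})
    return merged


# Made-up rows for illustration only.
susie_example = [
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1000_A_G", "pip": 0.6, "af": 0.12, "cs_id": 1},
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1050_C_T", "pip": 0.4, "af": 0.11, "cs_id": 1},
]
conditional_example = [
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1000_A_G", "slope": 0.8, "tss_distance": -250, "rank": 1, "af": 0.12},
    {"phenotype_id": "GENE_B", "variant_id": "chr1_2000_G_GA", "slope": -0.3, "tss_distance": 1200, "rank": 1, "af": 0.30},
]
collated = collate(susie_example, conditional_example)
```

If the UI should also show variants that appear in only one of the two files, an outer join would be the alternative, at the cost of rows with missing columns.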
Design Directions
Questions
What questions should the UI answer?
Does this gene have any eQTLs?
How many eQTLs does this gene have?
How many eQTLs are in this region?
Which genes in this region have eQTL data?
What is the magnitude of the effect of an eQTL on a gene?
How far is the eQTL from the gene whose expression it affects?
What tissue samples have eQTLs in this gene or region?
What tissue sample is an eQTL from?
Which gene in this region has the most eQTLs?
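Several of the questions above reduce to simple aggregations over the conditional regression rows. The sketch below is one possible reading, assuming each row is one independent eQTL signal and that "in this region" refers to the variant position rather than the gene position. All function names and example rows are hypothetical.

```python
from collections import Counter


def variant_position(variant_id):
    """Return (chromosome, position) from a {chr}_{pos}_{ref}_{alt} identifier."""
    chrom, pos, _ref, _alt = variant_id.rsplit("_", 3)
    return chrom, int(pos)


def in_region(row, chrom, start, end):
    """True if the row's variant lies within chrom:start-end (inclusive)."""
    row_chrom, pos = variant_position(row["variant_id"])
    return row_chrom == chrom and start <= pos <= end


def eqtl_counts(rows):
    """Count rows per gene; each row is assumed to be one independent signal."""
    return Counter(row["phenotype_id"] for row in rows)


def summarize_region(rows, chrom, start, end):
    """Answer the region-level questions for one genomic interval."""
    hits = [row for row in rows if in_region(row, chrom, start, end)]
    counts = eqtl_counts(hits)
    top_gene = counts.most_common(1)[0][0] if counts else None
    return {"eqtl_count": len(hits), "genes": sorted(counts), "top_gene": top_gene}


# Made-up rows for illustration only.
question_rows = [
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1000_A_G", "slope": 0.8, "tss_distance": -250, "rank": 1},
    {"phenotype_id": "GENE_A", "variant_id": "chr1_1500_C_T", "slope": -0.2, "tss_distance": 250, "rank": 2},
    {"phenotype_id": "GENE_B", "variant_id": "chr1_2000_G_GA", "slope": 0.3, "tss_distance": 1200, "rank": 1},
    {"phenotype_id": "GENE_C", "variant_id": "chr2_500_T_C", "slope": 0.1, "tss_distance": -40, "rank": 1},
]
region_summary = summarize_region(question_rows, "chr1", 900, 2100)
```

The effect-size and distance questions need no aggregation: they map directly onto the slope and tss_distance columns of a single row.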
Table
Reusing the tabulator-tables dependency that currently displays the table of variants in a region or gene, design a new table or adapt the variant table for eQTLs.
Separate table?
Alternate table swapped for variant table?
Additional columns to variant table?
Figure
Visual representation of the genomic region of the gene(s) and associated eQTLs.
Completion Criteria