Ensure ClinVar is filtering out significance terms #1075

Open
AlistairNWard opened this issue Jun 3, 2024 · 5 comments
@AlistairNWard
Member

If the ClinVar significance contains a term such as Pathogenic,drug_response, does gene.iobio remove the non-significance terms (drug_response in this case) so that the displayed term is just Pathogenic? If not, the variant will not be flagged even though it is a pathogenic variant.

@tonydisera
Collaborator

In cases where more than one ClinVar significance is designated on a variant, gene.iobio will parse the multiple terms, and apply the filter criteria to the most pathogenic term. So, in your example, the variant will pass the filter ClinVar = 'pathogenic' because that term ranks higher than 'drug response'.

Another common scenario is a ClinVar variant with 'Benign/Likely Benign' dual designation. Here, 'Likely Benign' ranks higher, so it will be evaluated when the filter is applied. So this variant will pass a custom filter ClinVar = 'Likely Benign'. However, this variant will NOT pass the filter ClinVar = 'Benign'.

It could be argued that the filter logic should evaluate each term, passing the variant if ANY term matches. For example, as @AlistairNWard points out, 'drug response' is often coupled with another term. In these cases, a filter set to ClinVar = 'drug response' will miss those variants with dual designations of 'drug response' + 'pathogenic' (or 'likely pathogenic'), because the current logic only evaluates the higher-ranked term.
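The rank-based behaviour described in this comment can be sketched roughly as follows. This is only an illustration, not gene.iobio's actual code: the `RANK` table, the function names, and the set of separators are assumptions. The thread only confirms that 'pathogenic' outranks 'drug response' and that 'Likely Benign' outranks 'Benign'.

```python
import re

# Hypothetical rank table (lower number = evaluated first). Only the relative
# order of pathogenic vs. drug response and likely benign vs. benign is taken
# from the thread; the rest is assumed for illustration.
RANK = {
    "pathogenic": 0,
    "likely pathogenic": 1,
    "uncertain significance": 2,
    "likely benign": 3,
    "benign": 4,
    "drug response": 5,
}


def normalize(term):
    """Lower-case a term and treat underscores as spaces."""
    return term.replace("_", " ").strip().lower()


def top_ranked_term(clinsig):
    """Return the highest-ranked term of a multi-term designation, or None."""
    # Assumed separators between terms: "/", "," and "|".
    terms = [normalize(t) for t in re.split(r"[/,|]", clinsig) if t.strip(" _")]
    if not terms:
        return None
    # Terms missing from the table are ranked last.
    return min(terms, key=lambda t: RANK.get(t, len(RANK)))


def passes_ranked_filter(clinsig, wanted):
    """Apply the filter to the single highest-ranked term only."""
    return top_ranked_term(clinsig) == normalize(wanted)
```

Under these assumptions, `passes_ranked_filter("Benign/Likely_benign", "Benign")` is false, which matches the 'Benign/Likely Benign' scenario described above.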

@tonydisera added this to the 4.12 milestone Jun 17, 2024
@AlistairNWard
Member Author

We shouldn't be looking at any of the other terms (drug_response, risk allele, etc.) unless we explicitly want them in a different spot. These don't count as significance; the only significance terms are:

Pathogenic
Likely_pathogenic
Uncertain significance
Conflicting interpretations of pathogenicity
Likely_benign
Benign

I think Pathogenic/Likely_pathogenic is also an allowed term and doesn't need to be broken up.
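One way to picture this proposal is a small helper that keeps only the significance terms listed above and drops everything else. This is a sketch under assumptions: the function name `significance_only` is hypothetical, and it assumes non-significance terms are attached with "," or "|", while "/" joins significance terms that should stay together.

```python
import re

# The six significance terms listed in this comment, normalized to lower case
# with spaces instead of underscores.
SIGNIFICANCE_TERMS = {
    "pathogenic",
    "likely pathogenic",
    "uncertain significance",
    "conflicting interpretations of pathogenicity",
    "likely benign",
    "benign",
}


def _norm(term):
    return term.replace("_", " ").strip().lower()


def significance_only(clinsig):
    """Return the significance parts of a ClinVar value, dropping other terms.

    Assumption: "," and "|" attach extra terms such as drug_response, while
    "/" joins significance terms (e.g. Pathogenic/Likely_pathogenic), which
    are kept whole rather than broken up.
    """
    kept = []
    for part in re.split(r"[,|]", clinsig):
        pieces = [_norm(p) for p in part.split("/")]
        if all(p in SIGNIFICANCE_TERMS for p in pieces):
            kept.append(part.strip(" _"))
    return kept
```

With this sketch, `significance_only("Pathogenic,drug_response")` keeps only `Pathogenic`, and a value consisting solely of `drug_response` yields an empty list.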

@tonydisera modified the milestones: 5.1, 4.12 Aug 13, 2024
@tonydisera
Collaborator

tonydisera commented Aug 14, 2024

Thank you, @AlistairNWard, for identifying the clinical significance terms that we should be filtering on. Right now, the terms we can filter on are:
Pathogenic
Likely pathogenic
Uncertain significance
Conflicting data
Other
Benign
Likely benign

  1. We will remove 'Other' from the dropdown on the Filter dialog.
  2. We will rename the term 'Conflicting data' to 'Conflicting classifications of pathogenicity' in the dropdown.
     • Although not explicitly stated in the ClinVar documentation, it looks like the term 'conflicting data from submitters' has the same meaning as 'Conflicting interpretations of pathogenicity'. In the latest VCF, the term used is conflicting_classifications_of_pathogenicity.
  3. We will simplify the filter logic by removing the term's rank from consideration. Instead, we will split multi-term designations into separate terms. If any of the separate terms matches the filter criteria, the variant will pass the filter. For example, if the filter is set to ClinVar = 'Benign', a variant with the dual ClinVar designation of 'Benign/Likely benign' will pass the filter. Likewise, if the filter is set to ClinVar = 'Uncertain significance', a variant with the dual designation of 'Uncertain significance/Conflicting data' will pass the filter.
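The any-term matching proposed in the last item could look something like the sketch below. The function names and the separator set ("/", "," and "|") are assumptions made for illustration; this is not the gene.iobio implementation.

```python
import re


def split_designation(clinsig):
    """Split a multi-term designation into normalized, separate terms."""
    # Assumed separators: "/", "," and "|"; underscores are treated as spaces.
    return [
        t.replace("_", " ").strip().lower()
        for t in re.split(r"[/,|]", clinsig)
        if t.strip(" _")
    ]


def passes_any_term(clinsig, wanted):
    """Pass the variant if ANY of its separate terms matches the filter."""
    target = wanted.replace("_", " ").strip().lower()
    return target in split_designation(clinsig)
```

Here `passes_any_term("Benign/Likely benign", "Benign")` and `passes_any_term("Uncertain significance/Conflicting data", "Uncertain significance")` both pass, matching the two examples in the list above. Rank no longer plays any role.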

Here is an example of multiple CLNSIG terms for an RAI1 variant (https://www.ncbi.nlm.nih.gov/clinvar/variation/1560497/):

The VCF INFO fields look like this:

ALLELEID=1575988;CLNDISDB=.|MedGen:C3661900;CLNDN=RAI1-related_disorder|not_provided;CLNHGVS=NC_000017.11:g.17793021T>A;CLNREVSTAT=criteria_provided,_conflicting_classifications;CLNSIG=Conflicting_classifications_of_pathogenicity;CLNSIGCONF=Uncertain_significance(1)|Likely_benign(1);CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=RAI1:10743;MC=SO:0001583|missense_variant;ORIGIN=1;RS=891764320

IMPORTANT! This may break our current code. Notice that CLNSIGCONF has the terms that we filter on, not CLNSIG.

It looks like this new INFO field, CLNSIGCONF, is only used for conflicting classifications of pathogenicity. Here are the INFO fields for a ClinVar variant with a single CLNSIG designation (https://www.ncbi.nlm.nih.gov/clinvar/variation/1377204/):

ALLELEID=1394819;CLNDISDB=MedGen:C3661900;CLNDN=not_provided;CLNHGVS=NC_000017.11:g.17792953A>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Likely_benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=RAI1:10743;MC=SO:0001583|missense_variant;ORIGIN=1;RS=2143001428
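To make the CLNSIG versus CLNSIGCONF distinction concrete, here is a minimal sketch that reads the terms to filter on from an INFO string. It assumes, based only on the two records shown above, that CLNSIGCONF holds "|"-separated terms each followed by a submission count in parentheses, and that it should be consulted only when CLNSIG starts with "Conflicting". The function names are hypothetical.

```python
import re


def parse_info(info):
    """Parse a VCF INFO string of the form KEY=VALUE;KEY=VALUE into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-style keys without "=" get an empty value
    return fields


def filterable_terms(info):
    """Return the significance terms a filter would need to look at.

    Assumption: when CLNSIG reports a conflict, the individual terms live in
    CLNSIGCONF, e.g. "Uncertain_significance(1)|Likely_benign(1)".
    """
    fields = parse_info(info)
    clnsig = fields.get("CLNSIG", "")
    clnsigconf = fields.get("CLNSIGCONF", "")
    if clnsig.lower().startswith("conflicting") and clnsigconf:
        # Drop the trailing "(n)" submission count from each term.
        return [re.sub(r"\(\d+\)$", "", t) for t in clnsigconf.split("|")]
    return [clnsig] if clnsig else []
```

Applied to the two INFO strings above, this sketch returns `Uncertain_significance` and `Likely_benign` for the RAI1 record with the conflicting classification, and just `Likely_benign` for the single-designation record.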

@tonydisera
Collaborator

tonydisera commented Aug 14, 2024

It looks like the CLNSIG term will be replaced with three separate classifications (https://github.com/ncbi/clinvar/blob/master/ClassificationOnClinVar.md). I don't see these separate classifications in the latest ClinVar VCF. I will create a separate issue (#1093).

@AlistairNWard
Member Author

@tonydisera, 'conflicting data from submitters' is a term specifically used when a consortium makes a single submission to ClinVar, but the consortium has conflicting interpretations. 'Conflicting interpretations of pathogenicity' is used when different submitters submit the same variant but have specific conflicts. If one lab thinks a variant is Pathogenic and another Likely Pathogenic, the record will appear as Pathogenic/Likely_pathogenic, but if one lab has any type of pathogenic term and another lab has benign or uncertain, the variant will be listed as conflicting interpretations of pathogenicity.

This is specifically information about RCV and VCV, which are accession IDs for variants with conflicting submissions, and I don't know that this has any effect on the VCF files.
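The aggregation rule described in this comment can be illustrated with a short sketch. It models only the two cases spelled out above and returns None for every other combination; the function name and term spellings are assumptions, and ClinVar's real aggregation logic may differ.

```python
PATHOGENIC_GROUP = ["Pathogenic", "Likely_pathogenic"]
BENIGN_OR_UNCERTAIN = {"Benign", "Likely_benign", "Uncertain_significance"}


def aggregate(submissions):
    """Combine per-lab classifications as described in the comment above.

    Returns None for combinations the comment does not cover.
    """
    distinct = set(submissions)
    if not distinct:
        return None
    if len(distinct) == 1:
        return submissions[0]
    if distinct <= set(PATHOGENIC_GROUP):
        # Labs agree on the direction: report both terms joined with "/".
        return "/".join(t for t in PATHOGENIC_GROUP if t in distinct)
    if distinct & set(PATHOGENIC_GROUP) and distinct & BENIGN_OR_UNCERTAIN:
        # A pathogenic-type term against benign or uncertain is a conflict.
        return "Conflicting_interpretations_of_pathogenicity"
    return None
```

For example, `aggregate(["Pathogenic", "Likely_pathogenic"])` gives `Pathogenic/Likely_pathogenic`, while `aggregate(["Likely_pathogenic", "Uncertain_significance"])` gives the conflicting term.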

@tonydisera self-assigned this Aug 27, 2024
@tonydisera modified the milestones: 4.11.2, 4.11.3 Aug 28, 2024
@tonydisera modified the milestones: 4.11.3, 4.11.4 Oct 23, 2024
2 participants