Manual curation effort workflow #25

Open
6 tasks
berntpopp opened this issue Nov 22, 2023 · 0 comments
berntpopp commented Nov 22, 2023

Workflow:

  • if the respective gene/entity from our kidney list is in the ClinGen curated list, review the entry and then apply its group and points
  • if the respective gene/entity from our kidney list is NOT in the ClinGen list:
  1. review the GenCC and OMIM entries
  2. decide whether to split or lump, following the modified ClinGen criteria where applicable
  3. decide on a MONDO term (when lumping, choose only 1 "new" MONDO term and report the former ones; when splitting, choose >1 new term per gene, each as a separate disease entity)

--> use MONDO (not OMIM or Orphanet) as the primary disease ontology in the manual curation effort
--> shorten the process if actionable: genes associated with a single published disease entity should only be curated for that condition (i.e. lumped), unless there are indications to split specific phenotypic features of a syndrome or a variable phenotype into separate curation(s)
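
The branching in the workflow above could be sketched roughly as follows. This is only an illustration: the function name `curation_route`, the `clingen_genes` set, and the placeholder gene symbols are hypothetical and not part of the curation sheet.

```python
def curation_route(gene: str, clingen_genes: set[str]) -> list[str]:
    """Return the manual steps for one gene/entity from the kidney list.

    `clingen_genes` is a hypothetical set of gene symbols that already
    have a ClinGen curation.
    """
    if gene in clingen_genes:
        # ClinGen branch: reuse the existing curation after review.
        return [
            "review the ClinGen entry",
            "apply the ClinGen group and points",
        ]
    # Non-ClinGen branch: full manual curation.
    return [
        "review the GenCC and OMIM entries",
        "decide whether to split or lump (modified ClinGen criteria)",
        "decide on the MONDO term(s)",
    ]


# Placeholder symbols, not real curation data.
clingen = {"GENE_A"}
print(curation_route("GENE_A", clingen))
print(curation_route("GENE_B", clingen))
```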

Scoring logic:

  • if there is only a screening publication, the category can't be more than Limited,
  • if there is a clinical description, the category can be Moderate,
  • if there is a clinical replication, the category can be Definitive
  1. DEFINITIVE: 12-18 points; Replication over time!?
  2. MODERATE: 7-11.99 points
  3. LIMITED: 0.1-6.99 points
  4. NO KNOWN RELATION: 0 points; contradictory evidence
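
A minimal sketch of how the point ranges and the publication-type caps above might be combined. Several things here are assumptions rather than decided logic: the names `Category`, `EVIDENCE_CAP`, `category_from_points` and `classify` are made up for illustration, the ranges are treated as half-open intervals so that values such as 6.995 still fall into a category, any score above 0 counts as at least Limited, and contradictory evidence is not modelled.

```python
from enum import IntEnum


class Category(IntEnum):
    NO_KNOWN_RELATION = 0
    LIMITED = 1
    MODERATE = 2
    DEFINITIVE = 3


# Highest category each type of publication evidence allows (from the bullets above).
EVIDENCE_CAP = {
    "screening": Category.LIMITED,
    "clinical_description": Category.MODERATE,
    "clinical_replication": Category.DEFINITIVE,
}


def category_from_points(points: float) -> Category:
    """Map a point total to a category using the cutoffs listed above."""
    if points < 0:
        raise ValueError("points must not be negative")
    if points >= 12:
        return Category.DEFINITIVE
    if points >= 7:
        return Category.MODERATE
    if points > 0:  # assumption: anything above 0 is at least Limited
        return Category.LIMITED
    return Category.NO_KNOWN_RELATION


def classify(points: float, best_evidence: str) -> Category:
    """Cap the point-based category by the strongest publication evidence."""
    return min(category_from_points(points), EVIDENCE_CAP[best_evidence])


print(classify(13, "clinical_replication").name)
print(classify(13, "screening").name)
```

Taking the minimum of the two categories keeps the cap explicit: a high point total cannot lift a gene above what its publication evidence supports.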

--> access the curation strategy sheet here: https://docs.google.com/spreadsheets/d/1KS9G2YR9U6uheu0zC-7zvMaSWCZv7UeVYh69BrkQK8M/edit#gid=0

TODO:

  • define the scoring logic for the final table with cutoffs for the different categories
  • scoring logic for publications (screening = 1 point, first clinical description = 2 points, clinical replication = 3 points)
  • scoring logic: we further use the category "no known relation" for genes whose association with kidney disease is not trustworthy
  • Write documentation for the initial manual curation effort
  • Set up and finalize the curation sheet with automated inclusion of high-evidence genes
  • Consider an automated display of a mouse-model score (2 points for a model, up to 4 points) from MGI mouse data if the mode of inheritance (MOI) in MGI matches the MOI of the gene-disease entity
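
The publication and mouse-model points from the task list could be computed along these lines. This is a sketch under assumptions that are still open in the tasks above: that publication points are summed per publication, that each MOI-matching mouse model contributes 2 points up to the cap of 4, and that MOI values are compared as plain strings. The function names and the input format are hypothetical; no real MGI query is made.

```python
PUBLICATION_POINTS = {
    "screening": 1,
    "first_clinical_description": 2,
    "clinical_replication": 3,
}

MOUSE_POINTS_PER_MODEL = 2
MOUSE_POINTS_MAX = 4


def publication_score(publication_types: list[str]) -> int:
    """Sum the points of all publications (assumption: points are additive)."""
    return sum(PUBLICATION_POINTS[kind] for kind in publication_types)


def mouse_model_score(model_mois: list[str], entity_moi: str) -> int:
    """Score mouse models whose MOI matches the gene-disease entity's MOI."""
    matching = sum(1 for moi in model_mois if moi == entity_moi)
    return min(MOUSE_POINTS_PER_MODEL * matching, MOUSE_POINTS_MAX)


# Made-up inputs for illustration only.
print(publication_score(["screening", "first_clinical_description"]))
print(mouse_model_score(["AR", "AD", "AR", "AR"], "AR"))
```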