Switch to reference genome set for AmiGO, all APIs, and data products #48

kltm · 2022-10-28T23:16:05Z

Project link

https://github.com/orgs/geneontology/projects/117

Project description

A potential continuation of #82.

We'd like to:

make all species available in AmiGO and the API; this would require changes in the pipeline, mostly around scaling
add QC (currently ontobio) and GPAD output for these products
do full species sorting for all data products; this would need some adjustments to downloads and an announcement

For examining scope, also see:
geneontology/pipeline#246
geneontology/pipeline#204

PI

Chris

Project owner (PO)

Pascale

Technical lead (TL)

Seth

Other personnel (OP)

Seth, Sierra, Dustin, Patrick

Technical specs

https://docs.google.com/document/d/1CU4Zp7t-wTRlcsXlMndIBr2XNUjeBxINdv-O5XPUERQ/edit

Other comments

N/A

kltm · 2022-12-06T21:23:20Z

@pgaudet Added new project; we'd also likely be using myself, Dustin, and/or Sierra.

kltm · 2022-12-06T21:23:56Z

Before digging into what we want to do, we should decide the final scope.

pgaudet · 2023-03-02T16:12:03Z

Updates & discussion on the 2023-03-01 Managers call:

Current load time ~ 2 days
Seth proposes that this becomes an active project
may need to get a new machine
advantage is that this would be consistent with Panther & allow species-specific downloads
Patrick can also work on this
Announce far enough in advance so that users don't have to maintain old and new files
However: pipeline cleanup and API projects should be wrapped up before getting this started

kltm · 2023-03-11T00:44:55Z

kltm · 2023-08-30T00:19:23Z

@pgaudet Talking with @cmungall: maybe, to make better progress with smaller incremental steps, we could unbundle the initial task of making the "142" available as GAF 2.2s.

cmungall · 2023-08-30T00:32:15Z

Proposed new scope for this project:

the end-goal is to have the GAF download page (http://current.geneontology.org/products/pages/downloads.html) have the 142 reference species, broken down by species, available as GAFs
the non-core GAFs will not go through the same QC process pipeline as the core (MOD + goa human/cow/etc)
the implementation will be simply to take the filtered file provided by Alex (gcrp entries for 142 species; ftp://ftp.ebi.ac.uk/pub/contrib/goa/go_reference_species.gaf.gz) and bin these into separate files, one per species
a single static HTML file will be produced
The UI implementation will abandon the paginated table and be a simple table with 142 rows on one page
The table should be sortable?
By default the sort order should prioritize the main curated species
the columns will be
- species
- count
- link to download
- potentially: primary source (ebigoa, mgi, ...)
there will not be separate rna/isoform files; these will be merged in. One file per species
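
The binning step proposed above could be sketched roughly as follows. This is a minimal illustration, not the pipeline's implementation: the function name `bin_gaf_by_taxon` and the `taxon_<id>.gaf` output naming are hypothetical, and it assumes the input follows the GAF layout in which column 13 holds the taxon (for example `taxon:9606`, optionally followed by `|` and a second taxon). Because every row for a taxon goes to the same file, RNA and isoform rows would end up merged into the one file per species.

```python
from collections import Counter
from pathlib import Path


def bin_gaf_by_taxon(lines, out_dir):
    """Split GAF lines into one file per taxon; return annotation counts per taxon ID.

    Hypothetical helper: output files are named taxon_<id>.gaf (an assumption).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    header = []          # "!" comment lines seen so far, copied into each new output file
    handles = {}         # taxon ID -> open file handle
    counts = Counter()   # taxon ID -> number of annotation rows
    try:
        for raw in lines:
            line = raw.rstrip("\n")
            if line.startswith("!"):
                header.append(line)
                continue
            cols = line.split("\t")
            if len(cols) < 13:
                continue  # skip blank or malformed rows
            # Column 13 (index 12) is the taxon; the first ID is the gene product's species.
            taxon = cols[12].split("|")[0].replace("taxon:", "")
            if taxon not in handles:
                fh = open(out_dir / f"taxon_{taxon}.gaf", "w", encoding="utf-8")
                for h in header:
                    fh.write(h + "\n")
                handles[taxon] = fh
            handles[taxon].write(line + "\n")
            counts[taxon] += 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts
```

A local copy of the filtered file could be opened with `gzip.open(path, "rt")` and passed in as `lines`; the returned counts would then feed the "count" column of the table.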

kltm · 2023-08-30T00:37:37Z

@pgaudet, More talk w/Chris, assuming we're all on the same page about this, I'd probably try and push through on this myself (spinning it out as a separate project first).

pgaudet · 2023-08-30T09:10:36Z

Sounds good to me.

One suggestion (if doable) would be to have two tables, one for the MODs and one for all others, since 142 (in fact it's now 143) species is a long list to scroll through.

Thanks, Pascale
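
For illustration, the two-table layout suggested here could be rendered as a single static HTML page along these lines. This is only a sketch under stated assumptions: the row dictionaries (`taxon`, `species`, `count`, `href`), the function names, and the `MOD_TAXA` set are all hypothetical, and the three taxon IDs shown (human, mouse, fruit fly) are placeholders rather than the project's actual list of main curated species.

```python
from html import escape

# Illustrative only: NCBI taxon IDs standing in for the MOD / main curated species.
MOD_TAXA = {"9606", "10090", "7227"}


def split_rows(rows, mod_taxa=MOD_TAXA):
    """Return (mod_rows, other_rows), each sorted by species name."""
    by_name = sorted(rows, key=lambda r: r["species"].lower())
    mods = [r for r in by_name if r["taxon"] in mod_taxa]
    others = [r for r in by_name if r["taxon"] not in mod_taxa]
    return mods, others


def render_table(caption, rows):
    """Render one plain HTML table with species, count, and download link columns."""
    body = "\n".join(
        '<tr><td>{}</td><td>{}</td><td><a href="{}">GAF</a></td></tr>'.format(
            escape(r["species"]), r["count"], escape(r["href"], quote=True)
        )
        for r in rows
    )
    return (
        f"<table>\n<caption>{escape(caption)}</caption>\n"
        "<tr><th>Species</th><th>Count</th><th>Download</th></tr>\n"
        f"{body}\n</table>"
    )


def render_page(rows):
    """Render a static page: main curated species first, then all other species."""
    mods, others = split_rows(rows)
    return "<!DOCTYPE html>\n<html><body>\n{}\n{}\n</body></html>\n".format(
        render_table("Main curated species", mods),
        render_table("Other reference species", others),
    )
```

Placing the curated species in their own table also covers the default ordering asked for in the proposed scope without needing any client-side sorting.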

kltm · 2023-08-30T21:57:02Z

@pgaudet @cmungall Re-scoping this as the "strong" version; the narrow version discussed above is now #82

kltm · 2023-08-30T23:52:05Z

Moving this back to "priority", as #82 will meet some initial desirable targets and unsure if the rest is on the table immediately.

kltm added the labels Needs LA approval, Needs PI, Needs PM approval, Needs PO, Needs tech doc, and Needs TL Oct 28, 2022

kltm assigned pgaudet Oct 28, 2022

kltm added this to Project Metadata Overview Oct 28, 2022

kltm removed this from Project Metadata Overview Oct 28, 2022

kltm added this to Project Metadata Overview Nov 17, 2022

kltm moved this to Hopper in Project Metadata Overview Nov 17, 2022

kltm moved this from Hopper to Priority (project triage) in Project Metadata Overview Dec 6, 2022

pgaudet moved this from Priority (project triage) to Creation (initial requirements document) in Project Metadata Overview Mar 1, 2023

kltm changed the title ~~Change available download data (GAFs/GPADs) to reflect species rather than resource~~ Switch to reference genome set (was change available download data (GAFs/GPADs) to reflect species rather than resource) Mar 2, 2023

pgaudet mentioned this issue Jul 28, 2023

QC: IEA mapping updates geneontology/go-annotation#2531

Closed

kltm mentioned this issue Aug 30, 2023

Add reference genome set to GAF downloads #82

Open

kltm changed the title ~~Switch to reference genome set (was change available download data (GAFs/GPADs) to reflect species rather than resource)~~ Switch to reference genome set for AmiGO, all APIs, and data products Aug 30, 2023

kltm moved this from Creation (initial requirements document) to Priority (project triage) in Project Metadata Overview Aug 30, 2023
