generalize physical entity classification #110
I believe this ticket is basically "if a Reactome ID can be easily mapped to a MOD ID (or UniProtKB in the human case), then use it rather than creating a REACTO class". This is essentially what #326 is for in the human case. For other non-Reactome sources like YeastPathways, there are extra steps (like xref BioPAX ID, e.g., The "generalize" part of the title may allude to centralizing all of this ID conversion code in one place, which I think the methods In short, I think this ticket (if left open) should just cover the larger generalization refactor work, and other more specific requirements (e.g., "FlyBase IDs should be converted like this") should be addressed in separate tickets (e.g., #326).
@dustine32 For some reason I thought we had looked at the Reactome BioPAX and that, for things like the mouse projections, the MGI identifiers were included there. So it might be easier?
@ukemi Whether or not fetching MGI IDs is easy, I just think the work to do it should be tracked in a ticket more specific than this one, since I assume (I know this means I'm wrong!) that the method/code may be different for other organisms or sources. Right now, this #110 ticket appears to me un-closable until we account for all the different cases. What do you think? Actually, checking now, I don't see any MGI IDs in the
Bummer. So maybe the best thing to do would be to use the UniProt identifiers and the mouse GPI file for mapping, since that is supposed to hold the 'official' cross-references. Perhaps this can be used for everyone that provides a GPI file. But yes, I agree about that ticket. In fact, since the GPI file is supposed to have annotatable objects in column 1, there is no need to search for some kind of string match. Whatever is in column 1 for the UniProt ID goes into the model. We should double check that this would work for fly, @sjm41 @rozaru, and worm, @vanaukenk.
These xrefs are not in the database itself (gk_central) but are created on the fly as part of our release process. I expect we do not want them in our data structure because then we would need to maintain them to keep them current with changes in MGI, etc. Possible workarounds may be something to discuss with Adam Wright (sorry - I don't know his GitHub name). Even without Adam, an agenda item for weeds today? |
I think it would be best to use the GPI, which is maintained by MGI, as it should be. |
But thinking more about this, I think my column 1 suggestion is too simplistic, because there are also UniProt xrefs on the PRO identifiers in column 1. We would want to use the MGI genes, I would think.
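A minimal sketch of the GPI-based mapping discussed above, in Python for illustration only (the converter itself is not written in Python). It assumes a tab-separated GPI file whose first column holds the object ID as a CURIE and whose tenth column holds pipe-separated xrefs; both column positions are assumptions and are exposed as parameters. The function name and the sample identifiers are made up. Rows whose column 1 ID does not start with the wanted prefix (for example PRO entries) are skipped, so only MGI genes end up in the map.

```python
def build_uniprot_to_gene_map(gpi_lines, id_col=0, xref_col=9, wanted_prefix="MGI:"):
    """Map UniProtKB xrefs to the column 1 gene IDs of a GPI file.

    Column positions are assumptions about the file layout, not taken
    from the thread; adjust them for the GPI version in use.
    """
    mapping = {}
    for line in gpi_lines:
        # Skip header/comment lines, which start with '!' in GPI files.
        if not line.strip() or line.startswith("!"):
            continue
        row = line.rstrip("\n").split("\t")
        if len(row) <= max(id_col, xref_col):
            continue
        obj_id = row[id_col]
        # Keep only gene entries from the wanted database (e.g. MGI),
        # ignoring other column 1 objects such as PRO identifiers.
        if not obj_id.startswith(wanted_prefix):
            continue
        for xref in row[xref_col].split("|"):
            if xref.startswith("UniProtKB:"):
                mapping.setdefault(xref, []).append(obj_id)
    return mapping


# Placeholder rows; the identifiers are invented for illustration.
sample_gpi = [
    "!gpi-version: 2.0",
    "\t".join(["MGI:MGI:0000001", "Abc1", "", "", "", "NCBITaxon:10090",
               "", "", "", "UniProtKB:X00001|ENSEMBL:placeholder", ""]),
    "\t".join(["PR:000000001", "Abc1 isoform", "", "", "", "NCBITaxon:10090",
               "", "", "", "UniProtKB:X00001", ""]),
]

uniprot_to_mgi = build_uniprot_to_gene_map(sample_gpi)
```

A UniProtKB accession can map to more than one gene, so the values are lists; how to resolve multiple hits would need to be decided separately.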
As of today, the conversion assumes that physical entities already have specific classes associated with them, generated automatically as a precondition of building the GO-CAMs. For Reactome, this is the REACTO ontology.
For input resources that a) use gene identifiers present in the neo.owl ontology (coming from the GO Central GPI file), b) do not use constructs like Sets, and c) do not rely heavily on complexes, it would be useful to have the converter make use of the neo IRIs for the physical entity classes.
If the time comes to make this work, look for the currently hard-coded uses of 'GoCAM.reacto_base_iri' in the main BioPax2GO class to get started.
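The choice described in conditions a) to c) could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the converter's actual code: both base IRIs are placeholders (the second stands in for whatever `GoCAM.reacto_base_iri` holds), the function name is hypothetical, and condition c) is approximated per entity by treating any complex as a fallback case.

```python
# Placeholder base IRIs; the real values are not given in this thread.
NEO_BASE_IRI = "http://example.org/neo#"
REACTO_BASE_IRI = "http://example.org/reacto#"  # stands in for GoCAM.reacto_base_iri


def physical_entity_class_iri(entity_id, gene_id, neo_gene_ids,
                              is_set=False, is_complex=False):
    """Pick a class IRI for a physical entity.

    Use the neo IRI when the entity maps to a gene ID known to neo and is
    neither a Set nor a complex; otherwise fall back to a source-specific
    class such as a REACTO one.
    """
    if gene_id is not None and gene_id in neo_gene_ids and not (is_set or is_complex):
        return NEO_BASE_IRI + gene_id.replace(":", "_")
    return REACTO_BASE_IRI + entity_id.replace(":", "_")


# Invented identifiers for illustration.
known = {"UniProtKB:X00001"}
simple = physical_entity_class_iri("R-HSA-1", "UniProtKB:X00001", known)
as_set = physical_entity_class_iri("R-HSA-2", "UniProtKB:X00001", known, is_set=True)
```

Keeping the fallback in one function is one way to replace scattered hard-coded uses of a single base IRI with a single decision point.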