Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ChemicalEntity as range for metabolite_identified slot #235

Closed
wants to merge 6 commits into from

Conversation

kheal
Copy link

@kheal kheal commented Aug 2, 2024

Description

This PR will:

  1. Remove ChemicalEntity as range for metabolite_identified slot.
  2. Add structured_pattern constraint and requirement on id slot for ChemicalEntity class and invalid test for bad typecode
  3. Add structured_pattern constraint on known_as slot to match its range of ChemicalEntity and adds description for known_as slot
  4. Remove id_prefixes from ChemicalEntity class - @sierra-moxon or @turbomam please confirm this is appropriate as this is my first exposure to that slot.

Reviewers

to review: @mslarae13, @turbomam, @sierra-moxon

to tag for schema freeze purposes: @aclum, @mbthornton-lbl, @corilo, @SamuelPurvine, @naglepuff, @sujaypatil96, @brynnz22, @eecavanna

PR Information

What type of PR is this? (check all applicable)

  • Documentation
  • Schema change: Cleanup and preparation
    • updated the description of a slot
    • updated the range of a slot (made it wide open!)

Did you add/update any tests?

  • Yes

Could this schema change make it so any valid data becomes invalid?

  • No (no valid data in mongo for ChemicalEntity because there is no chemical_entity_set until the berkeley schema, therefore no way for there to be ChemicalEntitys in mongo)

Does this PR have any downstream implications?

  • No

@kheal kheal changed the title 2153 chemicalentity range Remove ChemicalEntity as range for metabolite_identified slot Aug 2, 2024
Copy link

github-actions bot commented Aug 2, 2024

PR Preview Action v1.4.7
🚀 Deployed preview to https://microbiomedata.github.io/berkeley-schema-fy24/pr-preview/pr-235/
on branch gh-pages at 2024-08-02 20:18 UTC

…aff) groups=20(staff),12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),701(com.apple.sharepoint.group.1),33(_appstore),100(_lpoperator),204(_developer),250(_analyticsusers),395(com.apple.access_ftp),398(com.apple.access_screensharing),101(com.apple.access_ssh-disabled),400(com.apple.access_remote_ae) slot for class
@kheal kheal marked this pull request as ready for review August 2, 2024 20:00
@kheal
Copy link
Author

kheal commented Aug 13, 2024

@sierra-moxon @turbomam @mslarae13 - welcome back from GSC. Please review when you are able.

id:
required: true
structured_pattern:
syntax: "{id_nmdc_prefix}:chem-{id_shoulder}-{id_blade}$"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may need to change to a ChEBI prefix pattern per our discussions with Chris

@@ -1592,6 +1593,7 @@ slots:
inlined_as_list: true

known_as:
description: An identifier of a ChemicalEntity.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful for me to have a more detailed description here; in my (albeit strange) brain, "known_as" typically refers to a "name" or "alias" vs. an identifier...this could be tangential to this PR though! happy to refine description that in a subsequent PR.

Copy link
Author

@kheal kheal Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I find the name of this slot unintuitive and frankly problematic for the reason you point out. It was introduced during the Berkeley refactoring and there are no data in mongo using this slot, so if you can help to think of a more appropriate slot name and description, that would be much appreciated! I'd be happy to roll that into this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only domain of the slot known_as is the class PortionOfSubstance. I'd advocate to make a better slot name and description before we use it (aka, now ;) ).

@kheal kheal marked this pull request as draft August 14, 2024 18:59
@kheal
Copy link
Author

kheal commented Aug 14, 2024

@sierra-moxon I'm going to move this back to draft until we settle on ID structure!

@kheal kheal self-assigned this Aug 14, 2024
@turbomam
Copy link
Member

Is the invalid example data file invalid for a single reason?

Please include a super brief explanation for the invalidity in the filename, like src/data/invalid/ChemicalEntity-invalid-typecode.yaml

@kheal kheal closed this Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants