Support for hierarchical biomolecular fields #396

Draft
wants to merge 9 commits into
base: develop


@JPBergsma (Contributor) commented Dec 29, 2021

This PR implements the second of the two options discussed in issue #389 for including biomolecular data in OPTIMADE. The first option is described in PR #395.

It introduces a group_type, which represents a kind of molecule or, more generally, a group made up of sites and other groups.
A protein is an example of a group of groups: the individual amino acids are the subgroups of the much larger protein strand.
Properties that are specific to a particular molecule or group of atoms, such as a molecular formula or a mass, can be stored here.
The groups property is introduced to describe instances of these group_types, i.e. a particular molecule or chemical group.
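
As a rough illustration of the hierarchy described above, here is a minimal Python sketch. Only `group_types`, `groups`, `name` and `molecular_formula` appear in the proposal; the keys `group_type`, `sites` and `subgroups`, the helper `sites_in_group`, and the site indices are hypothetical placeholders chosen for this example.

```python
# Hypothetical layout: only `group_types`, `groups`, `name` and
# `molecular_formula` are named in the proposal; every other key is an assumption.
structure = {
    "group_types": [
        {"name": "GLY", "molecular_formula": "C2H5NO2"},
        {"name": "ALA", "molecular_formula": "C3H7NO2"},
        {"name": "dipeptide"},
    ],
    "groups": [
        # Leaf groups list their sites directly (indices are arbitrary here).
        {"group_type": "GLY", "sites": [0, 1, 2, 3], "subgroups": []},
        {"group_type": "ALA", "sites": [4, 5, 6, 7, 8], "subgroups": []},
        # A group of groups refers to other entries of `groups` by index.
        {"group_type": "dipeptide", "sites": [], "subgroups": [0, 1]},
    ],
}


def sites_in_group(groups, index):
    """Collect all site indices of a group, descending into its subgroups."""
    group = groups[index]
    sites = list(group["sites"])
    for sub in group["subgroups"]:
        sites.extend(sites_in_group(groups, sub))
    return sites
```

In this sketch, asking for the sites of the dipeptide (index 2) walks down into both amino-acid subgroups.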

This gives a "top down" data structure: the groups know which atoms belong to them, but the atoms do not know which groups they belong to. I do not think this is a problem, as the client can easily reconstruct this information. If necessary, we could also add a property to each site that lists the groups the site belongs to.
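
One way a client might reconstruct the site-to-group mapping is sketched below. The `sites` and `subgroups` keys and the function `groups_per_site` are assumptions made for this example, not part of the proposal, and the sketch assumes that groups never contain themselves (no cycles).

```python
from collections import defaultdict

# Hypothetical top-down data: each group lists its own sites and,
# by index, the groups nested inside it.
groups = [
    {"sites": [0, 1, 2], "subgroups": []},
    {"sites": [3, 4], "subgroups": []},
    {"sites": [], "subgroups": [0, 1]},
]


def groups_per_site(groups):
    """Invert the top-down layout: map each site index to the groups containing it."""

    def all_sites(index):
        # Sites of a group are its own sites plus those of all nested subgroups.
        group = groups[index]
        found = set(group["sites"])
        for sub in group["subgroups"]:
            found |= all_sites(sub)
        return found

    membership = defaultdict(list)
    for index in range(len(groups)):
        for site in sorted(all_sites(index)):
            membership[site].append(index)
    return dict(membership)
```

Each site ends up with the indices of every group that contains it, directly or through a subgroup.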

It would be nice if you could tell me which method you prefer: the one described here or the one described in PR #395. Any other remarks about these options are of course also welcome.

For now, I have placed these fields in the appendix, as I am not sure whether they will be used by many OPTIMADE providers.
They can be moved to the structures section if you think they are general enough.

group_type
~~~~~~~~~~

- **Description**: For each type of chemical group/molecule in the system there is a dictionary that describes this group/molecule.
@JPBergsma (Contributor, Author) commented Dec 29, 2021

Do we want to have some way of indicating that a group is at the top level, i.e. that it is a molecule and not a subgroup?

- **name**: REQUIRED; The name of the group_type;
- The **name** value MUST be unique in the :property:`group_types` list.
- Strings of 3 characters or less MUST match the strings belonging to this group as defined by wwPDB at `<ftp://ftp.wwpdb.org/pub/pdb/data/monomers>`_.
- **molecular_formula**: OPTIONAL; The molecular formula of the molecule/group described in this group type.
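
The two constraints on **name** could be checked along the following lines. This is only a sketch: the function `check_group_type_names` is hypothetical, the wwPDB identifiers are assumed to be supplied by the caller as a set, and "MUST match" is interpreted here as simple membership in that set.

```python
def check_group_type_names(group_types, wwpdb_ids):
    """Return a list of problems found in the `name` values of `group_types`.

    `wwpdb_ids` is a caller-supplied set of wwPDB identifiers (an assumption
    of this sketch; nothing is downloaded here).
    """
    problems = []
    seen = set()
    for entry in group_types:
        name = entry.get("name")
        if name is None:
            problems.append("missing REQUIRED name")
            continue
        # Names MUST be unique within the group_types list.
        if name in seen:
            problems.append(f"duplicate name: {name}")
        seen.add(name)
        # Names of 3 characters or less MUST correspond to a wwPDB entry.
        if len(name) <= 3 and name not in wwpdb_ids:
            problems.append(f"short name not in wwPDB list: {name}")
    return problems
```

Longer names such as a protein name are not checked against the wwPDB list in this sketch, only for uniqueness.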
@JPBergsma (Contributor, Author) commented:

The wwPDB list describes each group without taking any bonding to other groups into account. When amino acids are polymerized, however, each peptide bond that forms releases a water molecule, so a bound amino acid no longer has the formula of the free molecule.
I am therefore wondering whether I should have different group types for bound and unbound amino acids. The wwPDB list, however, has just one entry for each type of amino acid.
For now, I have not decided how to handle this case.
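
To make the formula difference concrete, here is a small sketch that subtracts water from a free amino acid's formula. The helpers `parse_formula` and `bound_formula` are hypothetical, and the parser only handles simple formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C2H5NO2' into element counts (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def bound_formula(free_formula, waters_lost=1):
    """Element counts after removing `waters_lost` H2O from the free molecule."""
    counts = parse_formula(free_formula)
    counts["H"] -= 2 * waters_lost
    counts["O"] -= waters_lost
    # Drop elements whose count has fallen to zero.
    return {element: n for element, n in counts.items() if n > 0}
```

For free glycine (C2H5NO2), removing one water gives C2H3NO, which is the composition of a glycine residue inside a chain. A residue at a chain end is a different case that this sketch leaves to the `waters_lost` parameter.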

@d-beltran d-beltran mentioned this pull request Feb 23, 2022
@ml-evs changed the title from "Jp bergsma add hierarchical bio fields" to "Support for hierarchical biomolecular fields" Jun 1, 2022
@ml-evs added the following labels Jun 1, 2022:
- topic/property-standardization: The specification of the precise data representation of properties and entries
- type/proposal: Proposal for addition/removal of features. May need broad discussion to reach consensus.
- status/has-concrete-suggestion: This issue has one or more concrete suggestions spelled out that can be brought up for consensus.
- PR/major-edits
@rartino rartino mentioned this pull request Jan 10, 2024