Support for hierarchical biomolecular fields #396
…o store biomolecular data.
group_type
~~~~~~~~~~

- **Description**: For each type of chemical group/molecule in the system there is a dictionary that describes this group/molecule.
Do we want to have some way of indicating that a group is at the top level, i.e. that it is a molecule and not a subgroup?
- **name**: REQUIRED; The name of the group_type;

  - The **name** value MUST be unique in the :property:`group_types` list.
  - Strings of 3 characters or less MUST match the strings belonging to this group as defined by wwPDB at `<ftp://ftp.wwpdb.org/pub/pdb/data/monomers>`_.

- **molecular_formula**: OPTIONAL; The molecular formula of the molecule/group described in this group type.
In the list of groups of wwPDB, the groups are described without taking any bonding to other groups into account. When amino acids are polymerized, each amino acid loses a water molecule.
I am therefore wondering whether I should have different group types for bound and unbound amino acids. In the list of wwPDB there is however just one entry for each type of amino acid.
For now, I am still wondering about how to handle this case.
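To make the difference between the two cases concrete, here is a small sketch that subtracts one water molecule from the formula of free glycine (C2H5NO2) to obtain the formula of a glycine residue inside a chain. The helper `subtract_formula` and the dictionary representation of a formula are assumptions made for this illustration; they are not part of the proposed fields. Terminal residues keep an extra H or OH and are not handled here.

```python
from collections import Counter


def subtract_formula(formula, leaving):
    """Subtract the element counts in `leaving` from `formula`.

    Both arguments are dicts mapping element symbols to counts.
    Elements whose count drops to zero are removed from the result.
    """
    result = Counter(formula)
    result.subtract(leaving)
    return {element: count for element, count in result.items() if count > 0}


# Free (unbound) glycine as listed for the monomer: C2H5NO2
glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
# One water molecule is lost per peptide bond formed
water = {"H": 2, "O": 1}

# Glycine residue in the middle of a chain: C2H3NO
glycine_residue = subtract_formula(glycine, water)
```

A single `molecular_formula` per group type can only describe one of these two compositions, which is the ambiguity raised above.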
This is the second of the two options discussed in issue #389 for including biomolecular data in OPTIMADE. The first option is described in PR #395.
It introduces a `group_type` that represents kinds of molecules or groups of groups and sites. A protein is an example of a group of groups: the individual amino acids are the subgroups of the much larger protein strand.
Properties that are specific for a particular molecule or group of atoms, such as a molecular formula or a mass, can be stored here.
The `groups` property is introduced to describe instances of these group_types, i.e. a particular molecule or chemical group. This gives a "top down" data structure: the groups know which atoms belong to them, but the atoms do not know which groups they belong to. I do not think this is a problem, as the client can easily reconstruct this information. If necessary, we could also add a property for each site that describes which groups the site belongs to.
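As a rough illustration of the client-side reconstruction mentioned above, the top-down mapping could be inverted as follows. The keys `group_type` and `sites` and the example entries are assumptions made for this sketch; the actual field names are whatever the final `groups` definition specifies.

```python
from collections import defaultdict

# Hypothetical "groups" entries: each group lists the indices of its sites.
# The keys "group_type" and "sites" are assumed for illustration only.
groups = [
    {"group_type": "GLY", "sites": [0, 1, 2]},
    {"group_type": "ALA", "sites": [3, 4]},
    {"group_type": "HOH", "sites": [5]},
]


def sites_to_groups(groups):
    """Map each site index to the indices of the groups that contain it."""
    membership = defaultdict(list)
    for group_index, group in enumerate(groups):
        for site_index in group["sites"]:
            membership[site_index].append(group_index)
    return dict(membership)


site_membership = sites_to_groups(groups)
```

The lookup is built in a single pass over the groups, so the cost to the client is linear in the total number of site references.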
It would be nice if you could tell me which method you prefer, the one described here or the one described in PR #395. Any other remarks about these options are of course also welcome.
For now, I have placed these fields in the appendix, as I am not sure whether they will be used by many OPTIMADE providers.
They can be moved to the structures section if you think they are general enough.