Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of residue IDs/numbers in GROMACS files #1016

Closed
mattwthompson opened this issue Jul 19, 2024 · 8 comments
Closed

Improve handling of residue IDs/numbers in GROMACS files #1016

mattwthompson opened this issue Jul 19, 2024 · 8 comments
Labels
gromacs relating to GROMACS

Comments

@mattwthompson
Copy link
Member

mattwthompson commented Jul 19, 2024

  • Residue IDs
  • If residue IDs (residue indices) are provided in inputs [atom.metadata["residue_number"] for atom in topology.atoms], these must be respected
  • If residue IDs are not provided in inputs, probably generate them on-the-fly at export time using Topology.molecule_index on each (non-unique) molecule
    • These IDs must refer to the actual molecules, not the "unique" molecules resulting from de-duplication
  • If some but not all residue IDs are provided in inputs, I don't know what to do
  • I don't know what to do if residue IDs are not contiguous
  • I don't know what to do with many (> 100,000) residues Improve handling of residue IDs/numbers in GROMACS files #1016 (comment)
@mattwthompson
Copy link
Member Author

ping @PEFrankel @timbernat

@mattwthompson
Copy link
Member Author

possibly also @hannahbaumann @IAlibay

@IAlibay
Copy link

IAlibay commented Jul 19, 2024

Generally I agree with:

  • Respecting user inputs
  • Generating IDs on the fly when none are present

However, I would very lightly suggest allowing users to optionally reindex their residues / generate IDs on the fly even if they have resids defined. This might just me being lazy and not wanting to have my own "increment by 1" loop everywhere though.

If some but not all residue IDs are provided in inputs

My take here would be to fail with a good error message that tells folks how to fix things.

I don't know what to do if residue IDs are not contiguous

I can't remember if Gromacs requires this - at least I don't think we make any checks for this in MDAnalysis. If GMX can live with it, then I'd just raise a warning and let it happen.

Something to consider: Not sure if you've already accounted for this, but one issue you could face however is if you exceed 100000 residues. If I remember right the convention in GMX is to cycle back to 1 in residue numbering. See: MDAnalysis/mdanalysis#728

@mattwthompson
Copy link
Member Author

I agree it would be nice to have users including yourself not need the increment-by-one loop; my only hesitation is that a lot of opinions and configuration can seep into these sorts of auto-magical fixes over time, so I want to be careful before going down that path

@mattwthompson
Copy link
Member Author

mattwthompson commented Jul 23, 2024

I'm going to begin implementing these changes and adding tests. (Some of this behavior may already work but be un-tested.) I'm also just going to document the stuff I'm unsure about - feedback on the above points still welcome

@mattwthompson
Copy link
Member Author

I've done everything I set out to do now in #1024

I think offering users a way to re-index their residue IDs is useful, but this is something that the toolkit should handle. Currently, I think there's room for improvement in how the toolkit supports and documents what users can do in cases like this, and it makes more sense for the toolkit to be where those improvements take place

@mattwthompson
Copy link
Member Author

Just a update on this for visibility: I think the linked PR is good to go, but we've enlisted @Yoshanuikabundi to review it. I anticipate some changes will be needed but also that it can cross the finish line next week.

@mattwthompson
Copy link
Member Author

I've merged in some changes that should handle most of this, landing in both the next 0.3.x release and 0.4.0

@mattwthompson mattwthompson added the gromacs relating to GROMACS label Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gromacs relating to GROMACS
Projects
None yet
Development

No branches or pull requests

2 participants