-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collection ergonomics #990
Conversation
Seems worth it to me - but I'm not sure about other valence terms |
Okay, I see you've gone about sorting the other terms. I think this breaks angles and dihedrals, though, since they're not symmetric like bonds? |
We'd have to do it everywhere because this PR just indexes in directly. It might be better to just look up all equivalent index orderings?
Welp I didn't mean to, I'll add tests and double check EDIT: Yeah could you point out where I've done this?
Yeah I agree, you can reverse the order but I think that's the only symmetry op for both? |
OK I've just made it so that when indexing into a collection with a tuple, both the forward and reverse of the tuple will be looked up. I think this is a lot less intrusive than guaranteeing everywhere that the tuples are ordered. My understanding is that all valence terms can be indexed in either direction, but if this isn't the case then I can change it to only look up the reverse for 2-tuples. |
Nope, I think I got confused looking too superficially at which tests you've added |
Am I meant to be able to look up other valence terms, or is this messed up by all of those extra fields (some of which might not be necessary)? In [12]: interchange = ForceField("openff-2.2.0.offxml").create_interchange(Molecule.from_smiles("
...: CCO").to_topology())
In [13]: next(iter(interchange['ProperTorsions'].key_map))
Out[13]: ProperTorsionKey with atom indices (0, 1, 2, 8), mult 0
In [14]: interchange['ProperTorsions'].key_map[0, 1, 2, 8]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[14], line 1
----> 1 interchange['ProperTorsions'].key_map[0, 1, 2, 8]
KeyError: (0, 1, 2, 8) |
Yeah that's on purpose - since I implemented this by defining I thought about allowing more information to be given in the tuple - for example, to index the above torsion you could do something like: interchange['ProperTorsions'].key_map[(0, 1, 2, 8), 0]
# Or
interchange['ProperTorsions'].key_map[(0, 1, 2, 8), 0, None, None] But since users would have to remember the correct ordering for everything and there's no way to disambiguate the name of the field you're specifying and all the fields have the same type I thought this provided a relatively small ergonomics boost. If users want to specify the extended fields, they can still index with a Here are some ways I've thought about dealing with this:
Note also that indexing into In [10]: next(iter(interchange['Bonds'].key_map))
Out[10]: BondKey with atom indices (0, 1)
In [11]: interchange['Bonds'][0, 1] # Returns the underlying potential
Out[11]: Potential(parameters={'k': <Quantity(430.602781, 'kilocalorie_per_mole / angstrom ** 2')>, 'length': <Quantity(1.5336276, 'angstrom')>}, map_key=None)
In [12]: interchange['Bonds'][1, 0] # Works with reversed keys
Out[12]: Potential(parameters={'k': <Quantity(430.602781, 'kilocalorie_per_mole / angstrom ** 2')>, 'length': <Quantity(1.5336276, 'angstrom')>}, map_key=None)
In [13]: interchange['Bonds'].key_map[0, 1] # Returns the PotentialKey
Out[13]: PotentialKey associated with handler 'Bonds' with id '[#6X4:1]-[#6X4:2]'
In [14]: interchange['Bonds'].key_map[1, 0] # Doesn't work with reversed keys
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[14], line 1
----> 1 interchange['Bonds'].key_map[1, 0]
KeyError: (1, 0) |
assert BondKey(atom_indices=(1, 3), bond_order=None) != ((1, 3), None) | ||
assert BondKey(atom_indices=(1, 3), bond_order=1.5) != (1, 3) | ||
assert BondKey(atom_indices=(1, 3), bond_order=1.5) != ((1, 3), None) | ||
assert BondKey(atom_indices=(1, 3), bond_order=1.5) != ((1, 3), 1.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure these two should be equal?
def __hash__(self) -> int:
if self.bond_order is None:
return hash(tuple(self.atom_indices))
return hash((tuple(self.atom_indices), self.bond_order))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this test is correct - Lines 131, 133 and 134 are all extended tuples, which has not been implemented (as we've discussed elsewhere), and in line 133 the bond order is not None
so the tuple doesn't match.
The second return line in that hash function is the existing behaviour - BondKey.__eq__
is
def __eq__(self, other) -> bool:
return super().__eq__(other) or (
self.bond_order is None and other == self.atom_indices
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh I see what's going on, TopologyKey.__eq__
was changed to compare hash functions. I don't think this is generally technically correct as you might get hash collisions - Python uses a 64 bit hash, so there are 2^64 possible hashes, and a typical simulation system might have 2^16 atoms (~65k), so if you need four of them to define a torsion then you might have 2^64 possible atom index tuples... which is really filling up the pigeon holes. Granted torsions tend to be defined between nearby atoms but that's also assuming that Python's hash function is doing a good job of avoiding collisions, and it's had pathological cases in the past.
A better way might be to have a _tuple()
function that gives the equivalent tuple, and then define __hash__
as return hash(self._tuple())
and __eq__
as something like return self._tuple() == other._tuple() or self._tuple() == other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing hashes in __eq__
can also run into issues when comparing objects of different types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm past providing useful feedback on this changelog, which is long past due its green check. I want to point this to the develop
branch which targets an upcoming 0.4.x line; one test confuses me but otherwise please merge this when you're happy with it (% any confusion I've added).
I get why equality across extended tuples is tricky; given that the motivation of this behavior is making our lives easier, I'm fine leaving that door closed.
I looked through the docs to find a good place to show this off, but I didn't find anything. In the future that might be worth a subsection, either in the static docs or a notebook(s) somewhere. I expected there to be tons of these excessiely-difficult-lookup cases, but now I think those are just in tests and the source code. I don't see much of a reason to rewrite what's there, but with any luck I'll remember this trick and future code will be easier to write and a little bit more sightly to look at. |
Ok well I implemented extended tuples as a way to keep the There are still a few topology key classes that are not subclasses of If you're happy with this change set, I'm happy for you to merge it at your convenience! |
Description
Implements easier indexing into collections and dictionaries of potentials. Closes #968.
Works by defining equality between certain
TopologyKey
subclasses and tuples of atom indices when non-atom index attributes areNone
. Also implements__getitem__
onCollection
to index directly from atom indices to potential.I had to change the hash implementations to not include
None
in the hash if all non-atom indices attributes wereNone
so that the hash matches between theTopologyKey
and the appropriate tuple. This might occasionally cause collisions... but it should be extremely rare and regardless Python's requirement is that__eq__
implies hashes are equal, not the converse.Atom indices for bonds have to be in the correct order. Fixing this for indexing with tuples would require replacing
Collection.key_map
with a custom class. We could make the ergonomics easier by guaranteeing that atom indices in bonds are always ordered a certain way?Checklist