-
Notifications
You must be signed in to change notification settings - Fork 873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix array comparison in core.Structure.merge_sites
, also allow int
property to be merged instead of float
alone, mode
only allow full name
#4198
base: master
Are you sure you want to change the base?
Conversation
1823057
to
ab07a03
Compare
ab07a03
to
644c12d
Compare
@@ -4746,7 +4746,7 @@ def merge_sites(self, tol: float = 0.01, mode: Literal["sum", "delete", "average | |||
offset = self[i].frac_coords - coords | |||
coords += ((offset - np.round(offset)) / (n + 2)).astype(coords.dtype) | |||
for key in props: | |||
if props[key] is not None and self[i].properties[key] != props[key]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO tag: Using array_equal
may not be a good idea either as the property could be (sequence of) floats
TODO2: test failure
pymatgen/tests/core/test_structure.py
Lines 1667 to 1681 in 31f1e1f
# Test that we can average the site properties that are floats | |
lattice = Lattice.hexagonal(3.587776, 19.622793) | |
species = ["Na", "V", "S", "S"] | |
coords = [ | |
[0.333333, 0.666667, 0.165000], | |
[0, 0, 0.998333], | |
[0.333333, 0.666667, 0.399394], | |
[0.666667, 0.333333, 0.597273], | |
] | |
site_props = {"prop1": [3.0, 5.0, 7.0, 11.0]} | |
navs2 = Structure.from_spacegroup(160, lattice, species, coords, site_properties=site_props) | |
navs2.insert(0, "Na", coords[0], properties={"prop1": 100.0}) | |
navs2.merge_sites(mode="a") | |
assert len(navs2) == 12 | |
assert 51.5 in [itr.properties["prop1"] for itr in navs2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Property can actually be anything, including value supplied by user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for stepping in, I'm here adding a note for myself to work on later.
Yes in this case using ==
for comparison may not be ideal when the value could be float or np.array
, this is very similar to #4092 where we want to compare the equality of two dict whose value could be np.array
But I assume we would not be able to use np.allclose
as the value could be non-numerical, so as far as I'm aware, using array_equal
would be the best approach (though handling of float would be sub-optimal)
dist_mat = self.distance_matrix | ||
# TODO: change the code the allow full name after 2025-12-01 | ||
# TODO2: add a test for mode value, currently it only checks if first letter is "s/a" | ||
if mode.lower() not in {"sum", "delete", "average"} and mode.lower()[0] in {"s", "d", "a"}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to hear more opinions on this, I guess it's beneficial to allow only full name ("sum", "delete", "average"
) instead of checking first letter only to make use of the IDE auto-complete feature and facilitate typing
Also using the full name would be more readable: mode="sum"
instead of mode="s"
src/pymatgen/core/structure.py
Outdated
for n, i in enumerate(inds[1:]): | ||
sp = self[i].species | ||
if mode.lower()[0] == "s": | ||
species += sp | ||
offset = self[i].frac_coords - coords | ||
coords += ((offset - np.round(offset)) / (n + 2)).astype(coords.dtype) | ||
for key in props: | ||
if props[key] is not None and self[i].properties[key] != props[key]: | ||
if props[key] is not None and not np.array_equal(self[i].properties[key], props[key]): | ||
if mode.lower()[0] == "a" and isinstance(props[key], float): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess only allowing float
to be averaged may not be a good idea, we should include int
as well, otherwise properties={"prop1": 100}
would lead to the site property being reset to None
while properties={"prop1": 100.0}
doesn't, which is super confusing
pymatgen/tests/core/test_structure.py
Lines 1677 to 1679 in 31f1e1f
navs2 = Structure.from_spacegroup(160, lattice, species, coords, site_properties=site_props) | |
navs2.insert(0, "Na", coords[0], properties={"prop1": 100.0}) | |
navs2.merge_sites(mode="a") |
Also the docstring would be clarified to explain this behaviour.
core.Structure.merge_sites
core.Structure.merge_sites
, merge_sites
also allow int
property to be merged instead of float
alone, mode
only allow full name
core.Structure.merge_sites
, merge_sites
also allow int
property to be merged instead of float
alone, mode
only allow full namecore.Structure.merge_sites
, also allow int
property to be merged instead of float
alone, mode
only allow full name
b26b299
to
dd1b338
Compare
src/pymatgen/core/structure.py
Outdated
@@ -216,7 +216,7 @@ class SiteCollection(collections.abc.Sequence, ABC): | |||
""" | |||
|
|||
# Tolerance in Angstrom for determining if sites are too close | |||
DISTANCE_TOLERANCE = 0.5 | |||
DISTANCE_TOLERANCE: ClassVar[float] = 0.5 | |||
_properties: dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NEED CONFIRM on 091c911: Is _properties
misplaced? an instance variable maybe?
pymatgen/src/pymatgen/core/structure.py
Lines 211 to 220 in 31f1e1f
class SiteCollection(collections.abc.Sequence, ABC): | |
"""Basic SiteCollection. Essentially a sequence of Sites or PeriodicSites. | |
This serves as a base class for Molecule (a collection of Site, i.e., no | |
periodicity) and Structure (a collection of PeriodicSites, i.e., | |
periodicity). Not meant to be instantiated directly. | |
""" | |
# Tolerance in Angstrom for determining if sites are too close | |
DISTANCE_TOLERANCE = 0.5 | |
_properties: dict |
5b06c2a
to
091c911
Compare
Summary
core.Structure.merge_sites
, to fix Structure.merge_sites fails when selective_dynamics is property present #4197is_np_dict_equal
for custom__eq__
mode
initials (not starting witha/s/d
, currently don't check full name yet)Warning
Minor behaviour change (fix) in
merge_sites
property handlingOnly allowing
float
to be averaged may not be a good idea, we should includeint
as well, otherwiseproperties={"prop1": 100}
would lead to the site property being reset toNone
whileproperties={"prop1": 100.0}
doesn't, which is super confusingpymatgen/tests/core/test_structure.py
Lines 1677 to 1679 in 31f1e1f
Also the docstring is clarified to explain this behaviour.
Warning
mode
only allow "full name" instead of initials after an one-year deprecation periodI believe it's beneficial to allow only full name (
"sum", "delete", "average"
) instead of checking first letter only:mode="sum"
instead ofmode="s"
pymatgen/src/pymatgen/core/structure.py
Line 4719 in 31f1e1f