Add hash function for AffineScalarFunc class #189
Changes from 17 commits
```diff
@@ -1756,6 +1756,19 @@ def __bool__(self):

     ########################################

+    def __hash__(self):
+        """
+        Calculates the hash for any AffineScalarFunc object.
+        The hash is calculated from the nominal_value and the derivatives.
+
+        Returns:
+            int: The hash of this object
+        """
+
+        items = sorted(self.derivatives.items(), key=lambda x: id(x[0]))
+        ids, values = zip(*map(lambda item: (id(item[0]), item[1]), items))
+        return hash((self._nominal_value, tuple(ids), tuple(values)))
+
     # Uncertainties handling:

     def error_components(self):
```
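The hunk above sorts the derivative entries by the `id()` of each independent variable so that dictionary ordering cannot influence the hash. A minimal standalone sketch of the same idea (a toy class for illustration, not the package's actual implementation):

```python
# Hypothetical sketch of the hashing scheme: the hash is built from the
# nominal value plus the derivative mapping, with entries sorted by the
# id() of each independent variable so that insertion order is irrelevant.
class ToyAffine:
    def __init__(self, nominal_value, derivatives):
        self._nominal_value = nominal_value
        self.derivatives = derivatives  # {variable_object: coefficient}

    def __hash__(self):
        items = sorted(self.derivatives.items(), key=lambda x: id(x[0]))
        ids, values = zip(*((id(var), coeff) for var, coeff in items))
        return hash((self._nominal_value, tuple(ids), tuple(values)))

# Arbitrary objects stand in for independent variables:
u, v = object(), object()

# The same linear combination written in two insertion orders hashes equally:
x = ToyAffine(1.0, {u: 1.0, v: 2.0})
y = ToyAffine(1.0, {v: 2.0, u: 1.0})
assert hash(x) == hash(y)
```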
```diff
@@ -2729,7 +2742,7 @@ def __init__(self, value, std_dev, tag=None):
         # differentiable functions: for instance, Variable(3, 0.1)/2
         # has a nominal value of 3/2 = 1, but a "shifted" value
         # of 3.1/2 = 1.55.
-        value = float(value)
+        self._nominal_value = float(value)

         # If the variable changes by dx, then the value of the affine
         # function that gives its value changes by 1*dx:
```
```diff
@@ -2739,7 +2752,7 @@ def __init__(self, value, std_dev, tag=None):
         # takes much more memory. Thus, this implementation chooses
         # more cycles and a smaller memory footprint instead of no
         # cycles and a larger memory footprint.
-        super(Variable, self).__init__(value, LinearCombination({self: 1.}))
+        super(Variable, self).__init__(self._nominal_value, LinearCombination({self: 1.}))

         self.std_dev = std_dev  # Assignment through a Python property
```
```diff
@@ -2766,6 +2779,22 @@ def std_dev(self, std_dev):

         self._std_dev = float(std_dev)

+    def __hash__(self):
+        """
+        Calculates the hash for any Variable object.
+        This simply calls the __hash__ method of `AffineScalarFunc` in most cases.
+        During initialization, the `_linear_part` and `_nominal_value` attributes
+        do not exist yet; in that case, a hash based on the id is returned.
+
+        Returns:
+            int: The hash of this object
+        """
+        # Otherwise, pickle.loads does not work
+        if not hasattr(self, "_nominal_value"):
+            self._nominal_value = None
```
**Comment:** I think the issue here is that currently […]. I think it would be better to break this cycle by overriding […]. With the current implementation,

```python
y = ufloat(1, 1)
new_y = pickle.loads(pickle.dumps(y))
assert y in y.derivatives          # Passes
assert new_y in new_y.derivatives  # Raises KeyError
```

I am not sure this is worth adding to […].

**Reply:** I think the problem is that pickle already generates the […]. I think we can prevent that by deleting the self-reference inside the […].

**Reply:** I tried some stuff to handle this in the […]:

```python
# Corner case test: when an attribute is present both in __slots__
# and in __dict__, it is first looked up from the slots
# (references:
# http://docs.python.org/2/reference/datamodel.html#invoking-descriptors,
# http://stackoverflow.com/a/15139208/42973). As a consequence,
# the pickling process must pickle the correct value (i.e., not
# the value from __dict__):
x = NewVariable_dict(3, 0.14)
x._nominal_value = 'in slots'
# Corner case: __dict__ key which is also a slot name (it is
# shadowed by the corresponding slot, so this is very unusual,
# though):
x.__dict__['_nominal_value'] = 'in dict'
# Additional __dict__ attribute:
x.dict_attr = 'dict attribute'
x_unpickled = pickle.loads(pickle.dumps(x))
```

I do not completely understand what this test is for.

**Comment:** I think the cleanest option would be to override […].

**Reply:** That would only work if it is guaranteed that the […].

**Comment:** I think so. I think it is only […].

**Reply:** I implemented the override of […].
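The failure mode discussed in this thread can be reproduced without `uncertainties` at all: when an object whose hash depends on not-yet-restored state appears as a key in its own (self-referential) dict, unpickling rebuilds that dict before the state is applied, so the key is inserted under the "uninitialized" hash. A hedged toy reproduction (class name is made up for illustration):

```python
import pickle

class Node:
    """Toy stand-in for Variable: used as a key in its own dict."""
    def __init__(self, value):
        self.value = value
        self.table = {self: 1.0}  # self-referential, like Variable.derivatives

    def __hash__(self):
        # During unpickling, the self-referential dict is rebuilt before
        # `value` has been restored, so the key gets inserted under this
        # fallback hash instead of its "real" hash.
        if not hasattr(self, "value"):
            return hash(None)
        return hash(self.value)

n = Node(1.0)
assert n in n.table  # fine before pickling

m = pickle.loads(pickle.dumps(n))
# The key was inserted under hash(None); the lookup now uses hash(1.0),
# so the (identical!) key can no longer be found:
print(m in m.table)  # False
```

This is why the thread discusses customizing the pickling protocol (e.g. via a state-restoring override) rather than patching `__hash__` alone.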
```diff
+
+        return hash((self._nominal_value, (id(self),), (1.,)))
+
     # The following method is overridden so that we can represent the tag:
     def __repr__(self):

@@ -2776,13 +2805,6 @@ def __repr__(self):
         else:
             return "< %s = %s >" % (self.tag, num_repr)

-    def __hash__(self):
-        # All Variable objects are by definition independent
-        # variables, so they never compare equal; therefore, their
-        # id() are allowed to differ
-        # (http://docs.python.org/reference/datamodel.html#object.__hash__):
-        return id(self)
-
     def __copy__(self):
         """
         Hook for the standard copy module.
```
---

In the test suite:

```diff
@@ -30,7 +30,7 @@
 # Local modules

 import uncertainties.core as uncert_core
-from uncertainties.core import ufloat, AffineScalarFunc, ufloat_fromstr
+from uncertainties.core import ufloat, AffineScalarFunc, ufloat_fromstr, LinearCombination
 from uncertainties import umath

 # The following information is useful for making sure that the right
```
```diff
@@ -2338,3 +2338,40 @@ def test_correlated_values_correlation_mat():
     assert arrays_close(
         numpy.array(cov_mat),
         numpy.array(uncert_core.covariance_matrix([x2, y2, z2])))
+
+def test_hash():
+    '''
+    Tests the invariant that if x == y, then hash(x) == hash(y)
+    '''
+
+    a = ufloat(1.23, 2.34)
+    b = ufloat(1.23, 2.34)
+
+    # nominal values and std_dev terms are equal, but...
+    assert a.n == b.n and a.s == b.s
+    # ...a and b are independent variables, therefore not equal as uncertain numbers
+    assert a != b
+    assert hash(a) != hash(b)
+
+    # order of calculation should be irrelevant
+    assert a + b == b + a
+    assert hash(a + b) == hash(b + a)
+
+    # the expression (2*a + a)/3 is equal to the variable a, so...
+    assert (2*a + a)/3 == a
+    # ...the hash of the expression and the variable should be equal
+    assert hash((2*a + a)/3) == hash(a)
```
**Comment on lines +2361 to +2363:** This one is simple enough that it works out okay, but if the calculation got more complicated than […].

**Reply:** Yes, I think you are right.
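The caution above can be made concrete without the package: equality of two expressions ultimately compares floating-point derivative coefficients, and mathematically equivalent arithmetic need not produce bit-identical floats. For example:

```python
# 0.1*x + 0.2*x is mathematically 0.3*x, but the accumulated coefficient
# 0.1 + 0.2 is not the float 0.3, so the two expressions would compare
# (and therefore hash) unequal:
assert 0.1 + 0.2 != 0.3
print(0.1 + 0.2)  # 0.30000000000000004
```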
```diff
+
+    c = ufloat(1.23, 2.34)
+
+    # the values of the linear combination entries matter
+    x = AffineScalarFunc(1, LinearCombination({a: 1, b: 2, c: 1}))
+    y = AffineScalarFunc(1, LinearCombination({a: 1, b: 2, c: 2}))
+    assert x != y
+    assert hash(x) != hash(y)
+
+    # swapping linear combination values between variables must not lead to the same hash
+    x = AffineScalarFunc(1, LinearCombination({a: 1, b: 2}))
+    y = AffineScalarFunc(1, LinearCombination({a: 2, b: 1}))
+    assert x != y
+    assert hash(x) != hash(y)
```
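For background, the invariant this test exercises is Python's data-model rule: objects that compare equal must have equal hashes, otherwise equal keys become unfindable in dicts and sets. A toy illustration of the breakage (hypothetical class, unrelated to the PR code):

```python
class BrokenHash:
    """Equal by value, but hashed by identity: violates the data model."""
    def __init__(self, v):
        self.v = v

    def __eq__(self, other):
        return isinstance(other, BrokenHash) and self.v == other.v

    def __hash__(self):
        return id(self)  # WRONG: two equal objects get different hashes

d = {BrokenHash(1): "stored"}
assert BrokenHash(1) == list(d)[0]  # the keys compare equal...
print(BrokenHash(1) in d)           # ...yet the lookup fails: False
```

This is exactly why `Variable.__hash__` can no longer simply `return id(self)` once `AffineScalarFunc` objects can compare equal to variables.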
**Comment:** Here is an alternative suggestion. It is the same thing (except it puts the derivatives into a tuple of tuples instead of producing two equal-length tuples), but maybe more readable (I guess it depends on whether you like comprehensions vs. map/lambda). In an ideal world, we could just do `frozendict(self.derivatives)`, but there is no `frozendict` in the standard library.

**Reply:** There is `MappingProxyType`, but I do not know if this would work: https://docs.python.org/3/library/types.html#types.MappingProxyType

**Reply:** `MappingProxyType` is not hashable, so it will not work. I think your solution is more elegant, but for some reason, `assert hash((2*a+a)/3) == hash(a)` fails when using your proposal.

**Reply:** Did you update `__hash__` for both `AffineScalarFunc` and `Variable`? I suggested updating one above, but both need to be updated consistently. For `Variable` the changed version should be like: […]

**Reply:** Yes, I made a mistake with the parentheses. Now it works :)

**Comment:** One thing I realized: we moved away from `frozenset` because duplicate derivative coefficient values could be combined and lead to more hash collisions, but that was before we switched to a collection of tuples instead of two collections with ids and coefficients. Instead of […] we could do […], since the ids are always unique and so no tuples will get combined in the set. Whether that is more or less readable or efficient, I am not sure. I like not sorting when it's not necessary, but that is pretty minor. I was curious whether `frozenset.__hash__` used sorting for its implementation, but I checked and what it does is XOR the hashes of the elements together, which is neat because the XOR operation is also order-independent.
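The order-independence mentioned at the end is easy to check directly. A sketch of the two formulations being compared (function names are hypothetical; `frozenset.__hash__` combines element hashes with an XOR-based mix, so no sorting is needed):

```python
# Toy derivatives mappings: arbitrary objects stand in for Variables.
u, v = object(), object()
derivs_1 = {u: 1.0, v: 2.0}  # one insertion order
derivs_2 = {v: 2.0, u: 1.0}  # the other insertion order

def hash_sorted(nominal, derivatives):
    # Variant from the diff: sort by id(), hash two parallel tuples.
    items = sorted(derivatives.items(), key=lambda item: id(item[0]))
    ids = tuple(id(var) for var, _ in items)
    values = tuple(coeff for _, coeff in items)
    return hash((nominal, ids, values))

def hash_frozenset(nominal, derivatives):
    # frozenset variant: no sorting needed; since the ids are unique,
    # no (id, coefficient) pairs can collapse into one element.
    return hash((nominal, frozenset((id(var), coeff)
                                    for var, coeff in derivatives.items())))

# Both variants are insensitive to dict insertion order:
assert hash_sorted(1.0, derivs_1) == hash_sorted(1.0, derivs_2)
assert hash_frozenset(1.0, derivs_1) == hash_frozenset(1.0, derivs_2)
```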