Python 2.x: In 0.5.3, disco by default generated partition assignments by taking the md5 of a stringified version of the key, mod the number of partitions. In 0.5.4 this switched to using Python's built-in hash() call. Calling hash() on tuples yields very few distinct results when taken mod a low number like 100: we're seeing somewhere in the range of 2-16 or so, which gives far fewer actual partitions than requested. Using strings as keys works fine.
Seems like if keys need to be strings this should be enforced/documented, or a different hash function used?
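
For reference, a minimal sketch of the md5-on-stringified-key approach described above. This is not disco's actual code: the name `md5_partition` and its signature are made up for illustration, and it assumes keys have a stable `str()` representation.

```python
import hashlib


def md5_partition(key, nr_partitions):
    # Hypothetical helper, not part of disco's API.
    # Stringify the key first so tuples, ints and strings all take the same path.
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    # Interpret the hex digest as an integer and reduce it to a partition index.
    return int(digest, 16) % nr_partitions


# Count how many of the 100 requested partitions actually receive tuple keys.
keys = [(i, j) for i in range(100) for j in range(100)]
used = set(md5_partition(k, 100) for k in keys)
print(len(used))
```

Something along these lines could presumably be supplied as a custom partition function for jobs with tuple keys, at the cost of an md5 computation per key.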