DeepHash: Different dataframes get the same hash #394

Open
@amakelov

Describe the bug
A hash collision seems to occur whenever two DataFrames have the same column names, regardless of their rows.

To Reproduce

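import pandas as pd
from deepdiff import DeepHash

# Two DataFrames with the same column name but a different number of rows
x = pd.DataFrame({'a': [1, 2, 3]})
y = pd.DataFrame({'a': [1, 2, 3, 4]})
a = DeepHash(x)[x]
b = DeepHash(y)[y]
# This assertion passes, i.e. both DataFrames get the same hash
assert a == b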

Expected behavior
DataFrames with different rows should get different hashes; collisions should be harder to find than this (unless this was designed into the library?)
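
For comparison, here is a minimal sketch of a hash that does depend on the row contents. It is not DeepDiff's API and says nothing about how DeepHash works internally. `frame_digest` is a hypothetical helper name; it assumes pandas' `pandas.util.hash_pandas_object` and the standard-library `hashlib`.

```python
import hashlib

import pandas as pd
from pandas.util import hash_pandas_object


def frame_digest(df: pd.DataFrame) -> str:
    """Hypothetical helper: SHA-256 over the column names and per-row hashes."""
    h = hashlib.sha256()
    # Mix in the column names so frames with equal values but different columns differ.
    h.update(repr(list(df.columns)).encode("utf-8"))
    # hash_pandas_object returns one uint64 hash per row (index included here).
    row_hashes = hash_pandas_object(df, index=True)
    h.update(row_hashes.to_numpy().tobytes())
    return h.hexdigest()


x = pd.DataFrame({'a': [1, 2, 3]})
y = pd.DataFrame({'a': [1, 2, 3, 4]})
# Same column name, different rows: the digests differ.
assert frame_digest(x) != frame_digest(y)
# An identical copy produces the same digest.
assert frame_digest(x) == frame_digest(x.copy())
```

Something along these lines is the behavior I would have expected from `DeepHash(x)[x]` and `DeepHash(y)[y]` above.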

OS, DeepDiff version and Python version (please complete the following information):

  • OS: Ubuntu 22.04.2 LTS
  • Python Version: 3.10.8
  • DeepDiff Version: 6.3.0
