Add support for record id of bytes type #1152

Status: Open. 1 commit to be merged into main.
lmores (Contributor) commented Mar 17, 2023

Add support for id columns of type bytes.

Reason: when the primary key of a row is the output of a hash function, the most natural data type for the corresponding database column is raw bytes (e.g., the BYTEA data type in PostgreSQL).

I have already used this patched version of dedupe successfully to deduplicate rows from a PostgreSQL table whose primary key column is of type BYTEA.
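
For illustration only (this is not code from the patch), here is a minimal sketch of the kind of data the change is meant to handle: a dictionary of records keyed by raw hash digests. The helper `record_id` and the sample rows are hypothetical, and the sketch makes no calls into dedupe itself.

```python
import hashlib


def record_id(*fields: str) -> bytes:
    """Hypothetical helper: derive a bytes id from a row's field values."""
    digest = hashlib.sha256()
    for field in fields:
        encoded = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.digest()


# Made-up sample rows, standing in for rows read from a table
# whose primary key column holds raw bytes.
rows = [("Ada Lovelace", "London"), ("Alan Turing", "Wilmslow")]

# Records keyed by bytes ids, i.e. {record_id: {field_name: value}}.
data = {record_id(*row): {"name": row[0], "city": row[1]} for row in rows}
```

Because `bytes` objects are hashable, they work as dictionary keys without any conversion to `str` or `int`.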

Let me know if:

  1. you are interested in adding support for id columns with bytes type
  2. you prefer to keep this change under the hood (as it is now) or make it explicit and update all the typing definitions in _typing.py

codecov bot commented Mar 17, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.07 ⚠️

Comparison is base (f72d4a1) 73.71% compared to head (68aa7d7) 73.65%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1152      +/-   ##
==========================================
- Coverage   73.71%   73.65%   -0.07%     
==========================================
  Files          29       29              
  Lines        2321     2323       +2     
==========================================
  Hits         1711     1711              
- Misses        610      612       +2     
Impacted Files    Coverage Δ
dedupe/core.py    62.82% <0.00%> (-0.67%) ⬇️


lmores (Contributor, Author) commented Mar 24, 2023

@fgregg: any thoughts on this?
