How to remove/merge duplicates in the database (athletes/clubs) #32

Open
espinr opened this issue Feb 15, 2021 · 1 comment

espinr commented Feb 15, 2021

During the monthly call on Feb 15, Andy R raised an open question about the best way to clean the database and avoid duplicates when inserting new entries, specifically for competitors, athletes and clubs.

So, please list best practices and ways to detect exact and near duplicates.

For instance: dates with month and day reversed, name and surname swapped, etc.
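
To make the two examples above concrete, here is a minimal Python sketch of a near-duplicate check. It is only an illustration under assumptions: the record shape `(given, family, birth_date)`, the function names, and the 0.9 similarity threshold are all hypothetical and not taken from any existing schema in this project.

```python
from datetime import date
from difflib import SequenceMatcher
import unicodedata


def normalize_name(given, family):
    """Return an order-independent, accent-free, lowercase name key."""
    def clean(text):
        decomposed = unicodedata.normalize("NFKD", text)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        return " ".join(stripped.lower().split())
    # Sorting the parts makes "Ana Lopez" and "Lopez Ana" compare equal.
    return " ".join(sorted(clean(given + " " + family).split()))


def date_variants(d):
    """Return the date plus its day/month-swapped form, when that form is valid."""
    variants = {d}
    if d.day <= 12:
        variants.add(date(d.year, d.day, d.month))
    return variants


def is_probable_duplicate(a, b, threshold=0.9):
    """Compare two hypothetical (given, family, birth_date) tuples."""
    name_a = normalize_name(a[0], a[1])
    name_b = normalize_name(b[0], b[1])
    similarity = SequenceMatcher(None, name_a, name_b).ratio()
    # Dates match if they are equal or differ only by a day/month swap.
    dates_match = bool(date_variants(a[2]) & date_variants(b[2]))
    return similarity >= threshold and dates_match
```

A check like this would only flag candidates for review. Two different people can share a name and a birth date, so an automatic merge on this signal alone is probably too aggressive.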

@dbonacci

Maybe it would be useful to use the WA athlete codes for all the athletes that have them? I know the coverage is quite high: I have one of these codes myself, even though in my best days I was far from an elite-level athlete. Coverage is even higher for the last couple of years, since the WA athletes database has been in production.
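
As a rough sketch of that idea, records that carry an external code could be grouped on it, leaving the rest for fuzzy matching. Everything here is an assumption for illustration: the `wa_code` field name, the dict-based record shape, and the fill-missing-fields merge rule are hypothetical, and the code values in any test data would be made up.

```python
def merge_by_external_code(records, code_field="wa_code"):
    """Group records by an external ID; return (merged, unmatched).

    `code_field` is a hypothetical field name, not an existing schema key.
    """
    merged = {}
    unmatched = []
    for record in records:
        code = record.get(code_field)
        if not code:
            # No external code: leave for name/date based matching.
            unmatched.append(record)
            continue
        if code in merged:
            # Keep the first-seen values; fill only fields that are empty.
            for key, value in record.items():
                if merged[code].get(key) in (None, ""):
                    merged[code][key] = value
        else:
            merged[code] = dict(record)
    return list(merged.values()), unmatched
```

Records with conflicting non-empty values for the same code are not resolved by this sketch; the first-seen value wins, which may or may not be the right policy.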
