certbund-contact: ripe importer creates several contact entries for the same email address #3

Open
bernhardreiter opened this issue Jan 23, 2017 · 1 comment

@bernhardreiter

The following query shows that our ripe_importer creates entries in the contact_automatic
table that have the same email address and differ only in their id. How should these be dealt with?

select c, count(c), email from (
    select count(*) as c, co.email as email from contact_automatic as co 
      JOIN role_automatic AS r 
        ON co.id = r.contact_id
      GROUP BY co.email
  ) AS foo 
  GROUP BY c, email ORDER BY c DESC;

Here is the distribution, grouped by c only so that no email addresses are given out:

 c  | count
----+-------
 36 |     1
 16 |     1
 15 |     1
 14 |     1
 13 |     1
 12 |     1
 10 |     1
  9 |     1
  8 |     1
  7 |     1
  5 |     1
  4 |    11
  3 |     7
  2 |    37
  1 |  1574

The queries were run on a database imported for DE on 2017-01-23, as outlined in
https://github.com/Intevation/intelmq/blob/473fed97ca323ba91126edd0fc208711613ffac4/intelmq/bots/experts/certbund_contact/README-ripe-import.md
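
To make the counting easier to follow, here is a minimal sketch that reproduces the distribution on a toy dataset. It uses an in-memory SQLite database instead of the real PostgreSQL one; the two-column schema and the sample addresses are assumptions made up for illustration, not the actual certbund_contact schema. Note that the inner query counts joined contact/role rows per address, which equals the number of contact rows only as long as each contact has exactly one role.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_automatic (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE role_automatic (id INTEGER PRIMARY KEY, contact_id INTEGER);
""")

# Made-up sample data: one address imported three times, one twice, two once.
emails = (
    ["abuse@example.com"] * 3
    + ["noc@example.net"] * 2
    + ["cert@example.org", "info@example.org"]
)
for email in emails:
    cur = conn.execute("INSERT INTO contact_automatic (email) VALUES (?)", (email,))
    # One role per contact, mirroring one contact row per imported role entry.
    conn.execute("INSERT INTO role_automatic (contact_id) VALUES (?)", (cur.lastrowid,))

# Inner query: joined rows per email address.
# Outer query: how many addresses share each row count (no addresses exposed).
distribution = conn.execute("""
    SELECT c, count(*) AS n FROM (
        SELECT count(*) AS c
          FROM contact_automatic AS co
          JOIN role_automatic AS r ON co.id = r.contact_id
         GROUP BY co.email
    ) AS per_email
    GROUP BY c ORDER BY c DESC
""").fetchall()

print(distribution)  # [(3, 1), (2, 1), (1, 2)]
```

Every row with c greater than 1 corresponds to an address that exists more than once in contact_automatic.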

@bernhardreiter

The problem arises because we do not save all information of the ripe.db.role.gz entry into the database. The question is: Should we search for and link to an existing entry when importing?
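
One way the "search and link" option could look is a lookup-before-insert step, sketched below. This is only an illustration under assumptions: the helper `get_or_create_contact` is hypothetical, the schema is reduced to the two columns needed, SQLite stands in for the real database, and addresses are matched by exact string comparison. It is not the importer's actual code.

```python
import sqlite3


def get_or_create_contact(conn, email):
    """Return the id of the contact row for `email`, inserting one if needed.

    Hypothetical helper: matches on the exact email string only.
    """
    row = conn.execute(
        "SELECT id FROM contact_automatic WHERE email = ?", (email,)
    ).fetchone()
    if row is not None:
        return row[0]  # reuse the existing entry
    cur = conn.execute("INSERT INTO contact_automatic (email) VALUES (?)", (email,))
    return cur.lastrowid


# Simplified stand-in schema (assumption, not the real certbund_contact tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_automatic (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE role_automatic (id INTEGER PRIMARY KEY, contact_id INTEGER);
""")

# Three imported role entries; the first and third share an address.
contact_ids = []
for email in ["abuse@example.com", "noc@example.net", "abuse@example.com"]:
    contact_id = get_or_create_contact(conn, email)
    conn.execute("INSERT INTO role_automatic (contact_id) VALUES (?)", (contact_id,))
    contact_ids.append(contact_id)

contact_count = conn.execute("SELECT count(*) FROM contact_automatic").fetchone()[0]
role_count = conn.execute("SELECT count(*) FROM role_automatic").fetchone()[0]
print(contact_count, role_count)  # 2 3
```

With this approach several roles point at one contact row. Whether merging on the address alone is acceptable depends on the role attributes that are currently not stored, since entries that differ only in those attributes would be collapsed into one.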

@bernhardreiter bernhardreiter transferred this issue from Intevation/intelmq Sep 12, 2019