How to handle inserting duplicate data? #53

Open
RobStallion opened this issue Mar 4, 2019 · 1 comment
Labels: discuss, help wanted, question

Comments

@RobStallion (Member)

relates to #45

We have created a function in the adapter that handles inserting data into the database. This function creates a cid and an entry_id (which we are considering renaming to id, see here) from the data passed in and adds them to the params passed to insert.

This will ensure that the cid and entry_id data is inserted into the database.

As a rule, we don't want to insert duplicate data because it wastes storage. We will use the generated cid value to check whether the data is a duplicate and, if so, "reject" the insert.

This works well with the current setup. However, imagine we have a table user which has one column called username.

Suppose our user is originally called batman but decides to change their name to the dark knight. This works: we insert a new row in the database.

Now the user decides that they want to change their name back to batman. If we try to insert this, we will get an error because we would be creating a duplicate cid value (which is obviously not what we want in this case).
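The behaviour described so far, including the rename-back problem, can be sketched in a few lines. This is only an illustration: `make_cid` and `Table` are hypothetical names, and a SHA-256 hex digest of the JSON-encoded data stands in for whatever the adapter actually uses to build the cid.

```python
import hashlib
import json


def make_cid(data: dict) -> str:
    """Hypothetical stand-in for the real cid: SHA-256 of the canonical JSON."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Table:
    """In-memory append-only table that rejects rows whose cid already exists."""

    def __init__(self):
        self.rows = []
        self.cids = set()

    def insert(self, data: dict) -> bool:
        cid = make_cid(data)
        if cid in self.cids:
            return False  # same content seen before: "reject" the insert
        self.cids.add(cid)
        self.rows.append({**data, "cid": cid})
        return True


users = Table()
results = [
    users.insert({"username": "batman"}),
    users.insert({"username": "the dark knight"}),
    users.insert({"username": "batman"}),  # legitimate rename back, yet rejected
]
```

Because the cid depends only on the content, the third insert is indistinguishable from a true duplicate, which is exactly the case the question is about.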

How do we want to handle this case?

  • Should we create another column in the table that keeps track of a duplicate_no?

We could add logic that checks whether the data being inserted is the same as the most recently inserted version. If it is, we "reject" the insert. If not, we increment the duplicate_no value in the db. Using this value when creating the cid would prevent duplicate cids from being created. (I'm sure we can come up with a better name than duplicate_no. This is just one idea for a possible solution.)
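A minimal sketch of the duplicate_no idea follows, under one possible reading of the proposal: duplicate_no is taken to be the number of earlier rows holding the same data, and it is mixed into the hash. `History`, `make_cid`, and the SHA-256 digest are all assumptions made for illustration, not the adapter's actual code.

```python
import hashlib
import json


def make_cid(data: dict, duplicate_no: int) -> str:
    """Hypothetical cid: SHA-256 over the data plus its duplicate_no."""
    payload = {"data": data, "duplicate_no": duplicate_no}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class History:
    """Append-only version history for a single entry."""

    def __init__(self):
        self.rows = []

    def insert(self, data: dict) -> bool:
        # Identical to the latest version: a true duplicate, so reject it.
        if self.rows and self.rows[-1]["data"] == data:
            return False
        # Count earlier rows with the same data (assumed meaning of duplicate_no).
        duplicate_no = sum(1 for row in self.rows if row["data"] == data)
        self.rows.append({
            "data": data,
            "duplicate_no": duplicate_no,
            "cid": make_cid(data, duplicate_no),
        })
        return True


user = History()
results = [
    user.insert({"username": "batman"}),
    user.insert({"username": "batman"}),           # unchanged: rejected
    user.insert({"username": "the dark knight"}),
    user.insert({"username": "batman"}),           # rename back: accepted
]
cids = [row["cid"] for row in user.rows]
```

With this reading, an immediate repeat is still rejected, while returning to an older value gets a fresh cid because its duplicate_no differs.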

Please let me know your thoughts and if you have any other ideas about this.

@RobStallion added the help wanted, question, and discuss labels Mar 4, 2019
@nelsonic (Member) commented Mar 7, 2019

@RobStallion really good question(s) here. Thank you for opening this issue to capture the discussion.

In the first instance, we need to redirect /batman to /darkknight so that people searching for /batman will find his new alias and see his profile. But when he decides to change back, /darkknight should redirect to /batman ...

Similar to how Prince (RIP) rebranded as "Symbol" (The Artist Formerly Known as Prince) ... https://en.wikipedia.org/wiki/Prince_(musician) and then a few years later changed back to "Prince".
If you search for "symbol musician", Prince's Wikipedia page comes up.
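The redirect behaviour described above can be sketched as a lookup from any name a user has ever held to that user's current name. `Aliases`, `rename`, and `resolve` are hypothetical names chosen for this example. Nothing here reflects an existing implementation.

```python
from typing import Optional


class Aliases:
    """Maps every name a user has held to that user's current name."""

    def __init__(self):
        self.owner_of = {}  # name -> user id
        self.current = {}   # user id -> current name

    def rename(self, user_id: int, name: str) -> None:
        self.owner_of[name] = user_id
        self.current[user_id] = name

    def resolve(self, name: str) -> Optional[str]:
        """Return the current name that `name` should redirect to, if any."""
        user_id = self.owner_of.get(name)
        if user_id is None:
            return None
        return self.current[user_id]


aliases = Aliases()
aliases.rename(1, "batman")
aliases.rename(1, "darkknight")
after_first_change = aliases.resolve("batman")    # old name follows the new alias
aliases.rename(1, "batman")
after_change_back = aliases.resolve("darkknight")  # now points back again
```

Storing the owner per name, instead of a chain of redirects, means a change back needs no special handling: every old name always resolves to whatever the owner is called now.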

This is all "fine" for a couple of name/alias changes. But what happens if someone spots this "loophole" and decides to programmatically change their name to each of the top 100k usernames on Twitter and every word in the dictionary? That person could then "own" the redirect for all of those usernames and words, i.e. every name redirects to "bob".

If we are using this feature in our own app(s), we will need to limit its use to avoid this "hack".
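One way such a limit might look is a cap on how many distinct names a single user can hold. The thread does not say how the limit should work, so the cap, the `AliasRegistry` class, and its `max_aliases` parameter are all assumptions made for this sketch.

```python
class AliasRegistry:
    """Caps the number of distinct names each user may claim (hypothetical)."""

    def __init__(self, max_aliases: int = 3):
        self.max_aliases = max_aliases
        self.names_by_user = {}  # user id -> set of names ever claimed

    def claim(self, user_id: int, name: str) -> bool:
        names = self.names_by_user.setdefault(user_id, set())
        if name in names:
            return True  # changing back to a name already held is always allowed
        if len(names) >= self.max_aliases:
            return False  # cap reached: refuse to hand out another redirect
        names.add(name)
        return True


registry = AliasRegistry(max_aliases=2)
results = [
    registry.claim(1, "batman"),
    registry.claim(1, "darkknight"),
    registry.claim(1, "batman"),  # change back, does not count against the cap
    registry.claim(1, "robin"),   # third distinct name: refused
]
```

A cap like this keeps the batman/darkknight case working while making it impossible for one account to collect redirects for thousands of names. A time-based rate limit would be another option.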
