filemanager: assuming Object Removed events occur when an object is overwritten #131

Closed
mmalenic opened this issue Mar 4, 2024 · 1 comment · Fixed by #136

mmalenic commented Mar 4, 2024

The current filemanager implementation assumes that an Object Removed event occurs when an object is replaced. However, this is not necessarily the case, and an object can be replaced using PutObject without triggering a Removed event. This causes a "duplicate key value violates unique constraint" error when trying to ingest the replaced object.
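
The failure mode can be reproduced in miniature. The following is only a sketch under stated assumptions: the table name `s3_object`, its columns, and the `ingest_created` helper are hypothetical, and SQLite stands in for the database the filemanager actually uses, so the error text differs from the "duplicate key value violates unique constraint" message quoted above.

```python
import sqlite3


def ingest_created(conn, bucket, object_key, version_id):
    """Record a Created event (hypothetical helper, not the filemanager's code)."""
    conn.execute(
        "INSERT INTO s3_object (bucket, object_key, version_id) VALUES (?, ?, ?)",
        (bucket, object_key, version_id),
    )


def overwrite_without_removed_event():
    """Ingest two Created events for the same object with no Removed in between."""
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified schema: one row per (bucket, key, version).
    conn.execute(
        """
        CREATE TABLE s3_object (
            bucket     TEXT NOT NULL,
            object_key TEXT NOT NULL,
            version_id TEXT NOT NULL,
            UNIQUE (bucket, object_key, version_id)
        )
        """
    )
    # First PutObject creates the object.
    ingest_created(conn, "bucket", "key", "version")
    try:
        # Second PutObject replaces it; no Removed event cleared the old row.
        ingest_created(conn, "bucket", "key", "version")
    except sqlite3.IntegrityError as err:
        return str(err)
    return None
```

Calling `overwrite_without_removed_event()` returns the constraint error raised by the second insert, which is the same situation the ingester hits when it expects a Removed event that never arrives.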

@mmalenic mmalenic self-assigned this Mar 4, 2024
@mmalenic mmalenic added filemanager an issue relating to the filemanager bug Something isn't working labels Mar 4, 2024

mmalenic commented Mar 5, 2024

Two issues in the filemanager code related to this:

  • I was a bit too quick with fix(filemanager): correctly deal with null version_id #125. It's true that null version_id values should not be treated as distinct, but this does not apply to the sequencer values, so applying nulls not distinct to both version_id and sequencer produces incorrect results. This makes sense: the version_id participates in the unique constraint, where a null value is a valid state, i.e. versioning is not enabled. The sequencer values, however, only represent the order of the actual events and should not participate in the unique constraint in the same way, i.e. they are only useful once they have values, and null just indicates that the information is missing for now.
    • I think the cleanest fix is to make version_id not null and give it a specific value when versioning is not enabled.
  • This select statement can return multiple rows, which is incorrect. For example, when the deleted sequencer is '3', entries with created sequencers of '1' and '2' both match the condition current_objects.created_sequencer < current_objects.input_deleted_sequencer. However, the only entry that should be updated is the one with '2', because it has the created sequencer closest to the deleted sequencer.
    • This can be fixed with an order by and limit 1 to select a single row.
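
Both points above can be sketched together. This is an illustration under stated assumptions, not the change made in the fix: the `s3_object` table, its columns, the `UNVERSIONED` sentinel value, and the helper functions are all hypothetical, and SQLite stands in for the real database. SQLite treats nulls in a unique constraint as distinct, which is the behaviour wanted for the sequencer here. The sequencers are single-character strings, so plain text comparison is enough for this example.

```python
import sqlite3

# Hypothetical sentinel meaning "versioning is not enabled"; the issue does
# not say which value the actual fix uses.
UNVERSIONED = "null"


def setup():
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified table. version_id is NOT NULL, so only the
    # sequencer can be null, and null sequencers never collide with each other.
    conn.execute(
        """
        CREATE TABLE s3_object (
            object_key        TEXT NOT NULL,
            version_id        TEXT NOT NULL,
            created_sequencer TEXT,
            UNIQUE (object_key, version_id, created_sequencer)
        )
        """
    )
    return conn


def add_created(conn, key, created_sequencer, version_id=UNVERSIONED):
    conn.execute(
        "INSERT INTO s3_object (object_key, version_id, created_sequencer)"
        " VALUES (?, ?, ?)",
        (key, version_id, created_sequencer),
    )


def candidates(conn, key, deleted_sequencer):
    """Created sequencers matching the bare '<' condition (can be several)."""
    rows = conn.execute(
        "SELECT created_sequencer FROM s3_object"
        " WHERE object_key = ? AND created_sequencer < ?",
        (key, deleted_sequencer),
    ).fetchall()
    return sorted(row[0] for row in rows)


def closest(conn, key, deleted_sequencer):
    """Only the created sequencer closest to the deleted sequencer."""
    row = conn.execute(
        "SELECT created_sequencer FROM s3_object"
        " WHERE object_key = ? AND created_sequencer < ?"
        " ORDER BY created_sequencer DESC LIMIT 1",
        (key, deleted_sequencer),
    ).fetchone()
    return row[0] if row else None
```

With created sequencers '1' and '2' stored for the same key, `candidates(conn, key, "3")` returns both, while `closest(conn, key, "3")` returns only '2'. Rows whose sequencer is still null can coexist under the sentinel version_id, whereas a second row with an identical non-null sequencer is rejected.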
