SFTP does not delete processed files #2

Open
slifty opened this issue Jul 5, 2019 · 2 comments

Comments

@slifty
Contributor

slifty commented Jul 5, 2019

Right now, if someone uploads a file via SFTP, we do not delete or move it after processing. This means that over time the SFTP buckets will keep growing for no reason.

There is some value in this, since the user can see that the file has been uploaded.

Let's find a way to improve this process so we don't have files duplicated for long periods of time after proper processing.
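
One possible shape for this, as a rough sketch only: keep each processed file visible for a grace period so the user can still see the upload, then mark it for deletion. The function name `keys_to_delete`, the mapping of keys to processing times, and the 7-day default are all assumptions for illustration, not anything that exists in the project. The actual delete call against the bucket is left out.

```python
from datetime import datetime, timedelta, timezone


def keys_to_delete(processed, now, grace=timedelta(days=7)):
    """Return the keys whose processing finished at least `grace` ago.

    `processed` is a hypothetical mapping of object key -> datetime at which
    processing completed. The 7-day default is an arbitrary placeholder.
    """
    return sorted(
        key
        for key, processed_at in processed.items()
        if now - processed_at >= grace
    )


if __name__ == "__main__":
    now = datetime(2019, 7, 12, tzinfo=timezone.utc)
    processed = {
        "uploads/old.jpg": datetime(2019, 7, 1, tzinfo=timezone.utc),
        "uploads/new.jpg": datetime(2019, 7, 10, tzinfo=timezone.utc),
    }
    # Only the file processed more than 7 days ago is selected.
    print(keys_to_delete(processed, now))
```

Keeping the selection logic separate from the deletion itself would make the grace period easy to test and tune.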

@reefdog

reefdog commented Jul 5, 2019

Out of curiosity, what would happen if a user changed the name of a file that was already on the server (and had already otherwise been processed)? Would that trigger a new import, or would a de-duper notice it was the same already-processed file and ignore it?

@isTravis
Member

isTravis commented Jul 5, 2019

It seems that changing the metadata, key, etc. of an S3 file creates a new copy of that file. We use the ObjectCreated event, and this Stack Overflow answer seems to suggest that ObjectCreated will be re-triggered. So it seems that a rename would re-trigger processing.

That said, I believe the de-duper would catch it (the hash is generated from the file's binary contents, not the S3 key name of the file).
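
To illustrate why a rename should not slip past a content-based check, here is a minimal sketch. It is not the project's actual de-duper: the `Deduper` class, the `should_process` method, and the choice of SHA-256 are assumptions made for the example. The point is only that the digest depends on the bytes, so two different keys with identical contents collide.

```python
import hashlib


def content_hash(data):
    """Hash the file bytes only; the object key plays no part."""
    return hashlib.sha256(data).hexdigest()


class Deduper:
    """Hypothetical in-memory de-duper keyed on content hash."""

    def __init__(self):
        # Maps content digest -> first key seen with that content.
        self._seen = {}

    def should_process(self, key, data):
        digest = content_hash(data)
        if digest in self._seen:
            # Same bytes were already processed under some key.
            return False
        self._seen[digest] = key
        return True


if __name__ == "__main__":
    deduper = Deduper()
    photo = b"same file contents"
    print(deduper.should_process("uploads/photo.jpg", photo))    # first upload
    print(deduper.should_process("uploads/renamed.jpg", photo))  # renamed copy
```

In this sketch the renamed copy would still fire a new event, but the hash lookup would reject it before any processing happens.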
