Batch skylinks on /sweep #49

Open
ro-tex opened this issue Aug 9, 2022 · 0 comments
Collaborator

ro-tex commented Aug 9, 2022

Overview

The POST /sweep action is slow (~2.25 hours for ~145k files on dev1). The main reason is that we lock each skylink individually and then update it individually. That means many function calls and, more importantly, many calls to MongoDB.

We can address this by adding an optional batch mode for sweeping. In that mode we can lock the files either individually (safer) or all at once (potentially disrupts* other servers). Once the files are locked, we can update and unlock them in batches of 100 or more.

*: The potential disruption to other servers matters because it does more than delay them until the sweep is done. If we lock all skylinks and another server then checks for skylinks it needs to work on, it won't find any. It will assume it's done with its sweep/scan and won't perform a new one until the following day or until a manual request. This is potentially problematic and should be avoided.
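The batched update-and-unlock step described above could be sketched roughly as follows. This is a minimal illustration, not the pinner's actual code: the `batchStore` interface, its `UpdateAndUnlock` method, and the in-memory `countingStore` are all hypothetical stand-ins for whatever the real database layer exposes.

```go
package main

import (
	"errors"
	"fmt"
)

// batchStore is a hypothetical interface; the real database
// layer may look different.
type batchStore interface {
	// UpdateAndUnlock updates and unlocks all given skylinks
	// in what would be a single database call.
	UpdateAndUnlock(skylinks []string) error
}

// countingStore is an in-memory stand-in that only counts calls.
type countingStore struct {
	calls    int
	unlocked int
}

func (s *countingStore) UpdateAndUnlock(skylinks []string) error {
	s.calls++
	s.unlocked += len(skylinks)
	return nil
}

// sweepInBatches passes already-locked skylinks to the store
// in slices of at most batchSize elements.
func sweepInBatches(store batchStore, locked []string, batchSize int) error {
	if batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	for start := 0; start < len(locked); start += batchSize {
		end := start + batchSize
		if end > len(locked) {
			end = len(locked)
		}
		if err := store.UpdateAndUnlock(locked[start:end]); err != nil {
			return fmt.Errorf("batch starting at %d: %w", start, err)
		}
	}
	return nil
}

func main() {
	locked := make([]string, 250)
	for i := range locked {
		locked[i] = fmt.Sprintf("skylink-%d", i)
	}
	store := &countingStore{}
	if err := sweepInBatches(store, locked, 100); err != nil {
		panic(err)
	}
	fmt.Println(store.calls, store.unlocked) // 3 250
}
```

The sketch stops at the first failed batch and reports where it failed. Whether the skylinks of a failed batch should stay locked or be released is an open question that this sketch does not address.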

Design or Proposal
