feat: improve datanode snapshot creation #11396

Draft: wants to merge 1 commit into develop from feature/improve-datanode-snapshot-creation-time

jeremyletang
Member

Currently, snapshots are created sequentially and in a blocking way in the datanode. This means that while a snapshot is being taken, no block can be processed.

The current approach is:

  • the database is locked with a transaction
  • queries are generated
  • one by one, each query is:
      • executed
      • its result piped into the file system
  • finally the lock is released, and later the files are added to IPFS.

The bottleneck here is that the results are saved to the file system as they arrive, which is unnecessary and accounts for 95% of the time spent snapshotting (and therefore blocking everything else).

To prevent this, we keep the results from the database in buffers, and only save them to files via a worker goroutine.
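
For illustration only, the buffering idea could be sketched roughly as below. This is not the code in this PR: `tableDump`, `fetchTable`, `snapshot` and the `write` callback are hypothetical names, and `fetchTable` merely stands in for running one query inside the locking transaction. The point is that the critical section only fills in-memory buffers, while a single worker goroutine does the slow writes.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// tableDump is a hypothetical unit of work: one query result held in memory.
type tableDump struct {
	name string
	data *bytes.Buffer
}

// fetchTable stands in for executing one query while the database is locked.
func fetchTable(name string) tableDump {
	buf := &bytes.Buffer{}
	fmt.Fprintf(buf, "rows of %s\n", name)
	return tableDump{name: name, data: buf}
}

// snapshot buffers every query result, hands the buffers to a worker
// goroutine, and returns as soon as the critical section is over. The
// returned function blocks until the worker has written everything.
func snapshot(tables []string, write func(name string, data []byte) error) func() error {
	// Sized so that sends never block inside the critical section.
	dumps := make(chan tableDump, len(tables))

	var (
		wg       sync.WaitGroup
		writeErr error
	)
	wg.Add(1)
	go func() {
		defer wg.Done()
		for d := range dumps {
			if writeErr != nil {
				continue // keep draining after the first failure
			}
			writeErr = write(d.name, d.data.Bytes())
		}
	}()

	// Critical section (assumed to run with the database lock held):
	// only memory is touched here, no file system access.
	for _, t := range tables {
		dumps <- fetchTable(t)
	}
	close(dumps)
	// The lock would be released here, before the files are complete.

	return func() error {
		wg.Wait()
		return writeErr
	}
}

func main() {
	dir, err := os.MkdirTemp("", "snapshot")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	wait := snapshot([]string{"orders", "trades"}, func(name string, data []byte) error {
		return os.WriteFile(filepath.Join(dir, name+".dump"), data, 0o644)
	})
	if err := wait(); err != nil {
		panic(err)
	}
	fmt.Println("snapshot files written to", dir)
}
```

The trade-off in a design like this is memory: every result sits in a buffer until the worker has flushed it, so block processing resumes sooner at the cost of a higher peak memory footprint.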

@jeremyletang jeremyletang force-pushed the feature/improve-datanode-snapshot-creation-time branch from 08f9a31 to 7b3aa5f Compare June 19, 2024 16:42
@jeremyletang jeremyletang self-assigned this Jun 24, 2024
@jeremyletang jeremyletang force-pushed the feature/improve-datanode-snapshot-creation-time branch 7 times, most recently from 9098e67 to 59b88d6 Compare June 25, 2024 10:03
@jeremyletang jeremyletang force-pushed the feature/improve-datanode-snapshot-creation-time branch 3 times, most recently from 0eadadc to 3978470 Compare July 2, 2024 18:36
Cache buffer size in datanode snapshot.

Signed-off-by: Jeremy Letang <[email protected]>
@jeremyletang jeremyletang force-pushed the feature/improve-datanode-snapshot-creation-time branch from 3978470 to 4e2961c Compare July 2, 2024 22:28
Projects
Status: In Progress

2 participants