Use multiple CellDb instances to read from celldb concurrently #1363
+145
−65
Background

When running a TON liteserver that we built ourselves, we noticed that as the number of requests increases, the response time of some requests (such as `GetAccountState`) grows, and many of them fail with timeout errors.

After investigation, we found that most of the processing time of a single `GetAccountState` request is spent waiting for `CellDb::load_cell` to be scheduled. Below are the timing statistics we added for one specific `GetAccountState` request:

We can see that `perform_getAccountState` took 1463946 μs in total, of which 1400909 μs was spent on scheduling `CellDb::load_cell`. Scheduling `CellDb::load_cell` therefore accounts for most of the time.

Fix
Tasks sent to the same actor ID are executed one at a time. Because there is only one CellDb, all load-cell operations are queued and executed sequentially. With so many of them waiting to run, scheduling `CellDb::load_cell` takes a long time.

The solution is to increase the number of CellDb objects so that `CellDb::load_cell` calls can execute concurrently.

Result
Below are our test results. The test sends 5000 `GetAccountState` requests simultaneously, records the response time of each request, and then computes the number of timeout errors and the average response time.

Before optimization:

After optimization:

After enabling concurrent `CellDb::load_cell` calls, the average response time dropped from 3.8 seconds to around 1 second.
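The queueing argument from the Fix section can be illustrated with a small model. This is only a sketch and not the code in this PR: `ReaderPool`, `dispatch`, `worst_queue_position`, and the round-robin routing are hypothetical names and choices made for illustration, and the model counts queue positions only, ignoring the actual cost of a cell load and any contention in the underlying storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a set of actors: each reader has its own FIFO mailbox
// and handles the requests in it strictly one at a time.
class ReaderPool {
 public:
  explicit ReaderPool(std::size_t reader_count) : depth_(reader_count, 0) {
    assert(reader_count > 0);
  }

  // Routes one load request to the next reader in round-robin order and
  // returns how many requests are already queued ahead of it on that reader.
  std::size_t dispatch() {
    std::size_t ahead = depth_[next_]++;
    next_ = (next_ + 1) % depth_.size();
    return ahead;
  }

 private:
  std::vector<std::size_t> depth_;  // pending requests per reader
  std::size_t next_ = 0;            // reader that receives the next request
};

// Worst-case number of requests queued ahead of any single request when
// `requests` loads arrive at once and are spread over `reader_count` readers.
std::size_t worst_queue_position(std::size_t requests, std::size_t reader_count) {
  ReaderPool pool(reader_count);
  std::size_t worst = 0;
  for (std::size_t i = 0; i < requests; ++i) {
    std::size_t ahead = pool.dispatch();
    if (ahead > worst) worst = ahead;
  }
  return worst;
}
```

In this model, 5000 loads arriving at once leave the last request behind 4999 others when there is a single reader, and behind 1249 others when there are four readers (four is an arbitrary example value). The scheduling delay of a request scales with the number of tasks queued ahead of it, which is why adding readers shortens the wait.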