Optimise WorldStatisticsProvider regionising #506
Merged
By using a BFS and a chunk coordinate -> chunk map, we can regionise in linear time rather than quadratic time. We can also remove the merging logic, as the BFS guarantees that all adjacent chunks end up in the same region.
The logic essentially works by selecting any chunk, performing a BFS over its neighbours to form a region, and repeating until no unassigned chunks remain. The coordinate map facilitates the initial chunk selection and BFS marking, on top of the lookup for adjacent neighbours. The coordinate map is linked so that the initial chunk selection is constant-time.
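The approach above can be sketched roughly as follows. This is an illustrative, self-contained version, not the actual Moonrise/WorldStatisticsProvider code; the class and method names, and the use of a plain `LinkedHashMap` in place of fastutil, are assumptions for the sketch:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RegioniseSketch {
    // Pack chunk coordinates into one long key: x in the low 32 bits, z in the high 32.
    static long key(int x, int z) {
        return (x & 0xFFFFFFFFL) | ((long) z << 32);
    }

    // Groups chunks into regions of chunks within mergeDistance of each other
    // (Chebyshev distance). The LinkedHashMap's iterator makes "pick any
    // unvisited chunk" constant-time, and each chunk is enqueued at most once,
    // so the total work is linear in (chunk count * neighbourhood size).
    static List<List<Long>> regionise(List<Long> chunks, int mergeDistance) {
        Map<Long, Boolean> unvisited = new LinkedHashMap<>();
        for (long c : chunks) unvisited.put(c, Boolean.TRUE);

        List<List<Long>> regions = new ArrayList<>();
        while (!unvisited.isEmpty()) {
            // Constant-time selection of the next region's starting chunk.
            long start = unvisited.keySet().iterator().next();
            unvisited.remove(start);

            List<Long> region = new ArrayList<>();
            ArrayDeque<Long> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                long curr = queue.poll();
                region.add(curr);
                int cx = (int) curr;
                int cz = (int) (curr >>> 32);
                // Probe every coordinate within the merge distance via the map;
                // removing on discovery doubles as the BFS "visited" marking.
                for (int dx = -mergeDistance; dx <= mergeDistance; ++dx) {
                    for (int dz = -mergeDistance; dz <= mergeDistance; ++dz) {
                        long neighbour = key(cx + dx, cz + dz);
                        if (unvisited.remove(neighbour) != null) {
                            queue.add(neighbour);
                        }
                    }
                }
            }
            regions.add(region);
        }
        return regions;
    }

    public static void main(String[] args) {
        // Two clusters separated by more than the merge distance -> two regions.
        List<Long> chunks = List.of(
            key(0, 0), key(1, 0), key(0, 1),
            key(10, 10), key(11, 10));
        System.out.println(regionise(chunks, 1).size()); // prints 2
    }
}
```

Because every chunk enters the queue exactly once and is removed from the map on discovery, no merging pass is needed afterwards: any two chunks within the adjacency distance are necessarily flooded into the same region.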
I noticed that this part of the code was taking excessive time while testing Moonrise on 1.21.5 with ~100k chunks loaded (using the larger view distance startup flag): the old quadratic regionising code accounted for >60% of the upload time. With the new regionising code, that drops to <1% of the upload time.
The adjacent distance checks are identical to the old code, so this will produce the exact same regions as before.
If fastutil were in the common package, we could remove the added Coordinate class and just use the long key directly. Using a Long in HashMap is not a solution, as the hashcode becomes `x ^ z`, which has many duplicates.
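To illustrate the collision problem: `Long.hashCode` is `(int)(v ^ (v >>> 32))`, so for a packed key with `x` in the low bits and `z` in the high bits it reduces to `x ^ z`, and e.g. every chunk on the `x == z` diagonal hashes to 0. A dedicated key class can mix the bits instead; the `Coordinate` class below is an illustrative stand-in, not the one added in this PR, and the multiplier is a standard golden-ratio mixing constant assumed for the sketch:

```java
public class CoordinateHashSketch {
    // Same packing as the regioniser: x in the low 32 bits, z in the high 32.
    static long key(int x, int z) {
        return (x & 0xFFFFFFFFL) | ((long) z << 32);
    }

    // Illustrative key class whose hashCode spreads the packed bits with a
    // multiply-and-fold mix instead of relying on Long.hashCode.
    record Coordinate(int x, int z) {
        @Override
        public int hashCode() {
            long h = key(x, z) * 0x9E3779B97F4A7C15L; // golden-ratio constant
            return (int) (h ^ (h >>> 32));
        }
    }

    public static void main(String[] args) {
        // Long.hashCode(packed) == x ^ z: whole diagonals collide into bucket 0.
        System.out.println(Long.hashCode(key(5, 5)));     // prints 0
        System.out.println(Long.hashCode(key(123, 123))); // prints 0
        // The mixed hash separates them (collision is overwhelmingly unlikely).
        System.out.println(new Coordinate(5, 5).hashCode()
                == new Coordinate(123, 123).hashCode());
    }
}
```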