Simulate impact of shard movement using shard-level write load #131406

Open
wants to merge 9 commits into
base: main

Conversation

nicktindall
Contributor

@nicktindall nicktindall commented Jul 17, 2025

I've been back and forth on this a bit, but I think going for something simple is best. When we start receiving shard write load estimates from the nodes, we should be able to plug those in and this should "just work" (assuming I've understood shard-level write load correctly).

We ignore queue latency in the modelling because I don't think we're going to look at it in the decider, and I can't see how we could estimate how it would change in response to shard movements (it's a function of the amount the node is overloaded AND how long it's been like that, and back-pressure should ideally keep a lid on it).

@nicktindall nicktindall added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Jul 17, 2025
@elasticsearchmachine elasticsearchmachine added Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0 labels Jul 17, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@nicktindall nicktindall requested a review from mhl-b July 17, 2025 06:11

public class WriteLoadPerShardSimulator {

private final ObjectFloatMap<String> writeLoadDeltas;
Contributor

nit: simulatedNodesLoad?

Contributor Author

👍 I changed it to simulatedWriteLoadDeltas, since we only store the delta from the reported/original write load here. The idea is that if no delta is present, we can just return the original NodeUsageStatsForThreadPools instance.
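
A minimal sketch of the delta idea described above, not the PR's actual code. `NodeStats` is a hypothetical stand-in for `NodeUsageStatsForThreadPools`, and `simulateMove` and `simulatedStats` are illustrative names. It assumes a shard's write load can simply be subtracted from the source node and added to the target node.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteLoadDeltaSketch {

    // Hypothetical stand-in for NodeUsageStatsForThreadPools (assumption, not the real class).
    static final class NodeStats {
        final String nodeId;
        final float writeUtilisation;

        NodeStats(String nodeId, float writeUtilisation) {
            this.nodeId = nodeId;
            this.writeUtilisation = writeUtilisation;
        }
    }

    private final Map<String, NodeStats> reported;
    // Only the difference from the reported write load is stored, keyed by node ID.
    private final Map<String, Float> simulatedWriteLoadDeltas = new HashMap<>();

    WriteLoadDeltaSketch(Map<String, NodeStats> reported) {
        this.reported = reported;
    }

    // Record the effect of moving a shard: the source loses its load, the target gains it.
    void simulateMove(String fromNode, String toNode, float shardWriteLoad) {
        simulatedWriteLoadDeltas.merge(fromNode, -shardWriteLoad, Float::sum);
        simulatedWriteLoadDeltas.merge(toNode, shardWriteLoad, Float::sum);
    }

    // If no delta was recorded for the node, hand back the original instance untouched.
    NodeStats simulatedStats(String nodeId) {
        NodeStats original = reported.get(nodeId);
        Float delta = simulatedWriteLoadDeltas.get(nodeId);
        if (delta == null) {
            return original;
        }
        // Clamp at zero so a simulated load never goes negative (an assumption of this sketch).
        return new NodeStats(nodeId, Math.max(0f, original.writeUtilisation + delta));
    }
}
```

Nodes that no simulated movement touches cost nothing extra here: the lookup misses and the reported object is returned as-is.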

}
}
writeShardsOnNode.forEach(
shardId -> writeLoadPerShard.computeIfAbsent(shardId, k -> new Average()).add(writeUtilisation / writeShardsOnNode.size())
Contributor

Do you equally divide write-load across all write shards on node?

Contributor Author

Yeah, this is just a stop-gap until we get actual shard loads, which should work as a drop-in replacement.
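
For illustration, the stop-gap discussed in this thread can be sketched as follows. This is an assumption-laden sketch rather than the PR's implementation: `estimateShardWriteLoads`, the plain `Map` inputs, and this `Average` class are hypothetical. Each node's write utilisation is split equally across its write shards, and a shard seen on several nodes gets the average of its per-node shares.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EqualSplitSketch {

    // Running average of the shares attributed to one shard (hypothetical helper).
    static final class Average {
        private double sum;
        private int count;

        void add(double value) {
            sum += value;
            count++;
        }

        double get() {
            return count == 0 ? 0.0 : sum / count;
        }
    }

    // nodeWriteUtilisation: node ID -> reported write utilisation
    // writeShardsByNode:    node ID -> IDs of the write shards hosted on that node
    static Map<String, Double> estimateShardWriteLoads(
        Map<String, Double> nodeWriteUtilisation,
        Map<String, List<String>> writeShardsByNode
    ) {
        Map<String, Average> writeLoadPerShard = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : writeShardsByNode.entrySet()) {
            List<String> shards = entry.getValue();
            Double utilisation = nodeWriteUtilisation.get(entry.getKey());
            if (utilisation == null || shards.isEmpty()) {
                continue; // nothing to attribute for this node
            }
            double share = utilisation / shards.size();
            for (String shardId : shards) {
                writeLoadPerShard.computeIfAbsent(shardId, k -> new Average()).add(share);
            }
        }
        Map<String, Double> result = new HashMap<>();
        writeLoadPerShard.forEach((shardId, average) -> result.put(shardId, average.get()));
        return result;
    }
}
```

Once real per-shard write loads are reported, a function like this could be replaced by a direct lookup without changing its callers.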

Contributor

@mhl-b mhl-b Jul 17, 2025

OK, I was thinking maybe we should have some heuristic based on data that is already available. Otherwise the signal-to-noise ratio is too low: it's not uncommon to have hundreds of shards on a node, so an equal split attributes almost no load to any single shard.

For example, use a shardSize heuristic: the larger the shard, the more likely it is to have write load. Let's say we linearly increase the weight of a shard as its size approaches 15GB, and then decrease the weight as it approaches 30GB, since we would roll it over (most of the time): if size < 15GB then size/15GB, else max(0, 1 - size/30GB).
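
A sketch of the weighting proposed above, implemented exactly as the formula is written. `weight` and `distribute` are hypothetical names, and sizes are passed in GB for simplicity. Note that, as written, the weight rises towards 1 just below 15GB and is 0.5 at exactly 15GB, so the function has a jump at the threshold. The `distribute` step, which splits a node's write load in proportion to the weights and falls back to an equal split when every weight is zero, is an assumption of this sketch and was not part of the suggestion.

```java
public class ShardSizeWeightSketch {

    // Weight from the comment: size/15GB below 15GB, otherwise max(0, 1 - size/30GB).
    static double weight(double sizeGb) {
        if (sizeGb < 15.0) {
            return sizeGb / 15.0;
        }
        return Math.max(0.0, 1.0 - sizeGb / 30.0);
    }

    // Split a node's write load across its shards in proportion to their weights.
    // Falls back to an equal split if all weights are zero (assumption of this sketch).
    static double[] distribute(double nodeWriteLoad, double[] shardSizesGb) {
        int n = shardSizesGb.length;
        double[] weights = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            weights[i] = weight(shardSizesGb[i]);
            total += weights[i];
        }
        double[] loads = new double[n];
        for (int i = 0; i < n; i++) {
            loads[i] = total > 0.0 ? nodeWriteLoad * weights[i] / total : nodeWriteLoad / n;
        }
        return loads;
    }
}
```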

Contributor Author

We'll have actual shard write loads shortly. Hopefully we can avoid all this guessing entirely.

#131496

Contributor

@mhl-b mhl-b left a comment

LGTM

@nicktindall
Contributor Author

I might hold off on merging this until #131496 is merged; I think we can then avoid fudging the shard write loads.

@nicktindall nicktindall changed the title from "Estimate impact of shard movement using node-level write load" to "Simulate impact of shard movement using shard-level write load" Jul 21, 2025
3 participants