-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory store leader election - REAP-2 #1533
Conversation
@adejanovski this is very tasty but your CI is failing at the moment, and can you remind me how to manually test this :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put a bunch of feedback as a first review. I'll likely want to consider this more - this isn't an easy PR so nice work on getting something together. It feels like its very close.
I have a bunch of comments around naming since I had a bit of a tricky time understanding some of what was going on at first. These are always nits, but given the functionality is complex I'd love to see some of them straightened out for future readers.
I also think we could handle the locking we're doing much better, and suggest we revisit that now given some of the use cases we want to put this into. For a first cut this is great, but I think a bit of improvement there might do us a world of good. Improvements might include:
- Better segmentation of what needs to be locked, since the global lock you're using isn't scalable.
- Being more discerning about where you're locking. You probably only need to lock on writes, but you're often acquiring the lock before reads at entry into the method.
Again, this is kind of a minor tweak but I think it'll avoid headaches down the road.
src/server/src/main/java/io/cassandrareaper/storage/MemoryStorageFacade.java
Outdated
Show resolved
Hide resolved
try { | ||
long currentTime = System.currentTimeMillis(); | ||
// Check if any replica is already locked by another runId | ||
for (String replica : replicas) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think the pattern in this codebase is to the use the streams API instead of for loops where possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point, I'm refactoring this to use streams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you might have missed this one in your latest commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're good now, no more loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is still a loop here for (String replica : replicas) {
was this something you wanted to change?
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Outdated
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Outdated
Show resolved
Hide resolved
src/server/src/test/java/io/cassandrareaper/service/RepairRunnerHangingTest.java
Outdated
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Outdated
Show resolved
Hide resolved
|
||
private final ConcurrentHashMap<String, LockInfo> replicaLocks = new ConcurrentHashMap<>(); | ||
private final ConcurrentHashMap<UUID, Set<UUID>> repairRunToSegmentLocks = new ConcurrentHashMap<>(); | ||
private final ConcurrentHashMap<UUID, ReentrantLock> runIdLocks = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue: I thought you said you weren't going to divide the locks up by runID? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I tried but then rolled back. Not far enough it seems :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to roll this back? I thought we were reverting to using a global lock.
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
I'm starting to struggle with the length of this review, so let's propose a sketch to do just one method without external locks:
This also requires an update:
What this does:
|
From discussions with Alex earlier:
I feel like there's some concurrent bimap that might resolve this whole state, but given that I've been reminded that we will currently only be managing one cluster per instance in the proposed use case we have, we can probably defer this more complicated implementation for later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put more comments as it seems you've said that you've made certain updates in comments but then they aren't all reflected in the code. You may have rolled some back accidentally while rolling back other changes (esp the use of a map of locks).
I'm approving anyway since the tests pass and I don't want you to be blocked from merging pending my review. Whether you make the rest of the changes we've discussed is up to you, since I think they are all non-blocking and this is functionally correct AFAICT.
.map(replica -> replicaLocks.get(getReplicaLockKey(replica, runId))) | ||
.anyMatch(lockInfo -> lockInfo != null | ||
&& lockInfo.expirationTime > currentTime && lockInfo.runId.equals(runId)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue: This looks wrong, you're saying to return false
only if the value of this entry has runId
equal to the runId
you're trying to lock. Don't you want to return false
regardless of who holds the lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we allow concurrency on replicas across repair runs. Its within a repair run that we disallow it.
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
|
||
private final ConcurrentHashMap<String, LockInfo> replicaLocks = new ConcurrentHashMap<>(); | ||
private final ConcurrentHashMap<UUID, Set<UUID>> repairRunToSegmentLocks = new ConcurrentHashMap<>(); | ||
private final ConcurrentHashMap<UUID, ReentrantLock> runIdLocks = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to roll this back? I thought we were reverting to using a global lock.
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
src/server/src/main/java/io/cassandrareaper/storage/memory/ReplicaLockManagerWithTtl.java
Show resolved
Hide resolved
try { | ||
long currentTime = System.currentTimeMillis(); | ||
// Check if any replica is already locked by another runId | ||
for (String replica : replicas) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is still a loop here for (String replica : replicas) {
was this something you wanted to change?
Fixes #1519
The memory storage backend didn't implement any if the
IDistributedStorage
leader election methods.This was ok until we started using leader election for segment scheduling, which doesn't require to run in a distributed mode.
As a consequence, Reaper using the mem store would schedule one new segment at each poll, even if the replicas were already busy processing another segment.
This PR introduces a class which manages locks on replicas for segments and moves the required methods from
IDistributedStorage
toIStorage
so they can be implemented in the memory storage implementation.