-
Notifications
You must be signed in to change notification settings - Fork 7.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZOOKEEPER-4858: Remove the lock contention between snapshotting and the sync operation #2185
Conversation
@eolivelli can I get a review for this? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are u sure it's safe just removing it?
takeSnapshot and restoreSnapshot should be two mutually exclusive operation, you probably need to introduce some more specific lock mechanism to avoid object-level locking.
ccdf4e5
to
03fe4ea
Compare
Thanks for reviewing it, @anmolnar. Yes, we would still want exclusive locking between snapshot and restore to make sure data tree is not replaced by restoring operation while snapshotting is in progress. |
added a specific lock just for snapshot and restore operations. @anmolnar would you mind taking a look at it. Thanks. |
@anmolnar I've addressed your comment. Can you take a quick look at it? Thanks. |
…he sync operation Author: [email protected]
03fe4ea
to
e432c64
Compare
Are you sure that we don't need to synchronize between snapshotting and |
synchronized (snapshotAndRestoreLock) { | ||
// set to the new zkDatabase | ||
setZKDatabase(newZKDatabase); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you better extend the scope of sync block to createSessionTracker()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
protected void createSessionTracker() {
sessionTracker = new SessionTrackerImpl(this, zkDb.getSessionWithTimeOuts(), tickTime, createSessionTrackerServerId, getZooKeeperServerListener());
}
public File takeSnapshot(boolean syncSnap, boolean isSevere) throws IOException {
try {
synchronized (snapshotAndRestoreLock) {
snapFile = txnLogFactory.save(zkDb.getDataTree(), zkDb.getSessionWithTimeOuts(), syncSnap);
}
} catch (IOException e) {
public ConcurrentHashMap<Long, Integer> getSessionWithTimeOuts() {
return sessionsWithTimeouts;
}
takeSapshot()
and createSessionTracker()
of resore()
both read data from the sessionsWithTimeouts ConcurrentHashMap, any reason the sync block need to include it?
createSessionTracker()
is a post process after swapping the ZkDatabase object of a ZookeeperServer instance. sessionsWithTimeouts
is part of ZkDatabase
object and read/write sync is already covered by the sync block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be on the safe side. Learner
calls it within a sync block: https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L610
@kezhuw Thoughts? |
When has synchronized been added to take/restore snapshot? |
synchronized on the ZookeeperServer object was added when we introduced remote backup and restore feature, specifically #1943
Yes. I didn't know that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for pulling this too long. Lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
it causes lock contention on the ZookeeperServer object with the sync operation.
Does anyone have any cue about why sync
get synchronized
? I saw that sync
is called only in one thread and pendingSyncs
is concurrent safe.
I have no idea either. I think we could remove synchronize from |
ZOOKEEPER-136 added synchronized to sync() |
|
…he sync operation Reviewers: anmolnar, kezhuw Author: li4wang Closes #2185 from li4wang/ZOOKEEPER-4858 (cherry picked from commit c0e9241) Signed-off-by: Kezhu Wang <[email protected]>
Author: Li Wang [email protected]