-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Partly Stale Worker Cache Can Cause Cycles in TreeMap/TreeSet Iteration #24
Comments
Adding an image depicting the sort of update that can cause this issue, found by @liujed, from past discussions of this problem. |
Might be worth considering rewriting TreeMap to use a balanced tree structure that's not dependent on rotations like a B-tree for our TreeMap implementation? I'm not quite sure this would be free of the issue, but it seems like it doesn't invite the issue like a red-black tree. |
@liujed notes that we saw something like this with buckets in HashMap when resizing the map. |
A red-black tree can be implemented in a way that doesn't require imperative update to do tree rotations. See the CS 3110 notes. |
So the concern with doing a more "functional" version is that we don't have a way to delete persistent objects after they're put on the store, in Fabric. |
We could probably add that... |
I think this might be a big enough idea to merit a separate issue for discussion? I think that if we could find a general way to allow for temporary-yet-persistent objects to not lead to storage leaks on the store in Fabric, a lot of things would get better. |
Made an issue regarding the persistent garbage concern here: #25. |
We've noticed this in the past before moving the repository to github and I've run into the issue again, so I'm documenting the problem here.
If I remember correctly, we determined in the past that this occurred because one node in the underlying red black tree backing TreeMap was stale in the worker's cache and pointed to another node that was updated to become it's parent (done during tree rotations to balance the tree). The parent node, unlike the child, is up to date in the worker cache, points to the stale child, causing the infinite loop.
There should be a way to avoid this causing an infinite loop. In the FabIL code that I'm working on which triggered this issue, I've come up with a bit of a hack solution to detect an issue and retry the transaction (and update the cache). I set up a local Java set to track items we've seen from the iterator already, throwing an exception if there's a duplicate. Here's a snippet of FabIL code demonstrating this "hack":
This allows the transaction manager to check for stale objects on the store when going through the abort/retry loop.
Ideally, we could either ensure this can't happen to begin with or somehow build the invariant check I'm doing in the above snippet into the implementation of TreeMap.
The text was updated successfully, but these errors were encountered: