Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partly Stale Worker Cache Can Cause Cycles in TreeMap/TreeSet Iteration #24

Open
tmagrino opened this issue Sep 20, 2018 · 8 comments
Open
Labels

Comments

@tmagrino
Copy link
Member

We've noticed this in the past before moving the repository to github and I've run into the issue again, so I'm documenting the problem here.

If I remember correctly, we determined in the past that this occurred because one node in the underlying red black tree backing TreeMap was stale in the worker's cache and pointed to another node that was updated to become it's parent (done during tree rotations to balance the tree). The parent node, unlike the child, is up to date in the worker cache, points to the stale child, causing the infinite loop.

There should be a way to avoid this causing an infinite loop. In the FabIL code that I'm working on which triggered this issue, I've come up with a bit of a hack solution to detect an issue and retry the transaction (and update the cache). I set up a local Java set to track items we've seen from the iterator already, throwing an exception if there's a duplicate. Here's a snippet of FabIL code demonstrating this "hack":

fabric.util.TreeSet s = new fabric.util.TreeSet().fabric$util$TreeSet$();
// ... code here adding things to the set s.
atomic {
  java.util.HashSet itemsSeen = new java.util.HashSet();
  for (fabric.util.Iterator iter = s.iterator(); iter.hasNext();) {
    Object item = iter.next();
    if (itemsSeen.contains(item)) throw new IllegalStateException("This shouldn't happen");
    itemsSeen.add(item);
    // Stuff we actually wanted to do in the loop goes here.
  }
}

This allows the transaction manager to check for stale objects on the store when going through the abort/retry loop.

Ideally, we could either ensure this can't happen to begin with or somehow build the invariant check I'm doing in the above snippet into the implementation of TreeMap.

@tmagrino tmagrino added the bug label Sep 20, 2018
@tmagrino
Copy link
Member Author

Adding an image depicting the sort of update that can cause this issue, found by @liujed, from past discussions of this problem.

treebug

@tmagrino
Copy link
Member Author

Might be worth considering rewriting TreeMap to use a balanced tree structure that's not dependent on rotations like a B-tree for our TreeMap implementation? I'm not quite sure this would be free of the issue, but it seems like it doesn't invite the issue like a red-black tree.

@tmagrino
Copy link
Member Author

tmagrino commented Sep 20, 2018

@liujed notes that we saw something like this with buckets in HashMap when resizing the map.

@andrewcmyers
Copy link
Contributor

A red-black tree can be implemented in a way that doesn't require imperative update to do tree rotations. See the CS 3110 notes.

@tmagrino
Copy link
Member Author

So the concern with doing a more "functional" version is that we don't have a way to delete persistent objects after they're put on the store, in Fabric.

@andrewcmyers
Copy link
Contributor

We could probably add that...

@tmagrino
Copy link
Member Author

I think this might be a big enough idea to merit a separate issue for discussion? I think that if we could find a general way to allow for temporary-yet-persistent objects to not lead to storage leaks on the store in Fabric, a lot of things would get better.

@tmagrino
Copy link
Member Author

Made an issue regarding the persistent garbage concern here: #25.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants