Add trace_object_immediately to EdgeVisitor #865

Draft
wants to merge 1 commit into
base: master

qinsoon (Member) commented Jul 6, 2023

This PR adds trace_object_immediately to EdgeVisitor. This gives bindings flexibility during object scanning: they can handle some fields specially and choose how to deal with each edge.

The motivation is that Julia objects may have some fields that are internal references. When scanning the object, we can compute the offset from the internal reference to the actual object reference. Currently we use enum JuliaEdge { Simple(SimpleEdge), Offset(OffsetEdge) } and struct OffsetEdge { slot, offset } to deal with this: the OffsetEdge knows the slot to load from and store to, and knows how to compute the actual object reference from the internal reference in the slot and the offset. However, this approach has drawbacks:

1. It makes enum JuliaEdge large (2 words + enum tag). We store a lot of edges and pass them around in MMTk core, so a large edge type has overhead. Considering that OffsetEdge is an infrequent case (maybe 10% in the workload I am looking at), paying that cost for every edge seems unnecessary.
2. We need to check which edge type it is every time an edge is processed. This also imposes overhead.

So I am trying to use just SimpleEdge(Address) for Julia. To achieve this, I need a different method on EdgeVisitor that I can use to deal with internal references. trace_object_immediately works for this case.
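To make the first drawback concrete, here is a minimal sketch that measures the two edge representations. The type definitions are stand-ins written for this example (plain `usize` fields instead of MMTk's `Address`), not the actual mmtk-core or mmtk-julia types, and `edge_sizes` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Stand-in for a plain slot address (assumption: one machine word).
#[derive(Clone, Copy)]
pub struct SimpleEdge {
    pub slot: usize,
}

/// Stand-in for an edge whose slot holds an internal reference.
/// `offset` is the distance from the object start to the internal reference.
#[derive(Clone, Copy)]
pub struct OffsetEdge {
    pub slot: usize,
    pub offset: usize,
}

/// The two-variant edge type described in the comment above.
#[derive(Clone, Copy)]
pub enum JuliaEdge {
    Simple(SimpleEdge),
    Offset(OffsetEdge),
}

/// Hypothetical helper: returns (bytes per SimpleEdge, bytes per JuliaEdge)
/// for the target this is compiled on.
pub fn edge_sizes() -> (usize, usize) {
    (size_of::<SimpleEdge>(), size_of::<JuliaEdge>())
}
```

With these stand-ins, `SimpleEdge` occupies one word, while `JuliaEdge` needs room for the two-word `OffsetEdge` payload plus a tag, so every edge pays for the larger variant even when it is a `Simple` one.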

wks (Collaborator) commented Jul 6, 2023

I think what we need is something between Scanning::scan_object and Scanning::scan_object_and_trace_edges. The former only allows enqueuing edges, while the latter only allows visiting edges immediately.

This PR essentially creates an API that allows both, and lets the VM decide which edge to enqueue, and which edge to process immediately.
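As a rough illustration of that "both" behaviour, the sketch below uses a toy heap (a slice of `usize`) and a hypothetical `EdgeVisitor` trait with two methods. The trait shape, the return type of `trace_object_immediately`, and the `Field`, `RecordingVisitor` and `scan_object` names are all assumptions made for this example, not the mmtk-core definitions.

```rust
/// Toy stand-ins for this example only; not the mmtk-core types.
pub type Address = usize; // index of a slot in the toy heap
pub type ObjectReference = usize;

/// Hypothetical visitor offering both options to the binding.
pub trait EdgeVisitor {
    /// Enqueue a slot; the core would load and trace it later.
    fn visit_edge(&mut self, slot: Address);
    /// Trace `object` right away and return its (possibly new) reference.
    /// The return type is an assumption of this sketch.
    fn trace_object_immediately(&mut self, object: ObjectReference) -> ObjectReference;
}

/// Toy visitor: records what it sees and pretends every traced object
/// moved forward by `move_delta`.
pub struct RecordingVisitor {
    pub enqueued: Vec<Address>,
    pub traced: Vec<ObjectReference>,
    pub move_delta: usize,
}

impl EdgeVisitor for RecordingVisitor {
    fn visit_edge(&mut self, slot: Address) {
        self.enqueued.push(slot);
    }
    fn trace_object_immediately(&mut self, object: ObjectReference) -> ObjectReference {
        self.traced.push(object);
        object + self.move_delta
    }
}

/// One field of a toy object. `interior_offset == 0` marks an ordinary
/// reference; a non-zero value marks an internal reference that points
/// `interior_offset` past the start of the referenced object.
pub struct Field {
    pub slot: Address,
    pub interior_offset: usize,
}

/// Binding-side scan: ordinary fields are enqueued, internal references
/// are resolved, traced immediately, and written back with the offset.
pub fn scan_object(heap: &mut [usize], fields: &[Field], visitor: &mut impl EdgeVisitor) {
    for field in fields {
        if field.interior_offset == 0 {
            visitor.visit_edge(field.slot);
        } else {
            let internal = heap[field.slot];
            let object = internal - field.interior_offset;
            let new_object = visitor.trace_object_immediately(object);
            heap[field.slot] = new_object + field.interior_offset;
        }
    }
}
```

In this sketch the binding never has to describe the internal reference to the core: it does the offset arithmetic itself and only hands over plain slots or plain object references.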

However, my concern is that MMTk needs an API that simply visits edges. I had a lengthy blog post (https://wks.github.io/blog/2022/05/16/fifteen-years-transitiveclosure.html) about how a simple object-scanning API (the equivalent of EdgeVisitor) eventually became entangled with other purposes (i.e. TransitiveClosure). If we want to design a contract between MMTk core and the VM binding for the specific purpose of letting the VM binding decide the most efficient way to handle an edge, we should make such an API explicit.

One alternative to this PR is, instead of adding methods to EdgeVisitor, we provide another method in Scanning, something like

fn scan_object_and_enqueue_or_trace_edges(
    object: ObjectReference,
    edge_enqueuer: impl EdgeVisitor,
    object_tracer: impl ObjectTracer
) -> ObjectReference

which lets the VM binding call either edge_enqueuer or object_tracer.
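A minimal sketch of how a binding might implement such a method follows. It keeps the two callbacks as separate single-purpose traits, as the signature suggests. Everything here is a toy: the heap is a `usize` slice, the `heap` and `field_offsets` parameters and the closure blanket impls are additions for this example, the traits are stand-ins rather than the mmtk-core ones, and the return value from the signature above is omitted because its meaning is not stated.

```rust
/// Toy stand-ins for this example only; not the mmtk-core types.
pub type Address = usize; // index of a slot in the toy heap
pub type ObjectReference = usize;

/// Only enqueues slots.
pub trait EdgeVisitor {
    fn visit_edge(&mut self, slot: Address);
}

/// Only traces objects, returning the (possibly new) reference.
pub trait ObjectTracer {
    fn trace_object(&mut self, object: ObjectReference) -> ObjectReference;
}

// Blanket impls so plain closures can serve as either callback (a
// convenience for this sketch).
impl<F: FnMut(Address)> EdgeVisitor for F {
    fn visit_edge(&mut self, slot: Address) {
        self(slot)
    }
}

impl<F: FnMut(ObjectReference) -> ObjectReference> ObjectTracer for F {
    fn trace_object(&mut self, object: ObjectReference) -> ObjectReference {
        self(object)
    }
}

/// Toy layout: the fields of `object` live at heap[object + i], and
/// `field_offsets[i]` is 0 for an ordinary reference or the distance of
/// an internal reference from the start of the object it points into.
pub fn scan_object_and_enqueue_or_trace_edges(
    heap: &mut [usize],
    object: ObjectReference,
    field_offsets: &[usize],
    mut edge_enqueuer: impl EdgeVisitor,
    mut object_tracer: impl ObjectTracer,
) {
    for (i, &offset) in field_offsets.iter().enumerate() {
        let slot = object + i;
        if offset == 0 {
            // Ordinary field: let the core process the slot later.
            edge_enqueuer.visit_edge(slot);
        } else {
            // Internal reference: resolve, trace now, store back.
            let target = heap[slot] - offset;
            let new_target = object_tracer.trace_object(target);
            heap[slot] = new_target + offset;
        }
    }
}
```

Compared with adding a method to EdgeVisitor, this shape leaves the edge-visiting trait with a single purpose and makes the "enqueue or trace" choice visible in the scanning method's signature.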

Other applications of this PR

By the way, I think this change has more applications than handling OffsetEdge.

When scanning an array, we can visit its elements immediately instead of enqueuing the edges (or edge ranges) for later processing. This is applicable to Julia, Ruby and OpenJDK alike.
