DeltaBundler GC and HMR performance #1514
motiz88 started this conversation in Bundle Working Group
I've been working on bugfixes for DeltaBundler (the core set of incremental build algorithms within Metro), see e.g. facebook/metro#819. In the course of figuring out how it all works I've identified an opportunity to simplify the code, root out some lingering edge cases, and possibly even gain performance benefits for HMR / Fast Refresh. This is partly a brain dump and partly an RFC on what I want to do next.
Context: Cycle detection in traverseDependencies
The main traversal algorithm (traverseDependencies -> processModule) is a DFS whose purpose is to transform all potentially reachable modules, update the graph based on the discovered dependencies, and compute the "delta" from the previous graph (the sets of modules added, deleted and modified since the last traversal).

Originally, modules would be removed from the graph only when their set of inverse dependencies became empty during the main traversal. Back in facebook/metro#506, we added a form of cycle detection in an attempt to guarantee that unreachable modules would be removed even in the presence of cycles. This took the form of kicking off sub-traversals to scan for cycles "backwards" whenever we deleted an edge from the graph.
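To make the shape of this concrete, here's a minimal sketch of the forward traversal in TypeScript. Everything in it (the `Module`/`Graph`/`Delta` shapes, the `transform()` stub, the synchronous recursion) is a simplification for illustration rather than Metro's actual types or signatures; in particular, the real traversal is asynchronous and runs transforms on parallel workers.

```typescript
// Minimal sketch of the forward traversal; illustrative shapes, not Metro's real types.
type ModulePath = string;

interface Module {
  path: ModulePath;
  dependencies: Set<ModulePath>;        // forward edges
  inverseDependencies: Set<ModulePath>; // reverse edges ("refcount with metadata")
}

interface Graph {
  entryPoints: Set<ModulePath>;
  modules: Map<ModulePath, Module>;
}

interface Delta {
  added: Set<ModulePath>;
  modified: Set<ModulePath>;
  deleted: Set<ModulePath>;
}

// Stand-in for the transformer workers (parallel and async in reality):
// returns the dependencies discovered in a module's current source.
function transform(path: ModulePath): Set<ModulePath> {
  // ...parse the file at `path` and extract its import/require specifiers...
  return new Set();
}

function traverseDependencies(dirtyPaths: Set<ModulePath>, graph: Graph): Delta {
  const delta: Delta = {added: new Set(), modified: new Set(), deleted: new Set()};

  function processModule(path: ModulePath): void {
    const existing = graph.modules.get(path);
    const deps = transform(path);
    if (existing == null) {
      // Newly discovered module: add it to the graph.
      graph.modules.set(path, {path, dependencies: deps, inverseDependencies: new Set()});
      delta.added.add(path);
    } else {
      existing.dependencies = deps;
      delta.modified.add(path);
    }
    for (const dep of deps) {
      if (!graph.modules.has(dep)) {
        processModule(dep); // DFS into newly reachable modules
      }
      graph.modules.get(dep)!.inverseDependencies.add(path);
    }
  }

  for (const path of dirtyPaths) {
    processModule(path);
  }
  // Removed edges and the resulting deletions are handled separately; that is
  // exactly where the GC discussion below comes in.
  return delta;
}
```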
While very useful and powerful, the algorithm we ended up with - multiple mutating backwards sub-traversals inside a mutating forwards main traversal - is hard to reason about. Even if the implementation is fully correct, just keeping it that way is a challenge. Add to this the nondeterminism of filesystems and parallel transformer workers, and this becomes a pretty difficult area of the Metro codebase.
Recently, I had the insight that a big part of the problem that traverseDependencies solves is analogous to garbage collection. Could we evaluate standard GC approaches and see if they had any benefits?

Theoretical framework: Garbage collection
First, to lay out the analogy: modules in Metro's mutable Graph data structure are like objects in a heap, dependencies between them are references, and entry points are roots (objects that are always live by definition). After a mutation, we want to know precisely[1] which modules are no longer reachable from the entry points, and delete them. The original DeltaBundler started out with just reference counting[2] (which on its own has a well-known problem with cycles); our homegrown cycle detection logic can be seen as a form of tracing, and together they form a GC algorithm.

[1] "Precisely" is a hint that we're looking for a "stop-the-world" GC as opposed to an incremental one, because we want a strong guarantee that all the garbage has been collected at a specified point in time.
[2] Technically, inverse dependency tracking, which for our purposes is just reference counting with extra metadata.
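For illustration, here's what the pure refcounting / inverse-dependency path looks like on the sketched Graph shape above. This is the easy half of the problem, and the part that cycles defeat: two unreachable modules that depend on each other never see their inverse-dependency sets go empty.

```typescript
// Plain reference counting over inverse dependencies (same illustrative shapes
// as the earlier sketch). On its own this leaks cycles.
function removeEdge(graph: Graph, from: ModulePath, to: ModulePath): void {
  const target = graph.modules.get(to);
  if (target == null) {
    return;
  }
  target.inverseDependencies.delete(from);
  if (target.inverseDependencies.size === 0 && !graph.entryPoints.has(to)) {
    // "Refcount" reached zero: definitely unreachable, delete immediately and
    // release the module's own outgoing edges.
    graph.modules.delete(to);
    for (const dep of target.dependencies) {
      removeEdge(graph, to, dep);
    }
  }
}
```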
A Unified Theory of Garbage Collection is a paper that presents a great summary of standard GC algorithms, their similarities and differences. Of those, Reference Counting with Trial Deletion seemed very appropriate as a way to restructure DeltaBundler's algorithm at a reasonable level of effort: we could keep the basic refcounting implementation and just focus on implementing a new cycle collector, replacing facebook/metro#506.
The algorithm is presented in greater detail in Concurrent Cycle Collection in Reference Counted Systems as the Synchronous Cycle Collection algorithm. Briefly, the idea is that instead of immediately scanning for cycles during a mutation, we can queue these scans whenever a module's reference count is decremented without reaching zero, making the module a "potential cycle root". (When it does reach zero, the module is definitely unreachable and can be deleted immediately.) We can process this queue at any time, and when we do, we can do it in O(nodes+edges) by treating the transitive closure of all potential cycle roots as a single graph.
The O(nodes+edges) complexity is of course comparable to doing a full DFS from the entry points to determine reachability (a classic "mark and sweep"). As they say: "not great, not terrible".
<hand-waving>
The nice thing about keeping track of potential cycle roots is that if they're typically "far" from the entry point, and there aren't huge cycles involving ~the whole bundle, we can often get away with traversing a much smaller subgraph in practice.
</hand-waving>
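Below is a sketch of the synchronous cycle collection algorithm translated to this setting, following the paper's structure (the color scheme, the roots buffer, and trial deletion via markGray/scan/collectWhite). The GCNode shape and all the names are illustrative rather than Metro's API, and the increment path (bump the count, re-blacken the node) is omitted for brevity.

```typescript
// Sketch of Synchronous Cycle Collection (Bacon & Rajan) mapped onto the
// module graph; names follow the paper, not Metro's existing code.
type Color = 'black' | 'gray' | 'white' | 'purple';

interface GCNode {
  path: string;
  children: Set<GCNode>; // forward dependencies
  refCount: number;      // |inverseDependencies|
  color: Color;
  buffered: boolean;
}

const roots = new Set<GCNode>();   // queued potential cycle roots
const garbage = new Set<GCNode>(); // modules found unreachable (to remove from the Graph)

// Called whenever an edge into `node` is removed.
function decrementRef(node: GCNode): void {
  node.refCount--;
  if (node.refCount === 0) {
    release(node);         // definitely unreachable: collect right away
  } else {
    node.color = 'purple'; // might be the root of an unreachable cycle
    if (!node.buffered) {
      node.buffered = true;
      roots.add(node);
    }
  }
}

function release(node: GCNode): void {
  for (const child of node.children) decrementRef(child);
  node.color = 'black';
  if (!node.buffered) garbage.add(node);
}

// Can run at any time; cost is O(nodes + edges) over the subgraph reachable
// from the queued roots, not over the whole bundle.
function collectCycles(): void {
  for (const r of roots) markGray(r); // trial-delete internal edges
  for (const r of roots) scan(r);     // anything still referenced externally survives
  for (const r of roots) {
    r.buffered = false;
    collectWhite(r);                  // whatever stayed white is an unreachable cycle
  }
  roots.clear();
}

function markGray(n: GCNode): void {
  if (n.color === 'gray') return;
  n.color = 'gray';
  for (const child of n.children) {
    child.refCount--;                 // trial deletion of the edge n -> child
    markGray(child);
  }
}

function scan(n: GCNode): void {
  if (n.color !== 'gray') return;
  if (n.refCount > 0) {
    scanBlack(n);                     // external reference found: undo trial deletions
  } else {
    n.color = 'white';
    for (const child of n.children) scan(child);
  }
}

function scanBlack(n: GCNode): void {
  n.color = 'black';
  for (const child of n.children) {
    child.refCount++;
    if (child.color !== 'black') scanBlack(child);
  }
}

function collectWhite(n: GCNode): void {
  if (n.color !== 'white' || n.buffered) return;
  n.color = 'black';
  for (const child of n.children) collectWhite(child);
  garbage.add(n);
}
```

The key property is that collectCycles only visits the subgraph reachable from the queued potential roots, which is what makes the O(nodes+edges) bound a worst case rather than the typical cost.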
So this seems to meet the basic goals of our current cycle collector (minimal perf impact while guaranteeing correctness), and meets our architectural goals (cleanly separating the main traversal from the GC algorithm).

Bonus: When should GC run?
The approach described above successfully decouples the main traversal from the GC pass (with good performance for the latter). Still, the way DeltaBundler currently works, we'd need to always perform a GC pass immediately to guarantee that unreachable modules are deleted. This opens up an interesting question: when do we actually need to guarantee this? Can we get away with skipping GC passes sometimes?
It seems that we have some cases that require a GC and some that don't:
[3] Note that this is another problem in the current DeltaBundler! At the moment, removing a bad module from the graph is not sufficient to clear a syntax error that's blocking Fast Refresh.
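As a rough illustration of what "skipping GC passes" could look like, here's a hypothetical wrapper over the sketches above: HMR updates only run the forward traversal and let potential cycle roots accumulate, and the root queue is drained only at a point where we actually need the precise set of reachable modules. The class and method names are made up for this example, and it hand-waves over keeping the GC bookkeeping in sync with the module graph.

```typescript
// Illustrative only: not an existing Metro class.
class IncrementalGraph {
  private gcPending = false;

  constructor(private graph: Graph) {}

  applyFileChange(dirtyPaths: Set<ModulePath>): Delta {
    const delta = traverseDependencies(dirtyPaths, this.graph);
    this.gcPending = true; // unreachable cycles may linger, which is fine mid-session
    return delta;
  }

  getReachableModules(): Map<ModulePath, Module> {
    if (this.gcPending) {
      collectCycles();     // "stop-the-world" point: all garbage is gone after this
      this.gcPending = false;
    }
    return this.graph.modules;
  }
}
```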
Code (WIP)
I am working on the GC algorithm over at facebook/metro#820. (There be dragons in this branch!) Assuming it works out, I will later share another branch with the change that makes GC optional.
Future work
We should really have some kind of benchmarking framework for Metro - it would be nice to objectively measure the concrete performance impact of a change like this. At any rate, I'm more interested in the correctness and maintainability benefits of this architectural change, and any perf benefits are secondary (I don't think Metro's HMR performance is currently an issue on its own).