Brainstorming subtree memoization. #12

jkelleyrtp · 2021-07-13T17:40:11Z

jkelleyrtp
Jul 13, 2021
Maintainer

There's a ton of ways to memoize subtrees in the LazyNodes calls. React or babel haven't implemented any of these yet, but theoretically, we would see very good performance on trees with few dynamic elements.

Many apps have lots of dynamic content, so they wouldn't benefit tremendously from these types of optimizations, however, it does save diffing various non-dynamic attributes.

Today

Today, Dioxus does a pointer-compare on attributes while diffing to memoize `static strs without having to compare them by value. This is a very light but effective form of subtree memoization because Rust does const/static merging in the .data section. IE two static strs will share the same pointer in release (and debug, but not always).

However, this isn't "true" subtree memoization - just an optimization that takes advantage of the fact that the same &'static strs share the same pointer. Even if all the pointer compares are the same, Dioxus will unnecessarily descend into the children and diff them too, even if the structure is the same every time. Ultimately, we waste a lot of branches and cycles constantly re-checking the same attributes.

There currently are flags wired up for is_static and propagating subtrees heuristics up during diffing, but we can't use them. When comparing two structures, we can only do eliminate the subtree if we are guaranteed the structure won't change between renders. However, this propagation fails when different structures are returned:

if val > 0.5 {
    rsx!(in cx, div {"a"}, div {"b"})
else {
    rsx!(in cx, div {"b"}, div {"a"})
}

Both of these structures will return as completely static, but are actually unique. If we skipped diffing because both were static, we're missing the point that they're entirely different structures.

The future

Here's a bunch of possible solutions. I'd like to use this discussion thread to brainstorm the best possible way to enable subtree diffing.

Idea: IDs per LazyNodes

One solution to the above problem is assigning a unique ID to every macro call. This would enable us to identify unique structures and only skip diffing them if their IDs are the same and they are static. This is slightly messy:

unnecessary IDs generated and inserted into the binary for every rsx! call
potential (very very rare) for an ID collision - bit of a nasty bug.
Doesn't exactly work with the factory API.

To understand #3, notice how someone using the factory API can circumvent these checks:

cx.render(LazyNodes::new("customid", move |f| {
    if val > 0.5 {
       f.element(div, &[], &[], &[], None)
    else {
       f.element(h1, &[], &[], &[], None)
    }
})

This is entirely valid and legal, however it is rather unlikely and can be seen as the "wrong" way to use Dioxus. Unfortunately, it is not simple to prevent. We also open ourselves up to potential memory safety issues by doing this memoization if Dioxus incorrectly memoizes structures that contain dynamic content. Memoization has to be guaranteed not to interfere with the management of lifetimes, so however this is implemented should try to rely on the compiler and type system rather than runtime heuristics.

Idea: IDs per element

It would be possible to implement the same solution as the IDs per macro call, but generate a custom compile-time ID for every VNode created. This solves some of the issues, but still suffers from things like dynamic children which can void consistency.

For this and the previous solution, the rsx! macro would have no trouble abiding by the rules, but the factory api would break down and potentially cause memory safety issues in safe code.

The obvious idea is to make subtree memoization unsafe. I'm leaning on avoiding using unsafe in macro calls like since it's harder to audit, harder to get right, and frankly a blunt hammer for a problem that can be solved in other ways.

Idea: Hashing the input itself.

This an idea I've seen presented in InfernoJS's discussion boards where the inputs themselves are hashed. IE you would hash the contents of the rsx! call at various levels, and compare the static hashes. We could provide unsafe methods that take the hash (effectively an ID with no collisions), but it's essentially the same idea as an ID per macro call.

Idea: creating a const version

This is an idea I've seen presented in Inferno's disucssion boards where a babel transform identifies static subtrees and creates a const version of the tree and copies it every time the JSX macro is used. In Rust, we would save the node to a const and clone it whenever we use it. I've experimented with this a bit, and it seems to work OK, but has weird copy-on-write issues. Consts in Rust get inserted directly to wherever they're called, so it's frankly no better than writing the structure out directly but with an ID.

Idea: using statics

This is the idea I like the most. Static subtrees are essentially collections of nodes with no listeners and no dynamic content. We would only skip diffing a static subtree if it were completely static, which can we can enforce at compile time by forcing nodes to only accept &'static. Due to how VNodes are currently declared, we can really only create static references, not values.

So a static element's children would be defined as such:

static ElChildren: &[VNode] = &[
    VNode::static_text("asd"),
    VNode::static_el("div", &[], &[]) // attributes, chldrne
];

We've essentially built a static list of children with a re-vistable static pointer, meaning we can perform a pointer compare on list of children. If the child list shares the same pointer, we can completely ignore diffing the children.

The one primary drawback here is the issue of interior mutability... We currently use the DomId flag to store the ID of the real mounted node in the virtual element itself. This is done through a Cell<RealDomNode>. Cell is not threadsafe, so we cannot even construct the VNode in a static context. Additionally, we would have to clone the vnode list anyways so two different structures can have different mounted nodes.

The way to solve this would be to separate the VNode from its mounted root completely. The constify PR is doing this, but I just haven't figured out the right pattern that completely detaches the two, because elements like fragments take a children field which is just a slice of VNodes - IE ones that are mounted.

Call for help

I've explored these solutions above, but this is just scratching the surface of what we can do with all the infrastructure of the Rust compiler. I'm curious what ideas the community has in improving our subtree memoization. All ideas welcome!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Brainstorming subtree memoization. #12

{{title}}

Replies: 0 comments

Select a reply

Brainstorming subtree memoization. #12

jkelleyrtp Jul 13, 2021 Maintainer

Today

The future

Idea: IDs per LazyNodes

Idea: IDs per element

Idea: Hashing the input itself.

Idea: creating a const version

Idea: using statics

Call for help

Replies: 0 comments

jkelleyrtp
Jul 13, 2021
Maintainer