Skip to content

Conversation

graehl
Copy link

@graehl graehl commented Aug 30, 2022

At the cost of exposing some existing detail, enable efficient
unsharing of graph structure (DeepClone), and efficient ShallowClone.

Modify[Key]Values allows in-place modification (or deletion) of
node_data in a Map. (the Key is not modified but provided so the
modification of the value may depend on the key)

UnshareSubtrees for in-place DeepClone copying of
shared subtrees.

id and hash_value allow creating efficient input Node -> output Node
transformations where the result of a previously transformed subtree
may be reused (allowing linear time creation of output trees with
shared subtrees instead of exponential by

Rationale: although YAML::Node faithfully represents the input
document and allows efficient traversal, using it as a large-scale
configuration / AST means that it's easy to unintentionally create
quadratic or worse algorithms modifying the tree.

At the cost of exposing some existing detail, enable efficient
unsharing of graph structure (DeepClone), and efficient ShallowClone.

Modify[Key]Values allows in-place modification (or deletion) of
node_data in a Map. (the Key is not modified but provided so the
modification of the value may depend on the key)

UnshareSubtrees for in-place DeepClone copying of
shared subtrees.

id and hash_value allow creating efficient input Node -> output Node
transformations where the result of a previously transformed subtree
may be reused (allowing linear time creation of output trees with
shared subtrees instead of exponential by

Rationale: although YAML::Node faithfully represents the input
document and allows efficient traversal, using it as a large-scale
configuration / AST means that it's easy to unintentionally create
quadratic or worse algorithms modifying the tree.
@graehl
Copy link
Author

graehl commented Aug 30, 2022

Several of these changes can be made separate pull requests if desired. Tests passed locally and we have our fork is in production successfully. If there's only some subset that you'd like to maintain, we're prepared to keep the fork going, but ideally this work may be useful to others. Also happy to add docs.

@jbeder
Copy link
Owner

jbeder commented Sep 20, 2022

This is very interesting, thanks for the PR.

A couple high level-comments that we should work out before going into detail.

  1. It's worth being explicit about the problem. Could you give an example (of the bad behavior) in a few lines of code? Or maybe create a github gist with several examples?

  2. I'd prefer a factory pattern rather than constructors-with-tags. E.g. Node ShallowClone(Node) kinda thing.

  3. The API exposes too many internal details. The Node class in particular; I'd rather not add any unnecessary public methods. Can you rework it so that's the case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants