-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yul: Introduces ASTNodeRegistry #15823
base: develop
Are you sure you want to change the base?
Conversation
d252184
to
cd00616
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except the minor comments, it seems OK to me.
cd00616
to
2b9295d
Compare
@clonker, BTW, one thing that I found confusing was referring to things related to type |
2b9295d
to
bf80a01
Compare
You are right and this is what I was referring to on Monday :) I have made things self-consistent now as in everything is "id" and there is no more mention of "name". Change on the larger scope (as in changing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
But my review is rather superficial, I don't have enough knowledge about the design and how this is gonna be used.
Someone else should also have a look at this as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a complete review yet, just a few things I noticed so far.
libyul/ASTNodeRegistry.h
Outdated
namespace solidity::yul | ||
{ | ||
|
||
class ASTNodeRegistry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You described these classes in the PR description, but I'd much rather just have docstrings for them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially m_idToLabelMapping
and m_labels
should be documented. Like, what are the numbers stored in m_idToLabelMapping
and the assumptions about them (can there be duplicates? do both vectors have to have same length?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some docs - let me know what you think and/or if it should be expanded somewhere. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I still don't fully understand is the purpose of the ghost IDs. Looking at other PRs, I see the mechanism for adding them, but not yet any place that would actually add them. The intended usage is a detail that the docstring should explain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ghost nodes are added during CFG construction. They only live in the CFG itself and are not actually referenced in the AST. We could potentially also remove them here and specialize them out for CFGs. Then it's more local to where they are introduced and needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, yeah, their presence here threw me off because I initially thought you needed them for something in the AST and the naming from CFG was only a coincidence. So might be clearer to separate those.
On the other hand, now that it's properly documented and it's clear why we need it, I'd also be fine with leaving it here. It leaks CFG implementation details a bit, but I imagine that separating this out might complicate things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it would complicate things to completely separate it out. But it does work well to let the CFG render ghost labels so at least that detail doesn't leak.
03abf50
to
af57ab2
Compare
5ade700
to
4602bf7
Compare
libyul/ASTNodeRegistry.h
Outdated
/// unique labels in the container, beginning with empty ("") and ghost (ghostPlaceholder). | ||
std::vector<std::string> m_labels; | ||
/// Each element in the vector is one NodeId. The value of the vector points to the corresponding label. E.g., | ||
/// m_labels[m_idToLabelMapping[3]] is the label for NodeId 3. Therefore, there can be duplicates and the lengths | ||
/// of `m_labels` and `m_idToLabelMapping` do not need to correspond. | ||
std::vector<size_t> m_idToLabelMapping; | ||
mutable std::map<NodeId, std::string> m_ghostLabelCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// unique labels in the container, beginning with empty ("") and ghost (ghostPlaceholder). | |
std::vector<std::string> m_labels; | |
/// Each element in the vector is one NodeId. The value of the vector points to the corresponding label. E.g., | |
/// m_labels[m_idToLabelMapping[3]] is the label for NodeId 3. Therefore, there can be duplicates and the lengths | |
/// of `m_labels` and `m_idToLabelMapping` do not need to correspond. | |
std::vector<size_t> m_idToLabelMapping; | |
mutable std::map<NodeId, std::string> m_ghostLabelCache; | |
/// Unique labels in the container; the first two items are: empty ("") and ghost (ghostPlaceholder). | |
std::vector<std::string> m_labels; | |
/// Each index in the vector is one NodeId. The value of the vector points to the corresponding label. E.g., | |
/// m_labels[m_idToLabelMapping[3]] is the label for NodeId 3. Therefore, there can be duplicates and the lengths | |
/// of `m_labels` and `m_idToLabelMapping` do not need to correspond. | |
std::vector<size_t> m_idToLabelMapping; | |
mutable std::map<NodeId, std::string> m_ghostLabelCache; |
Though TBH this still does not say everything I wanted to know when I started reviewing this. That's how I'd put it myself:
/// unique labels in the container, beginning with empty ("") and ghost (ghostPlaceholder). | |
std::vector<std::string> m_labels; | |
/// Each element in the vector is one NodeId. The value of the vector points to the corresponding label. E.g., | |
/// m_labels[m_idToLabelMapping[3]] is the label for NodeId 3. Therefore, there can be duplicates and the lengths | |
/// of `m_labels` and `m_idToLabelMapping` do not need to correspond. | |
std::vector<size_t> m_idToLabelMapping; | |
mutable std::map<NodeId, std::string> m_ghostLabelCache; | |
/// All Yul AST node labels present in the registry. | |
/// Always contains at least two items: an empty label and a ghost placeholder. | |
/// All items must be unique. All but the first two must be valid Yul identifiers. | |
std::vector<std::string> m_labels; | |
/// Assignment of labels to `NodeId`s. Indices are `NodeId`s and values are indices into `m_labels`. | |
/// Every label except ghost placeholder always has exactly one `NodeId` pointing at it. | |
/// Ghost placeholder can have more than one. | |
std::vector<size_t> m_idToLabelMapping; | |
/// Artificial labels generated for ghost IDs from a template. | |
/// Generated on demand through `lookupGhost()` and cached for future lookups. | |
/// Must never contain non-ghost IDs. Labels are guaranteed to be unique. | |
mutable std::map<NodeId, std::string> m_ghostLabelCache; |
But note that it includes some of my assumptions how it should work, which may or may not be true depending on the answers to my earlier comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I think we should make m_labels
and m_idToLabelMapping
const
. The container is immutable so they're not supposed to ever change after initialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I think we should make
m_labels
andm_idToLabelMapping
const
. The container is immutable so they're not supposed to ever change after initialization.
That'll get rid of implicit copy/move, though. The immutability is reflected by it not having any non-const methods. That would already take care of that unless one const-casts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But won't that be just for assignments and still give us constructors? I mean, assigning the registry in place kinda violates its immutability so not having them actually makes sense.
In any case, not a big deal, we could leave them non-const
, but I still don't see why it wouldn't make sense :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can revisit that once the rest is in place, there are some cases in which not being able to assign it would require additional refactoring because of order of initialization and dependencies. If it's fine for you I'd leave it non-const for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
TBH I don't mind this not being const
as much as the fact that we're going rely on overwriting whole registries as a workaround for initialization order. That sounds like it could backfire :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeahh... Made them const
after all. :)
libyul/ASTNodeRegistry.h
Outdated
namespace solidity::yul | ||
{ | ||
|
||
class ASTNodeRegistry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I still don't fully understand is the purpose of the ghost IDs. Looking at other PRs, I see the mechanism for adding them, but not yet any place that would actually add them. The intended usage is a detail that the docstring should explain.
837871f
to
43a00c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have most things ironed out, the most important remaining one being the need for text representation for ghost labels. Also docs/naming for those "unused" labels.
Other than that I have some smaller nitpicks from the final pass trough the code.
After that we'll be done here.
b0f5c6b
to
2d04f51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through the PR again I found some outdated bits that need to be cleaned up before we merge (like mentions of the now non-existent ghost placeholder or Id
instead of ID
), but I have no more issues with how it's implemented so I'm already approving.
|
||
ASTLabelRegistry::LabelID define(std::string_view _label); | ||
ASTLabelRegistry::LabelID addGhost(); | ||
ASTLabelRegistry build() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #15823 (comment), please add a docstring explaining what gets copied. In particular that the unused labels do not carry over, but ghosts do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added such docstring. Unused label ids do carry over, as the m_nextID
is based on the existing registries' maxID
- which includes them.
77c2bb0
to
44cace1
Compare
44cace1
to
f330e45
Compare
This PR supersedes PR #15242.
Depends on PR #15821.It introduces
yul::ASTNodeRegistryBuilder
andyul::ASTNodeRegistry
. Most important bits of the API:I have not yet included the
NodeIdDispenser
, this can be done in a separate PR.The
ASTNodeRegistry
lives (immutably) in theAST
and can be used to query labels for particular nodes in the AST. But potentially this could store more metadata than just string labels.A
ASTNodeRegistry
can be created by using theASTNodeRegistryBuilder
- which would be the preferred way when importing or parsing (see PR #15833):solidity/libyul/AsmJsonImporter.cpp
Lines 82 to 87 in c2cea37
solidity/libyul/AsmJsonImporter.cpp
Lines 56 to 60 in c2cea37
Or, during/after optimization, from the
NodeIdDispenser
based on the root block, discarding everything that is not part of the current AST, and the dialect to check for collisions with reserved identifiers:solidity/libyul/optimiser/Suite.cpp
Line 200 in c2cea37
solidity/libyul/AST.cpp
Lines 110 to 114 in c2cea37