Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split the start state more conservatively when a parser contains decl initializers #4902

Merged
merged 6 commits into from
Sep 12, 2024

Conversation

kfcripps
Copy link
Contributor

@kfcripps kfcripps commented Sep 6, 2024

  1. Currently, the start state always gets split when a parser contains declaration initializations. The following:
parser ...
bit<16> local = 2;
state start {
    ...
}
...

always gets split into:

parser ...
bit<16> local = 2;
state start {
    ...
    transition start_0;
}

state start_0 {
    ...
}
...

However, this is only necessary when start is reachable from some other state(s). This PR makes it so that the split only happens when start is actually reachable from some other state(s).

  1. Coincidentally, also fixes MoveInitializers splits start state for parsers that do not need this #4901

@kfcripps kfcripps added the core Topics concerning the core segments of the compiler (frontend, midend, parser) label Sep 6, 2024
@mihaibudiu
Copy link
Contributor

Isn't there a pass which merges consecutive states that have no other in/out edges?
That could achieve the same effect.

Signed-off-by: Kyle Cripps <[email protected]>
@kfcripps
Copy link
Contributor Author

kfcripps commented Sep 6, 2024

I'd rather just avoid the split in the first place if possible (and it is easy enough to accomplish this in MoveInitializers). Some backends (and/or users) may not like when P4-visible states get merged by p4c (and such a pass would likely merge both p4c-generated and user-visible states).

if (someInitializers) {
// Check whether any of the parser's states loop back to the start state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a class which builds the reachability graph for parser states?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but that might help. @fruffy @vlstill ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./frontends/p4/parserCallGraph.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CallGraph::reachable(start, reachable) will always include start in the set of reachable states, so this will not really help simplify things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can check the set of in-edge of the start state

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but frankly it would be useful to understand why the results are wrong. If there is a bug in the callgraph, it would be nice to know that. If the meaning of the isCalle() function is different than expected, it should be documented. If the rule for splitting is wrong, it would be good to know that too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihaibudiu Personally, I find it easier to initially analyze changes by looking at git diffs, which is why I shared the link to the diff with the C++ changes as well as corresponding updated tests outputs. If you prefer to initially analyze changes by checking out the changes locally, running the tests, and inspecting the tests failures locally, you can easily do this by recreating the changes minus tests outputs updates by fetching my fork and using git checkout commands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but frankly it would be useful to understand why the results are wrong. If there is a bug in the callgraph, it would be nice to know that. If the meaning of the isCalle() function is different than expected, it should be documented. If the rule for splitting is wrong, it would be good to know that too.

If there's a problem with the ParserCallGraph, that is outside the scope of this PR, and I'm not volunteering to spend time debugging this. But others may feel free to do so :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no longer working on this project, but I will try to find time to fix bugs that I am responsible for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihaibudiu For example: 4b0ec77#diff-cd2b40893d904a1015f5d71d39014e909e148f5db324e0ebb77484e99c3cc6c7R2-R7

Split is expected in this case because start is reachable from other states. So either I'm using ParserCallGraph incorrectly, or it's not working properly.

I am no longer working on this project, but I will try to find time to fix bugs that I am responsible for.

I did not realize you were no longer working on this project. Please do not feel obligated to fix even the bugs you are responsible for. But any help is of course appreciated :)

@kfcripps
Copy link
Contributor Author

@fruffy @vlstill Do you want to review this? If not I'll probably merge it soon.

Copy link
Collaborator

@fruffy fruffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, I am not quite sure about the toMove logic since it is split across multiple visit methods which makes it harder to follow.

for (const auto *state : parser->states) {
const auto *selectExpr = state->selectExpression;
if (selectExpr == nullptr) {
// implicit reject transition
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use continue here instead?

// If this is the case, then the parser's decl initializers cannot be moved
// to the start state. A new start state that is never looped back to must
// be inserted at the head of the graph.
for (const auto *state : parser->states) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably could be a self-contained helper function which returns true if a loop is detected

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also make the control flow of the function simpler as you would just return on once you find any transtion to start.

return state;
}

const IR::Node *MoveInitializers::postorder(IR::Path *path) {
if (!oldStart || !loopsBackToStart || path->name != IR::ParserState::start) return path;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!oldStart || !loopsBackToStart || path->name != IR::ParserState::start) return path;
if (oldStart == nullptr || !loopsBackToStart || path->name != IR::ParserState::start) return path;

Copy link
Contributor

@vlstill vlstill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just a few code quality points.

frontends/p4/moveDeclarations.cpp Outdated Show resolved Hide resolved
frontends/p4/moveDeclarations.cpp Outdated Show resolved Hide resolved
// If this is the case, then the parser's decl initializers cannot be moved
// to the start state. A new start state that is never looped back to must
// be inserted at the head of the graph.
for (const auto *state : parser->states) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also make the control flow of the function simpler as you would just return on once you find any transtion to start.

frontends/p4/moveDeclarations.cpp Outdated Show resolved Hide resolved
frontends/p4/moveDeclarations.cpp Outdated Show resolved Hide resolved
Comment on lines 198 to 199
oldStart = nullptr;
loopsBackToStart = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this cleanup here? Would make more sense to do the cleanup when entering the parser so that it is clear what are initial values for each packet analysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are already initialized in the MoveInitializers constructor. Are you suggesting to not initialize them there at all, or to initialize them twice (in both constructor and preorder function) the first time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess they are already being initialized twice for the first parser anyway, so I will move the initializations out of the constructor to avoid this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think initialization in preorder makes sense, whether to initialize also in constructor is up to you. Some linters may complain about unitialized class fields if the constructor does not initialize it (because they simply have no way of proving that preorder/init_apply/etc. runs before anything else that uses the var), but I think having no initializer in constructor makes a little more sense as the value is then overwritten anyway.

frontends/p4/moveDeclarations.h Outdated Show resolved Hide resolved
Comment on lines +14 to +18
parser Parser2() {
state start {
transition accept;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add some initializer here just to see that there is no confusion between the two parsers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to not change this test (it is the case taken from #4901) but I added a new one.

Signed-off-by: kfcripps <[email protected]>
Signed-off-by: Kyle Cripps <[email protected]>
Signed-off-by: Kyle Cripps <[email protected]>
Signed-off-by: Kyle Cripps <[email protected]>
@kfcripps kfcripps added this pull request to the merge queue Sep 12, 2024
@kfcripps
Copy link
Contributor Author

Thanks @vlstill @fruffy @mihaibudiu

Merged via the queue into p4lang:main with commit 9e77cde Sep 12, 2024
18 checks passed
@kfcripps kfcripps deleted the 4901-split-start-conservatively branch September 12, 2024 13:33
@asl
Copy link
Contributor

asl commented Sep 13, 2024

@asl
Copy link
Contributor

asl commented Sep 13, 2024

loopsBackToStart is not initialized

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Topics concerning the core segments of the compiler (frontend, midend, parser)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MoveInitializers splits start state for parsers that do not need this
5 participants