-
Notifications
You must be signed in to change notification settings - Fork 294
transpile: Always explicitly reference current_block values
#1556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
bb10696 to
3739969
Compare
3739969 to
4609106
Compare
randomPoison
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! First off, thank you for putting this together 💜 For the most part this looks good, I just have a couple of comments about how we can simplify things and make the output a bit easier to work with.
Unfortunately, I've currently got a PR up that heavily reworks the relooper implementation (#1558), and I think it would be best to wait to merge this until after that refactoring lands. The changes in that PR shouldn't conflict too heavily with the changes you have here from what I can see.
| ), | ||
| GotoTable { | ||
| cases: Vec<(L, StructuredAST<E, P, L, S>)>, | ||
| then_lbls: Vec<L>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to keep a set of labels for the then arm. iiuc you're combining all of these as an | pattern in order to satisfy the exhaustiveness requirement. But this has the drawback that all of the unused labels show up in the generated match, which means that if you're looking at some code that does current_block = Foo; and you want to find where Foo is used, it's going to show up in match current_block expressions where it's not actually a valid arm. I think it'd be better to always use _ as the then pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is about adding explicit references in the match arms. If I use _ instead, doesn't that just keep the original situation? Or am I misunderstanding something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of #1522 is that the _ case will either be unreachable!() if all entries are present in the list of branches, or empty ({}) if there are unhandled entries. That latter case comes from fallthrough patterns in the original C, e.g.
switch (x) {
case 0:
x++;
case 1:
x++;
case 2:
x++;
}With your PR we get a match that looks like this:
match current_block_2 {
12729336616994636022 => {
x += 1;
current_block_2 = 6386683298811446793;
}
6386683298811446793 | 17778012151635330486 => {}
_ => unreachable!(),
}What I intended with #1522 would be more like this:
match current_block_2 {
12729336616994636022 => {
x += 1;
current_block_2 = 6386683298811446793;
}
_ => {},
}That is, all of the handled entries show up as match arms, but if there are unhandled entries we should still have an empty wildcard case. So my comment was that we shouldn't need a pattern for the then of the goto table because the then pattern should always be _, and the only difference is whether the body of that arm is {} or unreachable!().
There is an advantage to your approach, which is that current_block values that are never handled still show up explicitly in a {} arm, but the changes in #1558 make that irrelevant because we no longer generate match current_block in those cases. With #1558 I think we will only be generating match current_block in cases where all entries to the Multiple are handled.
| GotoTable { | ||
| cases: Vec<(L, StructuredAST<E, P, L, S>)>, | ||
| then_lbls: Vec<L>, | ||
| then_body: Box<StructuredAST<E, P, L, S>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the then_body to be a full AST node. There are only 3 possible cases:
- No then branch, i.e. we cover all cases and already satisfy exhaustiveness. This happens if there's only one
match current_blockin the function. - Unreachable then branch, i.e. our
matchdoesn't cover all of the labels. This happens if we have multiplematch current_blockin the function, since each will have different labels.- Aside: Down the line we may want to handle this by generating a different enum for each
match current_block, but I think starting with a single enum per function is the right first step.
- Aside: Down the line we may want to handle this by generating a different enum for each
- Empty then branch, i.e. we want to be able to fall out of the
matchwithout running any code. This is representative of fallthrough control flow, either from aswitchor fromgoto.
So I think maybe something like this would suffice:
enum Then {
Empty,
Unreachable,
}
GotoTable {
..
then: Option<Then>,
}Note that this is also dependent on the changes in #1558, where I removed the then field from Multiple, so this may be easier to fix after rebasing on top of those changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't changed the then_body. It was previously the second element of a tuple variant, my changes just gave it a name for the sake of clarity. then_lbls is the new element I added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I think there's a bit of confusion around the intended behavior of #1522, I tried to clarify in another comment. As noted there, the goal is that the _ case will either be either unreachable!() or {} depending on if there are unhandled entries to the Multiple. We want all other branches to be in the cases field, so we don't want a then field that's a full AST node. Though again, that's kind of a moot point because with #1558 we no longer generate match current_block for cases where a Multiple can have unhandled entries.
|
I'll wait for #1558 so that I have a better idea what things look like after. |
|
This might be less contentious if also phased after #1557, because the compiler will be able to reason about exhaustiveness of cases and there shouldn't be any situations where |
As I understand it, there may be more than one |
I've assumed here that the
entriesfield ofStructure::Multipleis exhaustive. Thus all labels inentriesminus those inbranchesshould be those that were previously covered by the wildcard pattern, and any other values are unreachable. I don't know if that's the case or not; if not, then this may turn out much harder to implement, as I don't know where to recover the missing labels from.