formatter continued 01 #22

FredTheDino · 2024-03-05T22:51:12Z

I continued my routine of trying to parse PureScript code and fixing whatever is missing or broken.

While fixing things, I added more debug info to the tests and made crashes more transparent, so they'll be easier to troubleshoot in the future. The code is a bit ugly, but I think it does the job.

We also have trouble with infinite loops in the parser. I found one of them and did something that might or might not be a fix: I check the number of tokens we've eaten, and if it stays constant across an iteration we're stuck, so I simply break out of the loop. The method isn't sophisticated, but it solved some of the problems I had.
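As a rough illustration of the progress check described above, here is a minimal sketch. It does not use the real parser: `Token`, `Cursor`, and `repeat_with_progress_check` are hypothetical names invented for this example, and the actual logic in `one_or_more` may look different.

```rust
// Hypothetical stand-ins for the real token and parser types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token {
    Ident,
    Pipe,
}

struct Cursor {
    tokens: Vec<Token>,
    consumed: usize, // number of tokens eaten so far
}

impl Cursor {
    fn new(tokens: Vec<Token>) -> Self {
        Cursor { tokens, consumed: 0 }
    }

    fn at_end(&self) -> bool {
        self.consumed >= self.tokens.len()
    }

    /// Consume the current token only if it matches `kind`.
    fn eat(&mut self, kind: Token) -> bool {
        if self.tokens.get(self.consumed) == Some(&kind) {
            self.consumed += 1;
            true
        } else {
            false
        }
    }
}

/// Runs `rule` repeatedly and returns how many iterations made progress.
/// Stops at the end of input, or as soon as an iteration eats no tokens.
fn repeat_with_progress_check(
    cursor: &mut Cursor,
    mut rule: impl FnMut(&mut Cursor),
) -> usize {
    let mut iterations = 0;
    while !cursor.at_end() {
        let before = cursor.consumed;
        rule(cursor);
        if cursor.consumed == before {
            // No token was eaten: the rule is stuck, so break instead of spinning.
            break;
        }
        iterations += 1;
    }
    iterations
}
```

With the tokens `Ident, Ident, Pipe, Ident` and a rule that only eats `Ident`, the loop runs twice and then stops at `Pipe` rather than looping forever.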

I also noticed we don't check that we've parsed the entire file when we generate our tests. That's something we should look into if we're striving for correctness.
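One possible shape for such a check, as a sketch only: `ParseOutcome` and `ensure_fully_parsed` are hypothetical names, and the sketch assumes the test harness can report how many tokens the parser consumed.

```rust
/// Hypothetical summary of a single parse run in the test harness.
struct ParseOutcome {
    tokens_total: usize,
    tokens_consumed: usize,
}

/// Returns an error describing the leftover input if the parser stopped early.
fn ensure_fully_parsed(outcome: &ParseOutcome) -> Result<(), String> {
    if outcome.tokens_consumed == outcome.tokens_total {
        Ok(())
    } else {
        Err(format!(
            "parser stopped after {} of {} tokens",
            outcome.tokens_consumed, outcome.tokens_total
        ))
    }
}
```

A snapshot test could call this before writing the snapshot, so a truncated parse fails loudly instead of being recorded as expected output.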

purefunctor · 2024-03-06T04:02:43Z

crates/parsing/src/grammar/rules.rs

@@ -363,6 +363,7 @@ fn expr_binding(parser: &mut Parser, separator: SyntaxKind) {
        expr_where(parser);
        marker.end(parser, SyntaxKind::UnconditionalBinding);
    } else {
+        // NOTE[et]: We get an infinite loop here since `expr_guarded` needs to be able to match an empty expression, so we just abort this loop if we have looped. It's a crude solution but it works to eliminate infinite loops in the parser. This logic lies inside the `one_or_more` parser


Do you have an example file that can cause this infinite loop?

A minimum reproduction that I conjured is:

b | if 0 else 1 then 2 = 0

This fails due to how expr_if is defined:

    fn expr_if(parser: &mut Parser) {
        let mut marker = parser.start();
        parser.expect(SyntaxKind::IfKw);
        expr_0(parser);
        parser.expect(SyntaxKind::ThenKw);
        expr_0(parser);
        parser.expect(SyntaxKind::ElseKw);
        expr_0(parser);
        marker.end(parser, SyntaxKind::IfThenElseExpression);
    }

More specifically, expect is basically eat, except that it emits an error when it isn't able to consume the current token, so on a mismatch the parser does not advance. The new expect_recover function is more suitable here: it eats the current token indiscriminately while also emitting an error node for the CST.
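To make the difference concrete, here is a toy sketch of the two behaviours. `ToyParser` and `Kind` are hypothetical, the method names merely mirror the ones mentioned above, and errors are collected as strings where the real parser would emit an error node.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    IfKw,
    ThenKw,
    ElseKw,
    Literal,
}

// Hypothetical toy parser; the real `Parser` builds a CST instead.
struct ToyParser {
    tokens: Vec<Kind>,
    position: usize,
    errors: Vec<String>,
}

impl ToyParser {
    fn new(tokens: Vec<Kind>) -> Self {
        ToyParser { tokens, position: 0, errors: Vec::new() }
    }

    /// Consume the current token only if it matches `kind`.
    fn eat(&mut self, kind: Kind) -> bool {
        if self.tokens.get(self.position) == Some(&kind) {
            self.position += 1;
            true
        } else {
            false
        }
    }

    /// Like `eat`, but records an error on a mismatch. The position is unchanged.
    fn expect(&mut self, kind: Kind) -> bool {
        let matched = self.eat(kind);
        if !matched {
            self.errors.push(format!("expected {:?}", kind));
        }
        matched
    }

    /// Like `expect`, but on a mismatch it also skips the offending token,
    /// so the caller always makes progress while input remains.
    fn expect_recover(&mut self, kind: Kind) -> bool {
        let matched = self.eat(kind);
        if !matched {
            self.errors.push(format!("expected {:?}", kind));
            if self.position < self.tokens.len() {
                self.position += 1;
            }
        }
        matched
    }
}
```

Given the tokens `ElseKw, Literal` and an expected `ThenKw`, `expect` records an error and stays at position 0, while `expect_recover` records the same error and moves on to position 1.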

That being said, loop-safety in the parser is definitely something we should strive for; checking if the parser made progress is a good solution, and I think we should also look into "recovering" as much as possible in the parsing rules.

purefunctor · 2024-03-06T04:11:46Z

...s/parsing/src/grammar/snapshots/[email protected]

-Start { kind: ConstrainedType }
+Start { kind: TypeOperatorChain }


Since => is built-in and cannot be re-defined, I think this should be parsed as ConstrainedType. I haven't implemented desugaring for TypeOperatorChain in the analyzer yet but ConstrainedType and ArrowType should be kept as-is.
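A minimal sketch of that distinction, assuming the node kind can be chosen from the symbol that separates the two sides of a type. `TypeNode` and `classify_type_symbol` are hypothetical names, and treating `->` as the ArrowType symbol is an assumption of this example rather than something taken from the grammar.

```rust
#[derive(Debug, PartialEq)]
enum TypeNode {
    ConstrainedType,
    ArrowType,
    TypeOperatorChain,
}

/// Built-in symbols get dedicated nodes; anything else is treated as a
/// user-definable operator and goes into an operator chain.
fn classify_type_symbol(symbol: &str) -> TypeNode {
    match symbol {
        "=>" => TypeNode::ConstrainedType,
        "->" => TypeNode::ArrowType, // assumed spelling of the arrow symbol
        _ => TypeNode::TypeOperatorChain,
    }
}
```

Under this sketch, `=>` never reaches the operator-chain branch, which matches the expectation that the snapshot keeps `ConstrainedType`.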

purefunctor · 2024-03-06T04:15:14Z

Looks good so far, though I'd advise staying away from the combinators in favor of the new separated/repeat APIs like what #21 is currently doing.

FredTheDino added 16 commits March 5, 2024 17:32

Parse role annotations

68f1d3c

Thick arrows are operators too

b90597f

Underscores are starts to expressions

49326f8

Mitigate infinite loops from expression parse errors

06dc230

Parse newtypes

8c5af16

Update tests - these all look fairly right

ab1c8c3

Fix bug with type-parsing

2c5713f

Update all tests and fix a bug I introduced in the one_or_more function

b485be4

Make the tests a bit more robust

a3008bd

Add more information to the infinite loop-crash

5c71117

Attempt to parse derive declarations

e65e5da

Update the tests - not sure if these are right

f3243d1

Clear error string is better than empty string

f2d1000

Fix the crash

a0e2caf

Write something that kinda looks like a syntax tree from the formatter

0e367b0

Parse a bunch of files instead

4da6f57

purefunctor reviewed Mar 6, 2024

purefunctor mentioned this pull request Mar 6, 2024

Thinking About Parser Safety #23

Open

Remove => from being a valid operator

704daf5

purefunctor mentioned this pull request Jun 18, 2024

Remove container nodes in CST #21

Merged

purefunctor force-pushed the main branch from 314810e to 8288752 Compare December 20, 2024 20:14
