-
-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] support custom state and hooks #815
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Alex Chi <[email protected]>
Signed-off-by: Alex Chi <[email protected]>
Signed-off-by: Alex Chi <[email protected]>
Signed-off-by: Alex Chi <[email protected]>
This feels like a job for the compiler, not the parser. I know that some PEG libraries can do this, such as the one Python uses for interpreting Python code, but in generic libraries, makes me think we're overstepping the core premise. I'm willing to be convinced, just want to get that out first. |
I totally agree that this adds too much complexity for the parser. But meanwhile I feel without custom state and hooks, it will be hard to do some context-aware parsing. An example is parsing a C-like language.
Whether to parse I'm also thinking of removing Hence, an alternative is to support a syntax like:
so that we can run some custom action to check / save something, while error recovery becomes some kind of undefined behavior. |
It's both a strength and a limitation of pest that it sticks to pure1 PEG, without any sort of user actions in the grammar. If pest does want to support non-PEG semantics, I believe the way to go about doing so is to add the ability to fully define custom Any newly designed language SHOULD NOT emulate C's lexer hack. The documentation should make it clear that using the functionality is discouraged and should be avoided if that's at all an option. To make these special rules stick out more (whether they're just binary acceptance functions or more sophisticated custom rules), they should look different from normal rules. Three arbitrary proposals could be The save/restore interface also imho should be improved before it's exposed to consumers; it should either just be a Ultimately, I don't have any strong reason to block the availability of this functionality in pest. It feels very "pest3" in that it's adding entirely new capabilities on top of the existing jank, but it's certainly a need that people feel, so it's valid to serve that need. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this can be added to the existing codebase in a non-breaking way. I agree with what was mentioned above that it has use cases and it's perhaps a better fit with a "simplified" / less complex grammar (as in pest3: #333 (comment) )
@skyzh would you be interested in dusting off https://github.com/pest-parser/pest3 and trying to add hooks there? Once it's somewhat useable, we can perhaps add pest3 stuff here under a feature flag and/or v3 modules until it's stabilized and there's tooling that can help people to switch to a new grammar?
|
||
/// A trait with a single method that parses strings. | ||
pub trait Parser<R: RuleType> { | ||
pub trait Parser<R: RuleType, S: Default + StateCheckpoint = ()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't this be a semver-breaking change? i.e. existing pest 2.X-generated code implemented the trait with one param and wouldn't compile with this/would need to be regenerated: https://github.com/pest-parser/pest/actions/runs/4276068348/jobs/7480923758#step:4:253
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, that's more breakage from the fact that the runtime and derive library versions aren't pinned to each other, isn't it... adding a new defaulted generic to a type is backwards compatible, as old code will use the default, but it's not forwards compatible, in that generated code supplying the parameter won't work with the older library versions without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bootstrap path will always be interesting because it has a weird mix of using the local version for some stuff and the crates-io version for other stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yes, it's not forwards-compatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that should be ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sans figuring out the bootstrap)
Thanks a lot for your information!
I personally would prefer this grammar.
Agree.
I can take a look but I'm not sure if it will work 🤪 The pest3 repo doesn't seem to have development activities since 2 years ago...
I tried designing the API to be compatible but it seems that this will be a semver-breaking change. If users have manually implemented the Parser trait, probably they will need to change it... Therefore, I would probably make hook behind a feature gate. I think people still have diverged opinions on whether to have hooks built-in pest. Probably it would be a good idea to leave this PR open and see if there's any need for this feature in the future? Meanwhile, to make hooks more useable in pest, I'll probably maintain my version of pest in this branch and see if there're potential users that are interested in using that...
Thanks for your reviews and comments! |
Yeah, the pest3 repo is somewhat abandoned -- so it'd rather mean restarting the work of that reworked / simplified pest codebase there. In any case, you can probably proceed as you outlined by continuing the updates in this PR. I think potentially this PR could be merged if the hooks functionality is exposed under a non-default feature flag. |
Guided-Level Explanation
This PR adds a "context" for parsing with pest (for one of my course projects where I'm using pest...). Now users can parse with a custom state as a context. One example is the following grammar, where we want
ints
to parse a sequence of non-decrementing numbers.For all terms beginning with
__HOOK
, the generator will call user-defined hooks before deciding whether to parse.The state will need to support snapshot / recovery so that the hook can be placed anywhere. But users can also opt-out recovery and modify their grammar to avoid this situation. Examples in
hook.rs
. andhook.pest
.Implementation-Level Explanation
ParseState
now contains a field for storing custom state. It is by-default set to()
.We now have a new parser trait called
StateParser
, where we accept a state + input and return rules + state.The generator crate is refactored to implement custom state generic type for all traits. Some functions also have new signatures in order to accept custom state.
The API change doesn't seem to affect any existing examples.
This PR is a demo and I'm open to suggestions / changes to this new feature. If it doesn't work I can also maintain it in my own fork. Thanks for review!