[Request] Add COUNT() option to pest grammar #572

Open
mhatzl opened this issue Dec 18, 2021 · 3 comments

mhatzl commented Dec 18, 2021

As mentioned in the discussion entry, I need to match a block of text that starts with a certain number of opening chars and then is closed with the same number of closing chars.

With a COUNT option in the grammar, this could easily be integrated like so:

block = { PUSH( "<"{3 , } ) ~ ( !( ">"{ COUNT( PEEK ) } ) ~ ANY )+ ~ ">"{ COUNT( POP ) } }
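To make the intended semantics concrete, here is a rough hand-written equivalent in Rust. It is only a sketch: `COUNT` does not exist in pest, and `match_block` is a hypothetical helper name, not part of any library. It assumes the opening run is matched greedily and that the content ends at the first run of `>` as long as the opening run.

```rust
/// Hypothetical stand-in for the proposed rule
///   PUSH("<"{3,}) ~ (!(">"{COUNT(PEEK)}) ~ ANY)+ ~ ">"{COUNT(POP)}
/// Returns the block content and the number of bytes consumed.
fn match_block(input: &str) -> Option<(&str, usize)> {
    // PUSH("<"{3,}): greedily count the opening run; at least three are required.
    let open = input.bytes().take_while(|&b| b == b'<').count();
    if open < 3 {
        return None;
    }

    // ">"{COUNT(PEEK)}: the closer is as many '>' as there were '<'.
    let closer = ">".repeat(open);
    let body = &input[open..];

    // (!closer ~ ANY)+: content runs up to the first place the closer matches.
    let end = body.find(closer.as_str())?;
    if end == 0 {
        return None; // `+` needs at least one content character
    }

    // ">"{COUNT(POP)}: consume the closer itself.
    Some((&body[..end], open + end + open))
}
```

With this sketch, `match_block("<<<<a>>>b>>>>")` yields the content `a>>>b`, because the three-character run `>>>` is shorter than the four-character opener and therefore does not close the block.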

I am not familiar with the implementation of pest, so I don't know how complex it would be to integrate this feature,
but @nfejzic and I would be interested in helping to implement it.

The rule above might already be expressible in some other way. But if nesting should be allowed, where the outer chars must be at least one char longer, I haven't found a way to express it with the current pest rule syntax.

Any help is greatly appreciated.

nfejzic commented Feb 9, 2022

Any update on this?

It would be really helpful if someone answered with a yes, no or even maybe. Any feedback whatsoever would be welcome.

Especially since we're offering to implement this ourselves.
We don't want to start if the community does not want such functionality though...

Tartasprint (Contributor) commented

If you need the same number, you can write a recursive rule:

block = { "{" ~ (block | 'A'..'Z') ~ "}" }

Since this will give you an AST with many useless nested blocks, you could do something like:

block = { nested }
nested = _{ "{" ~ (nested | content) ~ "}" }
content = {'A'..'Z'+}

I think this answers your needs, but it does not introduce the COUNT feature.
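For readers less familiar with PEG recursion, the silent `nested` rule behaves roughly like the recursive function below. This is a hand-written sketch, not code generated by pest; the function name simply mirrors the rule, and it assumes no implicit `WHITESPACE` rule is defined. The counts of `{` and `}` end up equal because each recursive call pairs one opener with one closer, which is why this approach needs distinct opening and closing symbols.

```rust
/// Sketch of: nested = _{ "{" ~ (nested | content) ~ "}" }
///            content = { 'A'..'Z'+ }
/// Returns the innermost content and the number of bytes matched.
fn nested(input: &str) -> Option<(&str, usize)> {
    // "{"
    let rest = input.strip_prefix('{')?;

    // Ordered choice: try `nested` first, fall back to `content`.
    let (content, len) = match nested(rest) {
        Some(inner) => inner,
        None => {
            // content: one or more uppercase ASCII letters
            let n = rest.bytes().take_while(|b| b.is_ascii_uppercase()).count();
            if n == 0 {
                return None;
            }
            (&rest[..n], n)
        }
    };

    // "}"
    rest[len..].strip_prefix('}')?;
    Some((content, len + 2))
}
```

An input with unbalanced braces such as `{{A}` is rejected, since the outer call finds no closing `}` after the inner block.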

nfejzic commented Oct 15, 2023

Thanks for your answer @Tartasprint!

The solution you mentioned would not work in our case. The problem is that our blocks use the same symbol for the opening and closing delimiters, and we want to allow inner blocks iff their delimiters are at least one symbol longer. For example:

~~~
outer block

~~~~
inner block
~~~~

and close outer block
~~~

I don't think that this or similar constructs would work without the ability to count the symbols.
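A hand-rolled, line-based parser can do the counting directly. The sketch below is one possible shape for it and is not the parser we ended up writing. The names `Node`, `fence_len`, `parse_until` and `parse` are made up for illustration, and it assumes that a fence is a line consisting only of three or more `~`, that a fence of the same length closes the current block, and that shorter fences inside a block are plain text.

```rust
#[derive(Debug, PartialEq)]
enum Node {
    Text(String),
    Block { fence: usize, children: Vec<Node> },
}

/// Length of the fence if the line consists only of three or more '~'.
fn fence_len(line: &str) -> Option<usize> {
    let t = line.trim_end();
    (t.len() >= 3 && t.bytes().all(|b| b == b'~')).then(|| t.len())
}

/// Parses lines until a fence of exactly `outer` tildes closes the block.
/// `outer == 0` denotes the top level, which ends at end of input.
fn parse_until(lines: &[&str], pos: &mut usize, outer: usize) -> Option<Vec<Node>> {
    let mut nodes = Vec::new();
    while *pos < lines.len() {
        let line = lines[*pos];
        *pos += 1;
        match fence_len(line) {
            // Same length as the enclosing fence: closes the current block.
            Some(n) if n == outer => return Some(nodes),
            // Strictly longer fence: opens an inner block.
            Some(n) if n > outer => {
                let children = parse_until(lines, pos, n)?;
                nodes.push(Node::Block { fence: n, children });
            }
            // Shorter fences and ordinary lines are kept as text.
            _ => nodes.push(Node::Text(line.to_string())),
        }
    }
    // Running out of input is fine at the top level, an error inside a block.
    if outer == 0 { Some(nodes) } else { None }
}

fn parse(input: &str) -> Option<Vec<Node>> {
    let lines: Vec<&str> = input.lines().collect();
    parse_until(&lines, &mut 0, 0)
}
```

Applied to the example above, `parse` returns a single three-tilde block whose children include the four-tilde inner block. An unclosed block makes it return `None`.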

In any case, we eventually figured out that we need to roll our own parser anyway, so we aren't looking for a solution to this right now. As far as I'm concerned, this issue can be closed.
