-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expression Parse Bomb #74
Comments
@IMax153 I'm aware you're the original source of the example Cookbook code - I'll take a look myself in your upstream work, but did you ever find a solution for this? |
Update: It seems this is likely a fundamental issue with parser combinators that is resolved by using packrat parsing - a potential fix is quite straight-forward, inside of the this.p = memoize(p, (state) => JSON.stringify(state)); |
@poteat - if I remember correctly, I did not guard well against infinite recursion, which is likely the problem here. |
@IMax153 Well, that may be true, although I never ran into any infinite recursion issues with your example code. I am confident that your code is very correct actually, for this reason: I implemented packrat parsing on our downstream fork, and it leads to sizeable performance improvements, and makes parse time linear with respect to input size. We see orders-of-magnitude improvement actually - it would be really cool if a similar fix was introduced in the base package. I'll wait for a maintainer to vet the idea though before I put a PR together. |
Ahh this is really interesting @poteat! Having just read a bit about packrat parsing, the optimisation seems like a definite win. Took a quick look at memoize-clear - it's only about 40 lines and not doing anything wildly complicated. In the interest of keeping arcsecond dependency free, I think it would be good to have our own implementation. |
The recommended code in the Cookbook for parsing arithmetic expressions introduces an exponential-complexity parse bomb, which I have found does actually become extremely relevant when using Arcsecond for parsing realistic expressions.
Demonstration code (converted to Typescript)
Suffice to say, the example input
"9 + ((((((((((5)))))))))) - 4 * 4 / 3"
takes twenty seconds on my machine. A contrived example, but again, I have seen unbounded parsing time for reasonable inputs as well.I have not yet been unable to wrap my head around where the inefficiency is coming from, or how to structure the parser to avoid the bomb. I really enjoy using this flavor of parser, and would be disappointed at having to write a parser manually to avoid this.
If I find out how to fix this, I'll update here so that we might update the Cookbook. Thanks again Francis for putting this library together - I'm super excited about your rewrite in Typescript.
The text was updated successfully, but these errors were encountered: