Java: Improving error recovery by doing a look-ahead match counting? #4530
-
I am not sure this is specific to ANTLR5. However, a good way to get your head around this is to not override the recovery strategy at first, then debug what happens in the runtime "as is" for various scenarios. You can also see that there are other strategies you can install. You should also be able to see how the follow sets are used and stacked. That came out of me not wanting to drop out of an inner repeating rule because of a single missing or extra token in the JavaFX parser in v3. Ter then put that into v4 with supporting methods. Then try other strategies and your own custom strategies. Following in the debugger is a great way to see what methods the existing strategies use to support their actions. It looks like a custom strategy for AIRMET is a doable thing. Don't worry about performance in error recovery - that's already out the window by that point anyway :)
-
Moved back to antlr4 as requested, thanks for clarifying.
-
I'm trying to improve the error recovery handling for my particular use case, and one of the things I'm attempting to do is figure out what the "best" recovery action would be at any given time by calculating the number of tokens that would match if I took a particular recovery action. For example, I calculate the number of matches I might be able to achieve if I inserted a "missing" token, deleted an "extraneous" token, or even substituted an "invalid" token. Then, using these counts, I select the action with the highest match count as the recovery action to perform.
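The scoring idea above can be sketched in isolation. This is a minimal sketch under a strong simplifying assumption: the tokens the parser expects next are modeled as a flat array of token types, which a real grammar with alternatives and loops would not give you. `RecoveryScorer`, `Action`, `matchRun`, `score`, and `best` are hypothetical names for illustration, not ANTLR API.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy model of match-count scoring; nothing here is an ANTLR class. */
final class RecoveryScorer {
    enum Action { INSERT, DELETE, SUBSTITUTE }

    /** Length of the run where expected[expFrom..] equals input[inFrom..]. */
    static int matchRun(int[] expected, int expFrom, int[] input, int inFrom) {
        int n = 0;
        while (expFrom + n < expected.length
                && inFrom + n < input.length
                && expected[expFrom + n] == input[inFrom + n]) {
            n++;
        }
        return n;
    }

    /** Scores each action by how many tokens would match after taking it. */
    static Map<Action, Integer> score(int[] expected, int[] input) {
        Map<Action, Integer> scores = new EnumMap<>(Action.class);
        // INSERT: pretend expected[0] was present, keep the current input token.
        scores.put(Action.INSERT, matchRun(expected, 1, input, 0));
        // DELETE: drop the current input token, still expect expected[0].
        scores.put(Action.DELETE, matchRun(expected, 0, input, 1));
        // SUBSTITUTE: treat the current input token as if it were expected[0].
        scores.put(Action.SUBSTITUTE, matchRun(expected, 1, input, 1));
        return scores;
    }

    /** Highest score wins; ties go to the earlier enum constant. */
    static Action best(int[] expected, int[] input) {
        Action best = null;
        int bestScore = -1;
        for (Map.Entry<Action, Integer> e : score(expected, input).entrySet()) {
            if (e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] expected = {1, 2, 3, 4};
        System.out.println(best(expected, new int[] {9, 1, 2, 3, 4})); // DELETE
        System.out.println(best(expected, new int[] {2, 3, 4}));       // INSERT
        System.out.println(best(expected, new int[] {9, 2, 3, 4}));    // SUBSTITUTE
    }
}
```

The tie-break order is an arbitrary choice in this sketch; a real strategy might prefer whichever action discards the least input.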
My current code is based on recursively evaluating the ATNState#getTransitions(), similar to the limited token lookahead the DefaultErrorStrategy uses. While it mostly works, I don't think I have it 100% correct; my guess is that something is off around epsilon/rule transitions.
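One plausible source of trouble with epsilon and rule transitions is that a recursive walk has to remember where to resume once an invoked rule ends. In the ANTLR 4 runtime a `RuleTransition` is an epsilon transition that also carries a `followState`, so a walk that ignores it stops counting at the end of the invoked rule. The sketch below shows that bookkeeping on a toy graph, assuming a hand-built stand-in for the ATN: `LookaheadCounter`, `State`, `Edge`, and the helper methods are hypothetical and are not ANTLR classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for an ATN walk; none of these types are ANTLR classes. */
final class LookaheadCounter {
    static final int EPSILON = -1, A = 1, B = 2, C = 3;

    static final class State {
        final String name;
        final boolean ruleStop;          // true for the final state of a rule
        final List<Edge> edges = new ArrayList<>();
        State(String name, boolean ruleStop) { this.name = name; this.ruleStop = ruleStop; }
        @Override public String toString() { return name; }
    }

    static final class Edge {
        final int tokenType;             // EPSILON when no token is consumed
        final State target;
        final State follow;              // non-null only for a rule call
        Edge(int tokenType, State target, State follow) {
            this.tokenType = tokenType; this.target = target; this.follow = follow;
        }
    }

    static void token(State from, int type, State to) { from.edges.add(new Edge(type, to, null)); }
    static void epsilon(State from, State to) { from.edges.add(new Edge(EPSILON, to, null)); }
    static void call(State from, State ruleStart, State follow) {
        from.edges.add(new Edge(EPSILON, ruleStart, follow));
    }

    /** Maximum number of leading input tokens that some path from start can match. */
    static int count(State start, int[] input) {
        return walk(start, new ArrayDeque<>(), input, 0, new HashSet<>());
    }

    private static int walk(State s, Deque<State> returns, int[] input, int pos, Set<String> seen) {
        // Guard against epsilon cycles: same state, same return stack, same position.
        if (!seen.add(s.name + "|" + returns)) return 0;
        int best = 0;
        if (s.ruleStop && !returns.isEmpty()) {
            // End of an invoked rule: resume at the caller's follow state.
            Deque<State> rest = new ArrayDeque<>(returns);
            State follow = rest.pop();
            best = Math.max(best, walk(follow, rest, input, pos, seen));
        }
        for (Edge e : s.edges) {
            if (e.follow != null) {      // rule call: remember where to come back to
                Deque<State> pushed = new ArrayDeque<>(returns);
                pushed.push(e.follow);
                best = Math.max(best, walk(e.target, pushed, input, pos, seen));
            } else if (e.tokenType == EPSILON) {
                best = Math.max(best, walk(e.target, returns, input, pos, seen));
            } else if (pos < input.length && e.tokenType == input[pos]) {
                // A consumed token moves the position, so the cycle guard starts fresh.
                best = Math.max(best, 1 + walk(e.target, returns, input, pos + 1, new HashSet<>()));
            }
        }
        return best;
    }

    /** Builds the toy grammar: top : pair C ;  pair : A B? ; */
    static State demo() {
        State top0 = new State("top0", false), top1 = new State("top1", false);
        State topEnd = new State("topEnd", true);
        State pair0 = new State("pair0", false), pair1 = new State("pair1", false);
        State pairStop = new State("pairStop", true);
        call(top0, pair0, top1);
        token(top1, C, topEnd);
        token(pair0, A, pair1);
        token(pair1, B, pairStop);
        epsilon(pair1, pairStop);        // the optional B
        return top0;
    }

    public static void main(String[] args) {
        System.out.println(count(demo(), new int[] {A, B, C})); // 3
        System.out.println(count(demo(), new int[] {A, C}));    // 2
    }
}
```

Without the return stack, the walk over `{A, B, C}` would stop at the end of `pair` and report 2 instead of 3. Copying the stack on every call and return keeps the sketch simple at the cost of extra allocation, which matters little for short inputs.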
At the moment I'm only using this logic in the recoverInline() function of my custom error strategy. However, I'd like to be able to do something similar in sync(), which is a much more complex case. Due to the current shortcomings of the code, I haven't yet tried integrating with sync().
Code:
Two Questions:
In case it's relevant, my use case is parsing textual aviation notices, AIRMETs specifically. These tend to be short with a fairly well-defined format, and with zero chance of fixing any errors, as they are produced by various government agencies around the world. Further, I fully realize that what I suggest above has a performance cost, but given the shortness and relative frequency, I'm pretty sure the cost will be doable.
Example bulletin:
Thanks for any assistance you can provide.