README.md: 6 additions & 2 deletions
@@ -76,6 +76,8 @@ constexpr auto ruleset = RulesDef(d_digit, d_number);

### Parser Initialization and Usage

+Parser/lexer configuration flags are described [in `docs/CONFIGURATION.md`](docs/CONFIGURATION.md)
+
Once you have defined your grammar, you can create and use the parser:

```cpp
@@ -90,7 +92,8 @@ using TokenType = StdStr<char>; // Class used for storing a token type in runtim
// Configure the parser with desired options
constexpr auto conf = mk_sr_parser_conf<
    SRConfEnum::PrettyPrint, // Enable pretty printing for debugging
-    SRConfEnum::Lookahead>(); // Enable lookahead(1)
+    SRConfEnum::Lookahead, // Enable lookahead(1)
+    SRConfEnum::ReducibilityChecker>(); // Enable RC(1), which checks reducibility one step ahead

// Initialize the lexer
// There are two lexer types available:
@@ -103,7 +106,8 @@ auto legacy_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConf
// Use this for complex grammars where the same token may appear in different rules
// (e.g., in JSON grammar where ',' appears in both object members and other contexts)

-// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes compile-time overhead on grammars with a high number of terminals
+// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes significant compile-time overhead on grammars with a high number of terminals
+// HandleDupInRuntime flag moves symbol intersection handling to lexer initialization at runtime

auto advanced_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates>());
During parser initialization, configuration enum flags may be passed.
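
As a rough illustration of how enum flags passed as template arguments can be collected and queried at compile time, consider the sketch below. The names `DemoConfEnum`, `DemoConf` and `mk_demo_conf` are hypothetical stand-ins; this is not the library's actual implementation of `mk_sr_parser_conf` or `mk_lexer_conf`, only one plausible shape for such an API (C++17).

```cpp
#include <cstdint>

// Hypothetical flag enum, standing in for SRConfEnum / LexerConfEnum.
enum class DemoConfEnum : std::uint8_t { PrettyPrint, Lookahead, ReducibilityChecker };

// Hypothetical config type: the flags live in the type itself,
// so every query is resolved at compile time.
template<DemoConfEnum... Flags>
struct DemoConf {
    // True if flag F was passed to mk_demo_conf<...>().
    template<DemoConfEnum F>
    static constexpr bool has() { return ((Flags == F) || ...); }
};

// Hypothetical factory mirroring the shape of the mk_*_conf<...>() calls.
template<DemoConfEnum... Flags>
constexpr DemoConf<Flags...> mk_demo_conf() { return {}; }

constexpr auto demo_conf =
    mk_demo_conf<DemoConfEnum::PrettyPrint, DemoConfEnum::Lookahead>();

// Flags that were passed are visible; flags that were not passed are off.
static_assert(demo_conf.has<DemoConfEnum::Lookahead>());
static_assert(!demo_conf.has<DemoConfEnum::ReducibilityChecker>());
```

Because the flags are part of the type, a parser or lexer built on such a config could select code paths with `if constexpr` and pay nothing at runtime for disabled features.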

### `SRConfEnum::PrettyPrint`

Enables quick-and-dirty logging of grammar information to stdout. This behavior will change in the future.

### `SRConfEnum::Lookahead`

Generates the FOLLOW set for the grammar and performs one-symbol lookahead during parsing. This behavior may change in the future.

Prevents a reduction if the next symbol is of the same type. This may be useful for grammars with repeated elements.
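
The effect on repeated elements can be sketched with a toy shift-reduce pass. This is a simplified model under stated assumptions, not the library's parser: the grammar `number := digit+`, the function `toy_parse`, and the single-character symbols (`d` for a digit, `N` for a reduced `number`) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy shift-reduce pass for the hypothetical grammar `number := digit+`.
// Input symbols: 'd' is a digit, anything else is a separator.
// Returns the final stack, where 'N' marks a reduced `number`.
std::string toy_parse(const std::string& input, bool lookahead) {
    std::string stack;
    for (std::size_t i = 0; i < input.size(); ++i) {
        stack.push_back(input[i]);      // shift
        if (input[i] != 'd') continue;  // only a digit can end a number
        // With lookahead, postpone the reduction while the next symbol
        // is of the same type (another digit).
        if (lookahead && i + 1 < input.size() && input[i + 1] == 'd') continue;
        // Reduce: replace the trailing run of digits with one `number`.
        while (!stack.empty() && stack.back() == 'd') stack.pop_back();
        stack.push_back('N');
    }
    return stack;
}

void lookahead_example() {
    // Without lookahead every digit is reduced on its own.
    assert(toy_parse("ddd+dd", false) == "NNN+NN");
    // With lookahead each run of digits becomes a single number.
    assert(toy_parse("ddd+dd", true) == "N+N");
}
```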

### `SRConfEnum::ReducibilityChecker`

Generates an RC(1) component which checks whether a symbol can be reduced on the next step. For a match `m`, it checks whether at least one parent symbol can be reduced up to the current stack position.

The RC(1) routine keeps track of the current context, which is the current recursion depth of each rule. The major limitation of this algorithm is that it cannot discriminate between rules with equal prefixes.

Note: RC(1) performs an additional descent for each related type of the match, which significantly increases the runtime cost.

## Lexer configuration

### `LexerConfEnum::Legacy`

A simple lexer which cannot handle duplicate terminals or terminal ranges. Enabled by default.

### `LexerConfEnum::AdvancedLexer`

A lexer with support for duplicate terminals: each terminal is uniquely identified by its string and rule. It has a slightly higher runtime cost but is preferred over the legacy lexer.
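
To make the "string and rule" identification concrete, here is a minimal sketch in which the same terminal text maps to different ids depending on the owning rule. The type `TerminalKey`, the helper functions, and the rule names (`member_list`, `array_items`, `member`) are hypothetical and chosen only to echo the JSON `,` example; the library's internal representation may look quite different.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: terminal text together with the rule that owns it.
using TerminalKey = std::pair<std::string, std::string>;

// Build a small table in which ',' appears under two different rules.
std::map<TerminalKey, int> build_terminal_ids() {
    std::map<TerminalKey, int> ids;
    ids.emplace(TerminalKey{",", "member_list"}, 0);
    ids.emplace(TerminalKey{",", "array_items"}, 1);
    ids.emplace(TerminalKey{":", "member"}, 2);
    return ids;
}

// Look up the id of a terminal within a given rule (throws if absent).
int terminal_id(const std::map<TerminalKey, int>& ids,
                const std::string& text, const std::string& rule) {
    return ids.at(TerminalKey{text, rule});
}

void terminal_key_example() {
    const auto ids = build_terminal_ids();
    // The same text ',' resolves to distinct terminals in distinct rules.
    assert(terminal_id(ids, ",", "member_list") != terminal_id(ids, ",", "array_items"));
}
```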

### `LexerConfEnum::HandleDuplicates`

Manages duplicate terminals and terminal ranges. Related types of duplicate terminals are merged into one, while ranges are split into sub-ranges. Imposes significant compile-time overhead.
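
Splitting overlapping ranges into sub-ranges can be sketched as follows. This is one common way to do it, written as ordinary runtime code for readability; the names `CharRange` and `split_ranges` are hypothetical, and the library's compile-time implementation is not shown in this document and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using CharRange = std::pair<char, char>; // inclusive [first, second]

// Split possibly overlapping ranges into disjoint sub-ranges such that
// every input range is an exact union of some of the output sub-ranges.
std::vector<CharRange> split_ranges(const std::vector<CharRange>& ranges) {
    // Each range contributes two cut points: its start and one past its end.
    std::vector<int> cuts;
    for (const auto& r : ranges) {
        cuts.push_back(r.first);
        cuts.push_back(r.second + 1);
    }
    std::sort(cuts.begin(), cuts.end());
    cuts.erase(std::unique(cuts.begin(), cuts.end()), cuts.end());

    std::vector<CharRange> out;
    for (std::size_t i = 0; i + 1 < cuts.size(); ++i) {
        const int lo = cuts[i];
        const int hi = cuts[i + 1] - 1;
        // Keep the segment only if at least one input range covers it.
        const bool covered = std::any_of(ranges.begin(), ranges.end(),
            [&](const CharRange& r) { return r.first <= lo && hi <= r.second; });
        if (covered) out.emplace_back(static_cast<char>(lo), static_cast<char>(hi));
    }
    return out;
}

void split_ranges_example() {
    // [a-z] overlapping [f-k] splits into [a-e], [f-k], [l-z].
    const auto parts = split_ranges({{'a', 'z'}, {'f', 'k'}});
    assert(parts.size() == 3);
    assert(parts[1] == CharRange('f', 'k'));
}
```

After such a split, each sub-range belongs to a fixed set of rules, so a lexer can treat it like any other duplicate terminal.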

### `LexerConfEnum::HandleDupInRuntime`

Moves `LexerConfEnum::HandleDuplicates` initialization to the runtime initialization of the lexer class. Does not affect performance during the execution loop.