Commit d9e94e2

Edit documentation
1 parent 3c9697e commit d9e94e2

File tree

4 files changed

+56
-4
lines changed

README.md

Lines changed: 6 additions & 2 deletions
@@ -76,6 +76,8 @@ constexpr auto ruleset = RulesDef(d_digit, d_number);
7676

7777
### Parser Initialization and Usage
7878

79+
Parser/lexer configuration flags are described [in `docs/CONFIGURATION.md`](docs/CONFIGURATION.md).
80+
7981
Once you have defined your grammar, you can create and use the parser:
8082

8183
```cpp
@@ -90,7 +92,8 @@ using TokenType = StdStr<char>; // Class used for storing a token type in runtim
9092
// Configure the parser with desired options
9193
constexpr auto conf = mk_sr_parser_conf<
9294
SRConfEnum::PrettyPrint, // Enable pretty printing for debugging
93-
SRConfEnum::Lookahead>(); // Enable lookahead(1)
95+
SRConfEnum::Lookahead, // Enable lookahead(1)
96+
SRConfEnum::ReducibilityChecker>(); // Enable RC(1), which checks reducibility one step ahead
9497

9598
// Initialize the lexer
9699
// There are two lexer types available:
@@ -103,7 +106,8 @@ auto legacy_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConf
103106
// Use this for complex grammars where the same token may appear in different rules
104107
// (e.g., in JSON grammar where ',' appears in both object members and other contexts)
105108

106-
// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes compile-time overhead on grammars with a high number of terminals
109+
// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes significant compile-time overhead on grammars with a high number of terminals
110+
// HandleDupInRuntime flag moves the handling of symbol intersections to lexer initialization at runtime
107111

108112
auto advanced_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates>());
109113

docs/CONFIGURATION.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# Configuration options
2+
3+
## Parser configuration
4+
5+
During parser initialization, configuration enum flags may be passed.
6+
7+
### `SRConfEnum::PrettyPrint`
8+
9+
Enables dirty grammar info logging to stdout. This will be changed in the future.
10+
11+
### `SRConfEnum::Lookahead`
12+
13+
Generates the FOLLOW set for the grammar and performs one-symbol lookahead during parsing. This behavior may change in the future.
14+
15+
Prevents a reduction if the next symbol is of the same type. This may be useful for grammars with repeated elements.
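To illustrate the idea of postponing a reduction while the next symbol has the same type, here is a minimal, self-contained sketch. It is not the library's implementation: `may_reduce` and `group_runs` are hypothetical names, symbols are plain strings, and the FOLLOW set is not modelled at all.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): decide whether the symbol
// on top of the stack may be reduced, given the type of the next input symbol.
bool may_reduce(const std::string& top, const std::string& next)
{
    // With one-symbol lookahead, postpone the reduction while the next
    // symbol continues the same run.
    return top != next;
}

// Group consecutive symbols of the same type, reducing a run only when the
// lookahead symbol differs or the input ends.
std::vector<std::vector<std::string>> group_runs(const std::vector<std::string>& input)
{
    std::vector<std::vector<std::string>> groups;
    std::vector<std::string> stack;
    for (std::size_t i = 0; i < input.size(); ++i) {
        stack.push_back(input[i]);                    // shift
        const bool at_end = (i + 1 == input.size());
        if (at_end || may_reduce(stack.back(), input[i + 1])) {
            groups.push_back(stack);                  // reduce the whole run
            stack.clear();
        }
    }
    return groups;
}
```

For the input `digit digit op digit`, the two leading `digit` symbols end up in one group instead of being reduced one at a time.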
16+
17+
### `SRConfEnum::ReducibilityChecker`
18+
19+
Generates an RC(1) component that checks whether a symbol can be reduced on the next step. For a match `m`, it checks whether at least one parent symbol can be reduced up to the current stack position.
20+
21+
The RC(1) routine keeps track of the current context, which is the current recursion depth of each rule. The major limitation of this algorithm is that it cannot discriminate between rules with equal prefixes.
22+
23+
Note: RC(1) performs an additional descent for each related type of the match, which significantly increases the runtime cost.
24+
25+
## Lexer configuration
26+
27+
### `LexerConfEnum::Legacy`
28+
29+
A simple lexer that cannot handle duplicate terminals or terminal ranges. Enabled by default.
30+
31+
### `LexerConfEnum::AdvancedLexer`
32+
33+
A lexer with support for duplicate terminals: each terminal is uniquely identified by its string and its rule. It has a slightly higher runtime cost but is preferred over the legacy lexer.
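One way to picture "uniquely identified by its string and its rule" is a table keyed on both values. The sketch below is only an illustration under that assumption; `TerminalKey` and `TerminalTable` are hypothetical names and do not come from the library.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical key: a terminal is identified by its text and its owning rule.
using TerminalKey = std::pair<std::string, std::string>; // {text, rule}

class TerminalTable {
public:
    // Return the id for {text, rule}, assigning a fresh id on first use.
    int intern(const std::string& text, const std::string& rule)
    {
        const TerminalKey key{text, rule};
        const auto it = ids_.find(key);
        if (it != ids_.end())
            return it->second;
        const int id = static_cast<int>(ids_.size());
        ids_.emplace(key, id);
        return id;
    }

private:
    std::map<TerminalKey, int> ids_;
};
```

With this keying, the same text `,` registered under two different rules yields two distinct entries, while registering the same pair twice returns the same id.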
34+
35+
### `LexerConfEnum::HandleDuplicates`
36+
37+
Manages duplicate terminals and terminal ranges. Related types of duplicate terminals are merged into one, while ranges are split into sub-ranges. Imposes significant compile-time overhead.
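The splitting of intersecting ranges into sub-ranges can be sketched as follows. This is a plain runtime illustration of the general technique, not the library's compile-time implementation; `Range` and `split_ranges` are hypothetical names, and ranges are assumed to be inclusive ASCII character ranges.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<char, char>; // inclusive [first, second]

// Split possibly overlapping inclusive ranges into disjoint sub-ranges so
// that each sub-range lies fully inside or fully outside every input range.
std::vector<Range> split_ranges(const std::vector<Range>& ranges)
{
    // Collect cut points: a piece starts at `first`, and the following
    // piece starts at `second + 1`.
    std::vector<int> cuts;
    for (const auto& r : ranges) {
        cuts.push_back(r.first);
        cuts.push_back(r.second + 1);
    }
    std::sort(cuts.begin(), cuts.end());
    cuts.erase(std::unique(cuts.begin(), cuts.end()), cuts.end());

    std::vector<Range> out;
    for (std::size_t i = 0; i + 1 < cuts.size(); ++i) {
        const int lo = cuts[i];
        const int hi = cuts[i + 1] - 1;
        // Keep the piece only if at least one input range covers it.
        const bool covered = std::any_of(ranges.begin(), ranges.end(),
            [&](const Range& r) { return r.first <= lo && hi <= r.second; });
        if (covered)
            out.emplace_back(static_cast<char>(lo), static_cast<char>(hi));
    }
    return out;
}
```

For example, the overlapping ranges `a-f` and `d-z` are split into `a-c`, `d-f`, and `g-z`, so the shared part `d-f` becomes a sub-range of its own.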
38+
39+
### `LexerConfEnum::HandleDupInRuntime`
40+
41+
Moves `LexerConfEnum::HandleDuplicates` initialization to the runtime initialization of the lexer class. Does not affect performance during the execution loop.
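The trade-off between compile-time and runtime initialization can be illustrated with a generic C++ sketch. It assumes nothing about the library's internals: `Table`, `build_table`, `accepts_ct`, and `RuntimeInitLexer` are hypothetical names, and the table built here is only a stand-in for the real preprocessing step.

```cpp
#include <cstddef>

// Hypothetical stand-in for preprocessed lexer data.
struct Table { bool is_lower[128]; };

// Stand-in for an expensive preprocessing step: mark lowercase letters.
constexpr Table build_table()
{
    Table t{};
    for (std::size_t c = 'a'; c <= 'z'; ++c)
        t.is_lower[c] = true;
    return t;
}

// Variant 1: the table is evaluated by the compiler, so the cost is paid
// at compile time.
constexpr Table kCompileTimeTable = build_table();
static_assert(kCompileTimeTable.is_lower['q'], "table is computed at compile time");

bool accepts_ct(char c)
{
    const unsigned char u = static_cast<unsigned char>(c);
    return u < 128 && kCompileTimeTable.is_lower[u];
}

// Variant 2: the same table is built once in the constructor. Lookups in
// the main loop cost the same as in variant 1.
class RuntimeInitLexer {
public:
    RuntimeInitLexer() : table_(build_table()) {}

    bool accepts(char c) const
    {
        const unsigned char u = static_cast<unsigned char>(c);
        return u < 128 && table_.is_lower[u];
    }

private:
    Table table_;
};
```

Both variants answer lookups identically; they differ only in when the table is built.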

docs/README.md

Lines changed: 5 additions & 1 deletion
@@ -1,3 +1,7 @@
1+
# Documentation
2+
3+
- [Configuration](CONFIGURATION.md)
4+
15
## Feature progress
26

37
## Lexer
@@ -7,7 +11,7 @@
711
## Global
812

913
- [X] Implement `TermsRange` operator with the new lexer
10-
- [ ] Verify that ranges intersections work correctly
14+
- [X] Verify that ranges intersections work correctly
1115
- [ ] Add unicode (wchar_t) support
1216
- [ ] **Analyze how to minimize JIT compilation time with Cling**
1317
- [ ] Add reverse baking operation for constructing a grammar from Bakery, add support for SuperCFG to build itself

examples/json.cpp

Lines changed: 4 additions & 1 deletion
@@ -80,7 +80,10 @@ int main()
8080

8181
// Initialize the tokenizer
8282
//LexerLegacy<VStr, TokenType> lexer(ruleset);
83-
auto lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates, LexerConfEnum::HandleDupInRuntime>());
83+
auto lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<
84+
LexerConfEnum::AdvancedLexer, // Enable advanced lexer
85+
LexerConfEnum::HandleDuplicates, // Handle duplicate tokens
86+
LexerConfEnum::HandleDupInRuntime>());
8487

8588
// Create the shift-reduce parser
8689
// TreeNode<VStr> is the AST class
