Edit documentation

enaix · enaix · commit d9e94e298888 · 2025-04-23T18:29:23.000+03:00
diff --git a/README.md b/README.md
@@ -76,6 +76,8 @@ constexpr auto ruleset = RulesDef(d_digit, d_number);
 
 ### Parser Initialization and Usage
 
+Parser and lexer configuration flags are described in [`docs/CONFIGURATION.md`](docs/CONFIGURATION.md).
+
 Once you have defined your grammar, you can create and use the parser:
 
 ```cpp
@@ -90,7 +92,8 @@ using TokenType = StdStr<char>; // Class used for storing a token type in runtim
 // Configure the parser with desired options
 constexpr auto conf = mk_sr_parser_conf<
     SRConfEnum::PrettyPrint,  // Enable pretty printing for debugging
-    SRConfEnum::Lookahead>(); // Enable lookahead(1)
+    SRConfEnum::Lookahead,    // Enable lookahead(1)
+    SRConfEnum::ReducibilityChecker>(); // Enable RC(1), which checks reducibility one step ahead
 
 // Initialize the lexer
 // There are two lexer types available:
@@ -103,7 +106,8 @@ auto legacy_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConf
 // Use this for complex grammars where the same token may appear in different rules
 // (e.g., in JSON grammar where ',' appears in both object members and other contexts)
 
-// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes compile-time overhead on grammars with a high number of terminals
+// The HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes significant compile-time overhead on grammars with a high number of terminals
+// The HandleDupInRuntime flag moves the handling of symbol intersections to lexer initialization at runtime
 
 auto advanced_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates>());
 
diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md
@@ -0,0 +1,41 @@
+# Configuration options
+
+## Parser configuration
+
+During parser initialization, configuration enum flags may be passed.
+
+### `SRConfEnum::PrettyPrint`
+
+Enables rough, unpolished logging of grammar info to stdout. This will be changed in the future.
+
+### `SRConfEnum::Lookahead`
+
+Generates the FOLLOW set for the grammar and performs one-symbol lookahead during parsing. This behavior may change in the future.
+
+Prevents a reduction if the next symbol is of the same type. May be useful in cases with repeated elements.
+
+### `SRConfEnum::ReducibilityChecker`
+
+Generates an RC(1) component which checks whether a symbol can be reduced on the next step. For a given match `m`, it checks whether at least one parent symbol can be reduced up to the current stack position.
+
+The RC(1) routine keeps track of the current context, which is the current recursion depth of each rule. The major limitation of this algorithm is that it cannot discriminate between rules with equal prefixes.
+
+Note: RC(1) performs an additional descent for each related type of the match, which significantly increases the runtime cost.
+
+## Lexer configuration
+
+### `LexerConfEnum::Legacy`
+
+A simple lexer which cannot handle duplicate terminals or terminal ranges. Enabled by default.
+
+### `LexerConfEnum::AdvancedLexer`
+
+A lexer with support for duplicate terminals: each terminal is uniquely identified by its string and rule. Has a slightly higher runtime cost, but is preferred over the legacy lexer.
+
+### `LexerConfEnum::HandleDuplicates`
+
+Manages duplicate terminals and terminal ranges. Related types of duplicate terminals are merged into one, while ranges are split into sub-ranges. Imposes significant compile-time overhead.
+
+### `LexerConfEnum::HandleDupInRuntime`
+
+Moves `LexerConfEnum::HandleDuplicates` initialization to the runtime initialization of the lexer class. Does not affect performance during the execution loop.
diff --git a/docs/README.md b/docs/README.md
@@ -1,3 +1,7 @@
+# Documentation
+
+- [Configuration](CONFIGURATION.md)
+
 ## Feature progress
 
 ## Lexer
@@ -7,7 +11,7 @@
 ## Global
 
 - [X] Implement `TermsRange` operator with the new lexer
-- [ ] Verify that ranges intersections work correctly
+- [X] Verify that range intersections work correctly
 - [ ] Add unicode (wchar_t) support
 - [ ] **Analyze how to minimize JIT compilation time with Cling**
 - [ ] Add reverse baking operation for constructing a grammar from Bakery, add support for SuperCFG to build itself
diff --git a/examples/json.cpp b/examples/json.cpp
@@ -80,7 +80,10 @@ int main()
 
     // Initialize the tokenizer
     //LexerLegacy<VStr, TokenType> lexer(ruleset);
-    auto lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates, LexerConfEnum::HandleDupInRuntime>());
+    auto lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<
+        LexerConfEnum::AdvancedLexer,    // Enable advanced lexer
+        LexerConfEnum::HandleDuplicates, // Handle duplicate tokens
+        LexerConfEnum::HandleDupInRuntime>());
 
     // Create the shift-reduce parser
     // TreeNode<VStr> is the AST class