Commit 998853d

committed
Move code examples into docs
1 parent d9e94e2 commit 998853d

4 files changed

+240
-221
lines changed

README.md

Lines changed: 9 additions & 220 deletions
Original file line number | Diff line number | Diff line change
@@ -76,73 +76,17 @@ constexpr auto ruleset = RulesDef(d_digit, d_number);
7676

7777
### Parser Initialization and Usage
7878

79+
See [`docs/USAGE.md`](docs/USAGE.md)
80+
7981
Parser/lexer configuration flags are described [in `docs/CONFIGURATION.md`](docs/CONFIGURATION.md)
8082

81-
Once you have defined your grammar, you can create and use the parser:
83+
## Symbols and operators
8284

83-
```cpp
84-
// ...
85+
`Term(name)` is a terminal symbol. Currently, only single-character terminals are fully supported.
8586

86-
// rules definition
87-
88-
// Define string container types for your parser
89-
using VStr = StdStr<char>; // Variable string class inherited from std::string<TChar>
90-
using TokenType = StdStr<char>; // Class used for storing a token type in runtime
91-
92-
// Configure the parser with desired options
93-
constexpr auto conf = mk_sr_parser_conf<
94-
SRConfEnum::PrettyPrint, // Enable pretty printing for debugging
95-
SRConfEnum::Lookahead, // Enable lookahead(1)
96-
SRConfEnum::ReducibilityChecker>(); // Enable RC(1), which checks for reducibility for one step ahead
97-
98-
// Initialize the lexer
99-
// There are two lexer types available:
100-
101-
// 1. Legacy Lexer - simpler but with limitations
102-
// Use this for simple grammars where tokens don't appear in multiple rules
103-
auto legacy_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::Legacy>());
104-
105-
// 2. Advanced Lexer - more powerful but slightly slower
106-
// Use this for complex grammars where the same token may appear in different rules
107-
// (e.g., in JSON grammar where ',' appears in both object members and other contexts)
108-
109-
// HandleDuplicates flag enables terms range support and tokens which are present in >2 rules at once. Imposes significant compile-time overhead on grammars with high number of terminals
110-
// HandleDupInRuntime flag moves symbols intersections handling to the lexer initialization in runtime
111-
112-
auto advanced_lexer = make_lexer<VStr, TokenType>(ruleset, mk_lexer_conf<LexerConfEnum::AdvancedLexer, LexerConfEnum::HandleDuplicates>());
113-
114-
115-
// Create the shift-reduce parser
116-
// TreeNode<VStr> is the AST class
117-
auto parser = make_sr_parser<VStr, TokenType, TreeNode<VStr>>(ruleset, advanced_lexer, conf);
118-
119-
VStr input("12345");
120-
bool ok;
121-
122-
// Tokenize the input
123-
auto tokens = advanced_lexer.run(input, ok);
124-
125-
if (ok) {
126-
// Create a parse tree
127-
TreeNode<VStr> tree;
128-
129-
// Parse the tokens with 'number' being the root
130-
ok = parser.run(tree, number, tokens);
131-
132-
if (ok) {
133-
// Process the parse tree
134-
tree.traverse([&](const auto& node, std::size_t depth) {
135-
// Print the tree structure
136-
for (std::size_t i = 0; i < depth; i++)
137-
std::cout << "| ";
138-
std::cout << node.name << " (" << node.nodes.size()
139-
<< " elems) : " << node.value << std::endl;
140-
});
141-
}
142-
}
143-
```
87+
`TermsRange(start, end)` is a range of terminals that covers every character in `[start, end]`, iterated in lexicographic order. Note that the exact order depends on the char type.
14488

145-
## Operators
89+
`NTerm(name)` is a nonterminal with a unique name, which describes its type.
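
As an illustration of the `TermsRange` note, here is a minimal standalone sketch of what iterating an inclusive character range can look like at compile time. It does not use SuperCFG, and `expand_range` is a hypothetical helper introduced only for this example; how the library expands a range internally is not shown in this commit.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not part of SuperCFG: expands the inclusive range
// [Start, End] into an array by stepping through the code unit values.
template<class TChar, TChar Start, TChar End>
constexpr auto expand_range() {
    static_assert(Start <= End, "range must contain at least one character");
    constexpr std::size_t n = static_cast<std::size_t>(End - Start) + 1;
    std::array<TChar, n> out{};
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<TChar>(Start + static_cast<TChar>(i));
    return out;
}

// The range is inclusive on both ends, so '0'..'9' yields ten entries.
constexpr auto digits = expand_range<char, '0', '9'>();
static_assert(digits.size() == 10);
static_assert(digits.front() == '0' && digits.back() == '9');
```

Because the expansion steps through numeric code unit values, the characters it produces depend on the character type and encoding: `'a'..'z'` is contiguous in ASCII, but not in every encoding (EBCDIC, for instance).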
14690

14791
### Basic Operators
14892

@@ -170,163 +114,8 @@ if (ok) {
170114

171115
## Grammar serialization
172116

173-
SuperCFG can serialize grammar rules to a custom EBNF-like notation at compile-time. The serialization is done through the `.bake()` method:
174-
175-
```cpp
176-
// Define your grammar rules class
177-
constexpr EBNFBakery rules;
178-
179-
// Define non-terminals and terminals
180-
constexpr auto nozero = NTerm(cs<"digit excluding zero">());
181-
constexpr auto d_nozero = Define(nozero, Alter(
182-
Term(cs<"1">()), Term(cs<"2">()), Term(cs<"3">()),
183-
Term(cs<"4">()), Term(cs<"5">()), Term(cs<"6">()),
184-
Term(cs<"7">()), Term(cs<"8">()), Term(cs<"9">())
185-
));
186-
187-
// Reference other non-terminals in definitions
188-
constexpr auto d_digit = Define(NTerm(cs<"digit">()),
189-
Alter(Term(cs<"0">()), nozero)
190-
);
191-
192-
// Combine rules and serialize to EBNF-like notation
193-
constexpr auto root = RulesDef(d_nozero, d_digit).bake(rules);
194-
195-
// Now root contains a compile-time string with the serialized grammar:
196-
// "digit excluding zero = \"1\" | \"2\" | \"3\" | \"4\" | \"5\" | \"6\" | \"7\" | \"8\" | \"9\" ;\n"
197-
// "digit = \"0\" | digit excluding zero ;"
198-
199-
// You can output this string at runtime:
200-
std::cout << root.c_str() << std::endl;
201-
202-
// Or even use static_assert to verify the grammar at compile-time:
203-
constexpr char expected[] = "digit excluding zero = \"1\" | \"2\" | \"3\" | \"4\" | \"5\" | \"6\" | \"7\" | \"8\" | \"9\" ;\n"
204-
"digit = \"0\" | digit excluding zero ;";
205-
static_assert(cs<expected>() == root, "Grammar has changed!");
206-
```
207-
208-
The serialization also supports automatic operators grouping by analyzing their order.
209-
210-
## Complex Examples
211-
212-
### Calculator Grammar
213-
214-
Here's a more complex example for a simple calculator grammar that supports basic arithmetic operations and parentheses:
215-
216-
```cpp
217-
// Define basic number components
218-
constexpr auto digit = NTerm(cs<"digit">());
219-
constexpr auto d_digit = Define(digit, Repeat(Alter(Term(cs<"1">()), Term(cs<"2">()), /* ... */)));
220-
221-
constexpr auto number = NTerm(cs<"number">());
222-
constexpr auto d_number = Define(number, Repeat(digit));
223-
224-
// Define arithmetic operations
225-
constexpr auto add = NTerm(cs<"add">());
226-
constexpr auto sub = NTerm(cs<"sub">());
227-
constexpr auto mul = NTerm(cs<"mul">());
228-
constexpr auto div = NTerm(cs<"div">());
229-
constexpr auto op = NTerm(cs<"op">());
230-
constexpr auto arithmetic = NTerm(cs<"arithmetic">());
231-
constexpr auto group = NTerm(cs<"group">());
232-
233-
// No operator order defined: grammar is ambiguous
234-
constexpr auto d_add = Define(add, Concat(op, Term(cs<"+">()), op));
235-
constexpr auto d_sub = Define(sub, Concat(op, Term(cs<"-">()), op));
236-
constexpr auto d_mul = Define(mul, Concat(op, Term(cs<"*">()), op));
237-
constexpr auto d_div = Define(div, Concat(op, Term(cs<"/">()), op));
238-
239-
// Define grouping and operator rules
240-
constexpr auto d_group = Define(group, Concat(Term(cs<"(">()), op, Term(cs<")">())));
241-
constexpr auto d_arithmetic = Define(arithmetic, Alter(add, sub, mul, div));
242-
constexpr auto d_op = Define(op, Alter(number, arithmetic, group));
243-
244-
// Combine all rules
245-
constexpr auto ruleset = RulesDef(d_digit, d_number, d_add, d_sub, d_mul, d_div,
246-
d_arithmetic, d_op, d_group);
247-
```
248-
249-
### JSON Grammar
250-
251-
Here's a complete JSON grammar example that supports objects, arrays, strings, numbers, booleans, and null values:
252-
253-
```cpp
254-
// Define character set for strings
255-
constexpr char s[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ _-.!";
256-
257-
// Define basic components
258-
constexpr auto character = NTerm(cs<"char">());
259-
constexpr auto digit = NTerm(cs<"digit">());
260-
constexpr auto number = NTerm(cs<"number">());
261-
constexpr auto boolean = NTerm(cs<"bool">());
262-
constexpr auto json = NTerm(cs<"json">());
263-
constexpr auto object = NTerm(cs<"object">());
264-
constexpr auto null = NTerm(cs<"null">());
265-
constexpr auto string = NTerm(cs<"string">());
266-
constexpr auto array = NTerm(cs<"array">());
267-
constexpr auto member = NTerm(cs<"member">());
268-
269-
// Define character set using build_range helper
270-
constexpr auto d_character = Define(character, Repeat(build_range(cs<s>(),
271-
[](const auto&... str){ return Alter(Term(str)...); },
272-
std::make_index_sequence<sizeof(s)-1>{})));
273-
274-
// Define number components
275-
constexpr auto d_digit = Define(digit, Repeat(Alter(
276-
Term(cs<"1">()), Term(cs<"2">()), Term(cs<"3">()),
277-
Term(cs<"4">()), Term(cs<"5">()), Term(cs<"6">()),
278-
Term(cs<"7">()), Term(cs<"8">()), Term(cs<"9">()),
279-
Term(cs<"0">())
280-
)));
281-
constexpr auto d_number = Define(number, Repeat(digit));
282-
283-
// Define JSON value types
284-
constexpr auto d_boolean = Define(boolean, Alter(Term(cs<"true">()), Term(cs<"false">())));
285-
constexpr auto d_null = Define(null, Term(cs<"null">()));
286-
constexpr auto d_string = Define(string, Concat(Term(cs<"\"">()), Repeat(character), Term(cs<"\"">())));
287-
288-
// Define array and object structures
289-
constexpr auto d_array = Define(array, Concat(
290-
Term(cs<"[">()),
291-
json,
292-
Repeat(Concat(Term(cs<",">()), json)),
293-
Term(cs<"]">())
294-
));
295-
296-
constexpr auto d_member = Define(member, Concat(
297-
json,
298-
Term(cs<":">()),
299-
json
300-
));
301-
302-
constexpr auto d_object = Define(object, Concat(
303-
Term(cs<"{">()),
304-
member,
305-
Repeat(Concat(Term(cs<",">()), member)),
306-
Term(cs<"}">())
307-
));
308-
309-
// Define the root JSON rule
310-
constexpr auto d_json = Define(json, Alter(
311-
array, boolean, null, number, object, string
312-
));
313-
314-
// Combine all rules
315-
constexpr auto ruleset = RulesDef(
316-
d_character, d_digit, d_number, d_boolean,
317-
d_null, d_string, d_array, d_member,
318-
d_object, d_json
319-
);
320-
```
321-
322-
This JSON grammar supports:
323-
- Numbers (e.g., `42`, `123`)
324-
- Strings (e.g., `"hello"`, `"world"`)
325-
- Booleans (`true`, `false`)
326-
- Null values (`null`)
327-
- Arrays (e.g., `[1,2,3]`, `["a","b","c"]`)
328-
- Objects (e.g., `{"key":"value"}`, `{"a":1,"b":2}`)
329-
- Nested structures (e.g., `{"a":[1,2,3],"b":{"c":"d"}}`)
117+
SuperCFG can serialize grammar rules to a custom EBNF-like notation at compile time. See [`docs/USAGE.md`](docs/USAGE.md#grammar-serialization).
330118

331-
JSON parser without escape characters and whitespaces can be found in `examples/json.cpp`. Example strings: `42`, `"hello"`, `[1,2,3]`, `[1,["abc",2],["d","e","f"]]`, `{"a":123,"b":456}`, `{"a":[1,2,"asdf"];"b":["q","w","e"]}`, `{"a":{"b":42,"c":"abc"};"qwerty":{1:"uiop",42:10}}`
119+
## Examples
332120

121+
Some examples can be found in [`docs/EXAMPLES.md`](docs/EXAMPLES.md). The code for the JSON example is located in `examples/json.cpp`.

docs/EXAMPLES.md

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
# Examples
2+
3+
## Calculator Grammar
4+
5+
Here's an example of a simple calculator grammar that supports basic arithmetic operations and parenthesized groups:
6+
7+
```cpp
8+
// Define basic number components
9+
constexpr auto digit = NTerm(cs<"digit">());
10+
constexpr auto d_digit = Define(digit, Repeat(Alter(Term(cs<"1">()), Term(cs<"2">()), /* ... */)));
11+
12+
constexpr auto number = NTerm(cs<"number">());
13+
constexpr auto d_number = Define(number, Repeat(digit));
14+
15+
// Define arithmetic operations
16+
constexpr auto add = NTerm(cs<"add">());
17+
constexpr auto sub = NTerm(cs<"sub">());
18+
constexpr auto mul = NTerm(cs<"mul">());
19+
constexpr auto div = NTerm(cs<"div">());
20+
constexpr auto op = NTerm(cs<"op">());
21+
constexpr auto arithmetic = NTerm(cs<"arithmetic">());
22+
constexpr auto group = NTerm(cs<"group">());
23+
24+
// No operator order defined: grammar is ambiguous
25+
constexpr auto d_add = Define(add, Concat(op, Term(cs<"+">()), op));
26+
constexpr auto d_sub = Define(sub, Concat(op, Term(cs<"-">()), op));
27+
constexpr auto d_mul = Define(mul, Concat(op, Term(cs<"*">()), op));
28+
constexpr auto d_div = Define(div, Concat(op, Term(cs<"/">()), op));
29+
30+
// Define grouping and operator rules
31+
constexpr auto d_group = Define(group, Concat(Term(cs<"(">()), op, Term(cs<")">())));
32+
constexpr auto d_arithmetic = Define(arithmetic, Alter(add, sub, mul, div));
33+
constexpr auto d_op = Define(op, Alter(number, arithmetic, group));
34+
35+
// Combine all rules
36+
constexpr auto ruleset = RulesDef(d_digit, d_number, d_add, d_sub, d_mul, d_div,
37+
d_arithmetic, d_op, d_group);
38+
```
39+
40+
## JSON Grammar
41+
42+
Here's a complete JSON grammar example that supports objects, arrays, strings, numbers, booleans, and null values:
43+
44+
```cpp
45+
// Define character set for strings
46+
constexpr char s[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ _-.!";
47+
48+
// Define basic components
49+
constexpr auto character = NTerm(cs<"char">());
50+
constexpr auto digit = NTerm(cs<"digit">());
51+
constexpr auto number = NTerm(cs<"number">());
52+
constexpr auto boolean = NTerm(cs<"bool">());
53+
constexpr auto json = NTerm(cs<"json">());
54+
constexpr auto object = NTerm(cs<"object">());
55+
constexpr auto null = NTerm(cs<"null">());
56+
constexpr auto string = NTerm(cs<"string">());
57+
constexpr auto array = NTerm(cs<"array">());
58+
constexpr auto member = NTerm(cs<"member">());
59+
60+
// Define character set using build_range helper
61+
constexpr auto d_character = Define(character, Repeat(build_range(cs<s>(),
62+
[](const auto&... str){ return Alter(Term(str)...); },
63+
std::make_index_sequence<sizeof(s)-1>{})));
64+
65+
// Define number components
66+
constexpr auto d_digit = Define(digit, Repeat(Alter(
67+
Term(cs<"1">()), Term(cs<"2">()), Term(cs<"3">()),
68+
Term(cs<"4">()), Term(cs<"5">()), Term(cs<"6">()),
69+
Term(cs<"7">()), Term(cs<"8">()), Term(cs<"9">()),
70+
Term(cs<"0">())
71+
)));
72+
constexpr auto d_number = Define(number, Repeat(digit));
73+
74+
// Define JSON value types
75+
constexpr auto d_boolean = Define(boolean, Alter(Term(cs<"true">()), Term(cs<"false">())));
76+
constexpr auto d_null = Define(null, Term(cs<"null">()));
77+
constexpr auto d_string = Define(string, Concat(Term(cs<"\"">()), Repeat(character), Term(cs<"\"">())));
78+
79+
// Define array and object structures
80+
constexpr auto d_array = Define(array, Concat(
81+
Term(cs<"[">()),
82+
json,
83+
Repeat(Concat(Term(cs<",">()), json)),
84+
Term(cs<"]">())
85+
));
86+
87+
constexpr auto d_member = Define(member, Concat(
88+
json,
89+
Term(cs<":">()),
90+
json
91+
));
92+
93+
constexpr auto d_object = Define(object, Concat(
94+
Term(cs<"{">()),
95+
member,
96+
Repeat(Concat(Term(cs<",">()), member)),
97+
Term(cs<"}">())
98+
));
99+
100+
// Define the root JSON rule
101+
constexpr auto d_json = Define(json, Alter(
102+
array, boolean, null, number, object, string
103+
));
104+
105+
// Combine all rules
106+
constexpr auto ruleset = RulesDef(
107+
d_character, d_digit, d_number, d_boolean,
108+
d_null, d_string, d_array, d_member,
109+
d_object, d_json
110+
);
111+
```
112+
113+
This JSON grammar supports:
114+
- Numbers (e.g., `42`, `123`)
115+
- Strings (e.g., `"hello"`, `"world"`)
116+
- Arrays (e.g., `[1,2,3]`, `["a","b","c"]`)
117+
- Objects (e.g., `{"key":"value"}`, `{"a":1,"b":2}`)
118+
- Nested structures (e.g., `{"a":[1,2,3],"b":{"c":"d"}}`)
119+
120+
A JSON parser without support for escape characters and whitespace can be found in `examples/json.cpp`. Example strings: `42`, `"hello"`, `[1,2,3]`, `[1,["abc",2],["d","e","f"]]`, `{"a":123,"b":456}`, `{"a":[1,2,"asdf"],"b":["q","w","e"]}`, `{"a":{"b":42,"c":"abc"},"qwerty":{1:"uiop",42:10}}`
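
To make the shape of the accepted strings concrete, here is a standalone recognizer sketch that mirrors the rules above by hand. It does not use SuperCFG and is not the code in `examples/json.cpp`; `Recognizer` and `accepts` are hypothetical names. It assumes that a number needs at least one digit and that a string may be empty, which the grammar above does not spell out.

```cpp
#include <cstddef>
#include <string_view>

// Hand-written predictive recognizer for the grammar sketched above
// (no whitespace, no escape sequences). Not part of SuperCFG.
struct Recognizer {
    std::string_view in;
    std::size_t pos = 0;

    constexpr bool eat(char c) {
        if (pos < in.size() && in[pos] == c) { ++pos; return true; }
        return false;
    }
    constexpr bool eat_word(std::string_view word) {
        if (in.substr(pos, word.size()) == word) { pos += word.size(); return true; }
        return false;
    }
    // Same character set as the `s` array in the grammar.
    static constexpr bool is_char(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
               c == ' ' || c == '_' || c == '-' || c == '.' || c == '!';
    }
    constexpr bool number() {  // assumption: at least one digit
        const std::size_t start = pos;
        while (pos < in.size() && in[pos] >= '0' && in[pos] <= '9') ++pos;
        return pos > start;
    }
    constexpr bool quoted() {  // assumption: "" is a valid string
        if (!eat('"')) return false;
        while (pos < in.size() && is_char(in[pos])) ++pos;
        return eat('"');
    }
    // member = json ":" json
    constexpr bool member() { return value() && eat(':') && value(); }
    // array  = "[" json   { "," json }   "]"
    // object = "{" member { "," member } "}"
    constexpr bool sequence(char open, char close, bool members) {
        if (!eat(open)) return false;
        do {
            if (!(members ? member() : value())) return false;
        } while (eat(','));
        return eat(close);
    }
    // json = array | boolean | null | number | object | string,
    // chosen by looking at the next character only (no backtracking).
    constexpr bool value() {
        if (pos >= in.size()) return false;
        switch (in[pos]) {
            case '[': return sequence('[', ']', false);
            case '{': return sequence('{', '}', true);
            case '"': return quoted();
            case 't': return eat_word("true");
            case 'f': return eat_word("false");
            case 'n': return eat_word("null");
            default:  return number();
        }
    }
};

// True when the whole input is a single value of the grammar.
constexpr bool accepts(std::string_view text) {
    Recognizer r{text};
    return r.value() && r.pos == text.size();
}

static_assert(accepts(R"([1,["abc",2],["d","e","f"]])"));
static_assert(accepts(R"({"a":[1,2,"asdf"],"b":["q","w","e"]})"));
static_assert(!accepts(R"({"a":[1,2,"asdf"];"b":["q","w","e"]})"));  // ';' is not a separator
static_assert(!accepts("[]"));  // the array rule requires at least one element
```

The last two checks follow from the rules as written: members are separated by `,` only, and both `array` and `object` require at least one element.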

docs/README.md

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,8 @@
11
# Documentation
22

3-
- [Configuration](docs/CONFIGURATION.md)
3+
- [Usage](USAGE.md)
4+
- [Configuration](CONFIGURATION.md)
5+
- [Examples](EXAMPLES.md)
46

57
## Feature progress
68
