Semantic analyzer is an open source semantic analyzer for programming languages
that makes it easy to build your own efficient compilers.

## 🌀 What is the library for and what tasks does it solve

Creating a compiler for a programming language is a process that involves several key
stages. Most commonly these are:

▶️ **Lexical Analysis (Lexer)**: This stage involves breaking down the input stream
of characters into a series of tokens. Tokens are the atomic elements of the programming language, such as identifiers, keywords, operators, etc.

▶️ **Syntax Analysis (Parsing)**: At this stage, the tokens obtained in the previous
stage are grouped according to the grammar rules of the programming language. The result
of this process is an **Abstract Syntax Tree (AST)**, which represents a hierarchical structure of the code.

⏩ **Semantic Analysis**: This stage involves checking the semantic correctness of the code. This can include
type checking, scope verification of variables, etc.

▶️ **Intermediate Code Optimization**: At this stage, the compiler tries to improve the intermediate representation of the code to make it more efficient.
This can include dead code elimination, expression simplification, etc.

▶️ **Code Generation**: This is the final stage where the compiler transforms the optimized intermediate representation (IR) into
machine code specific to the target architecture.

This library represents the **Semantic Analysis** stage.
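
To make the stage boundary concrete, here is a minimal, hypothetical sketch in plain Rust — it is **not** this crate's API; the `Ty`/`Expr` types and the `check` function are invented for illustration. It shows what a semantic-analysis pass conceptually does: walk an already-built AST and reject semantically invalid code, such as an ill-typed expression.

```rust
// Invented toy types for illustration only (not the crate's AST).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty { Int, Bool }

enum Expr {
    IntLit(i64),
    BoolLit(bool),
    Add(Box<Expr>, Box<Expr>),           // both operands must be Int
    If(Box<Expr>, Box<Expr>, Box<Expr>),  // condition must be Bool, branches must agree
}

// A tiny semantic check: infer the type of an expression or report an error.
fn check(expr: &Expr) -> Result<Ty, String> {
    match expr {
        Expr::IntLit(_) => Ok(Ty::Int),
        Expr::BoolLit(_) => Ok(Ty::Bool),
        Expr::Add(lhs, rhs) => match (check(lhs)?, check(rhs)?) {
            (Ty::Int, Ty::Int) => Ok(Ty::Int),
            (l, r) => Err(format!("cannot add {l:?} and {r:?}")),
        },
        Expr::If(cond, then_e, else_e) => {
            if check(cond)? != Ty::Bool {
                return Err("if-condition must be Bool".into());
            }
            let (t, e) = (check(then_e)?, check(else_e)?);
            if t == e { Ok(t) } else { Err("if-branches have different types".into()) }
        }
    }
}

fn main() {
    let ok = Expr::Add(Box::new(Expr::IntLit(1)), Box::new(Expr::IntLit(2)));
    let bad = Expr::Add(Box::new(Expr::IntLit(1)), Box::new(Expr::BoolLit(true)));
    let branchy = Expr::If(
        Box::new(Expr::BoolLit(true)),
        Box::new(Expr::IntLit(1)),
        Box::new(Expr::IntLit(2)),
    );
    assert_eq!(check(&ok), Ok(Ty::Int));
    assert!(check(&bad).is_err());
    assert_eq!(check(&branchy), Ok(Ty::Int));
}
```

In spirit, the library performs this kind of walk over a much richer AST and records the results in the Semantic State Tree described below.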

### 🌻 Features

✅ **Name Binding and Scope Checking**: The analyzer verifies that all variables, constants, and functions are declared before they're used,
and that they're used within their scope. It also checks for name collisions, where variables, constants, functions, or types in the same scope share the same name.

✅ **Checking Function Calls**: The analyzer verifies that functions are called with the correct number of parameters and that the types of
arguments match the types expected by the function.

✅ **Scope Rules**: Checks that variables, functions, constants, and types are used within their scope and are available in the current visibility scope.

✅ **Type Checking**: The analyzer checks that operations are performed on compatible types for expressions, functions, constants, and bindings.
It is the process of verifying that the types of expressions are consistent with how they are used in their context.

✅ **Flow Control Checking**: The analyzer checks that control flow statements (if-else, loop, return, break, continue) are used correctly.
Condition expressions are supported, and their correctness is checked.

✅ **Building the Symbol Table**: The analyzer uses a symbol table as the data structure for keeping track of
symbols (variables, functions, constants) in the source code. Each entry in the symbol table contains the symbol's name, type, the scope of the related block state, and other relevant information. A simplified sketch of such a table follows this list.
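
As referenced above, here is a simplified, hypothetical sketch of a scoped symbol table. The `SymbolTable` and `Symbol` types are invented for illustration and are not the crate's actual data structures; the real entries carry much more information (type, block state, etc.). It demonstrates name binding, collision detection within a scope, shadowing in nested scopes, and innermost-first lookup.

```rust
use std::collections::HashMap;

// Invented illustrative types: a stack of scopes, each mapping names to symbols.
#[derive(Debug)]
struct Symbol {
    ty: String,
}

struct SymbolTable {
    scopes: Vec<HashMap<String, Symbol>>,
}

impl SymbolTable {
    fn new() -> Self {
        Self { scopes: vec![HashMap::new()] }
    }
    fn enter_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }
    fn exit_scope(&mut self) {
        self.scopes.pop();
    }
    /// Declaring the same name twice in one scope is a name collision.
    fn declare(&mut self, name: &str, ty: &str) -> Result<(), String> {
        let scope = self.scopes.last_mut().expect("at least one scope");
        if scope.contains_key(name) {
            return Err(format!("`{name}` is already declared in this scope"));
        }
        scope.insert(name.to_string(), Symbol { ty: ty.to_string() });
        Ok(())
    }
    /// Lookup walks from the innermost scope outwards.
    fn resolve(&self, name: &str) -> Option<&Symbol> {
        self.scopes.iter().rev().find_map(|s| s.get(name))
    }
}

fn main() {
    let mut table = SymbolTable::new();
    table.declare("x", "i64").unwrap();
    table.enter_scope();
    table.declare("x", "bool").unwrap(); // shadowing in an inner scope is allowed
    assert_eq!(table.resolve("x").unwrap().ty, "bool");
    table.exit_scope();
    assert_eq!(table.resolve("x").unwrap().ty, "i64");
    assert!(table.declare("x", "u8").is_err()); // collision in the same scope
}
```

A stack of hash maps keeps the logic simple: leaving a block pops its scope, so out-of-scope names disappear automatically.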

### 🌳 Semantic State Tree

The result of executing and passing all stages of the semantic analyzer is a **Semantic State Tree**.
This can be used for Intermediate Code Generation, for further passes of
semantic tree optimizations, linting, and backend codegen (like LLVM) to the target machine.

#### 🌲 Structure of Semantic State Tree

- **blocks state** and related block state child branches. It's a basic
entity for scopes: variables, blocks (function, if, loop).
However, parent elements cannot access child elements, which effectively limits the visibility scope.

All of this source data can be used as an Intermediate Representation for further optimizations and compiler codegen.

### 🧺 Subset of programming languages

The input parameter for the analyzer is a predefined
AST (abstract syntax tree). As a library for building AST and the only dependency

analysis and source code parsing, it is recommended to use: [nom is a parser com

The AST represents a **Turing complete** programming language and contains all the elements necessary for this.
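
For illustration only, here is a tiny parser written with the recommended nom crate (assuming nom 7.x). The `let_binding` function and the `let` syntax are invented for this sketch and are not part of the analyzer or its AST; it just shows the style of parser that can produce tokens and AST nodes for the analyzer to consume.

```rust
use nom::{
    bytes::complete::tag,
    character::complete::{alphanumeric1, multispace1},
    sequence::preceded,
    IResult,
};

/// Parses a hypothetical `let <name>` binding and returns the identifier.
fn let_binding(input: &str) -> IResult<&str, &str> {
    preceded(tag("let"), preceded(multispace1, alphanumeric1))(input)
}

fn main() {
    // The identifier is extracted; the rest of the input is left for further parsing.
    assert_eq!(let_binding("let answer = 42;"), Ok((" = 42;", "answer")));
}
```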

## 🛋️ Examples

- 🔎 There is an example implementation in a separate project: [💾 Toy Codegen](https://github.com/mrLSD/toy-codegen).
The project uses the `SemanticStack` results and converts them into **Code Generation** logic, which clearly shows the
possibilities of using the `semantic-analyzer-rs` `SemanticStackContext` results. LLVM is used as a
backend, with [inkwell](https://github.com/TheDan64/inkwell) as a library for LLVM codegen, and the result is compiled into an executable
program. The source of data is the AST structure itself (see the minimal inkwell sketch below).
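
For orientation, here is a minimal inkwell sketch (assuming inkwell 0.4+ with a matching `llvm*` cargo feature and LLVM installed). It is **not** Toy Codegen's actual code and does not consume the analyzer's output; it only shows the kind of LLVM IR emission such a backend performs.

```rust
use inkwell::context::Context;

fn main() {
    // Build a module with a single function: fn main() -> i64 { 42 }
    let context = Context::create();
    let module = context.create_module("toy");
    let builder = context.create_builder();

    let i64_type = context.i64_type();
    let fn_type = i64_type.fn_type(&[], false);
    let function = module.add_function("main", fn_type, None);
    let entry = context.append_basic_block(function, "entry");

    builder.position_at_end(entry);
    let answer = i64_type.const_int(42, false);
    builder.build_return(Some(&answer)).unwrap();

    // Print the generated LLVM IR to stderr.
    module.print_to_stderr();
}
```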