Please check out the wiki for more documentation and lessons learned!
The main purpose of this project is to learn more about compilers. I have therefore tried to build a knowledge base in the wiki that documents not only the code but also the lessons I learned along the way. I hope this project will be helpful to others in the future, especially those who want to use the LLVM C API but struggle with the lack of documentation on the main website.
Final project for the Compiler Design class in Spring 2018. The project is to build a simple recursive descent LL(1) compiler (one token of lookahead) from scratch, without using external compiler frontend tools such as flex or bison. The file projectDescription.pdf gives more details about the project.
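To illustrate what recursive descent with one symbol of lookahead means, here is a minimal sketch in C. It is not taken from this project: the toy arithmetic grammar, the `Parser` struct, and every function name below are assumptions made for illustration only. Each grammar rule becomes one function, and each function decides what to do by inspecting only the next unconsumed character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy grammar (NOT the project's language):
 *   expr   -> term   { ('+' | '-') term }
 *   term   -> factor { ('*' | '/') factor }
 *   factor -> NUMBER | '(' expr ')'
 */
typedef struct {
    const char *src; /* remaining input; *src is the single lookahead */
    int ok;          /* cleared to 0 on the first syntax error */
} Parser;

static long parse_expr(Parser *p);

/* Skip whitespace and return the lookahead character without consuming it. */
static char peek(Parser *p) {
    while (isspace((unsigned char)*p->src)) p->src++;
    return *p->src;
}

static long parse_factor(Parser *p) {
    char c = peek(p);
    if (c == '(') {
        p->src++;                      /* consume '(' */
        long v = parse_expr(p);
        if (peek(p) == ')') p->src++;  /* consume ')' */
        else p->ok = 0;                /* missing closing parenthesis */
        return v;
    }
    if (isdigit((unsigned char)c)) {
        long v = 0;
        while (isdigit((unsigned char)*p->src)) v = v * 10 + (*p->src++ - '0');
        return v;
    }
    p->ok = 0;                         /* lookahead cannot start a factor */
    return 0;
}

static long parse_term(Parser *p) {
    long v = parse_factor(p);
    for (;;) {
        char c = peek(p);
        if (c != '*' && c != '/') return v;
        p->src++;                      /* consume the operator */
        long rhs = parse_factor(p);
        if (c == '*') v *= rhs;
        else if (rhs != 0) v /= rhs;
        else p->ok = 0;                /* treat division by zero as an error */
    }
}

static long parse_expr(Parser *p) {
    long v = parse_term(p);
    for (;;) {
        char c = peek(p);
        if (c != '+' && c != '-') return v;
        p->src++;                      /* consume the operator */
        long rhs = parse_term(p);
        v = (c == '+') ? v + rhs : v - rhs;
    }
}

/* Returns 1 and stores the value in *out if the whole input is one valid
 * expression; returns 0 (leaving *out untouched) otherwise. */
int parse_input(const char *text, long *out) {
    assert(text != NULL && out != NULL);
    Parser p = { text, 1 };
    long v = parse_expr(&p);
    if (peek(&p) != '\0') p.ok = 0;    /* trailing input is an error */
    if (p.ok) *out = v;
    return p.ok;
}
```

Calling `parse_input("2 + 3 * (4 - 1)", &v)` returns 1 and stores 11 in `v`. A real compiler front end would look ahead by one token produced by a scanner rather than by one character, and would build a tree or emit code instead of evaluating directly.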
This compiler compiles a made-up language. The file projectLanguageDescription.pdf in the root folder specifies the grammar and semantics of the language. Some example code in this language can be found in the code_gen/tests folder.
- Scanner: implemented in the scanner folder.
- Parser: the parser folder.
- Type Checking: the symbol_table, semantics and type_checking folders.
- Code Generation: the code_gen folder (current development).
- Runtime: To be implemented. However, the compiler can still be used with the lli tool.
Future goals: After this project, I'm hoping to learn more about compiler optimization.
Currently, the compiler is at the code generation phase. To build it, go into the code_gen folder and type:
$ make
clang++ main.o error.o parser.o reader.o scanner.o semantics.o symbol_table.o token.o code_gen.o `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native executionengine` -o codegen.out
This generates codegen.out, the compiler binary. The compiler takes one command-line parameter: the source file to compile. For example:
$ ./codegen.out tests/test-proc.src
Replace tests/test-proc.src with a path to a test file. Some example tests can be found in the tests folder.
To see the generated LLVM IR (which reads much like assembly), run:
$ make codegen.ll
This rule uses llvm-dis to generate the codegen.ll file, which contains the LLVM IR code.
Debugging information is displayed by calling:
void assert_scanner(const char *mesg);
void assert_parser(const char *mesg);
void assert_symbol_table(const char *mesg);
void assert_semantics(const char *mesg);
void assert_codegen(const char *mesg);
These functions are declared in code_gen/error.h. To turn debugging output on or off, uncomment or comment out their implementations in code_gen/error.c.
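As a rough sketch of how such per-phase debug functions can be switched on and off, the C fragment below uses compile-time flags instead of commenting code by hand. This is an alternative technique, not the project's actual error.c: the `DEBUG_SCANNER` and `DEBUG_PARSER` macros, the `debug_print` helper, and the `debug_lines_printed` counter are all hypothetical names introduced here. Only the two function signatures come from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical switches, one per phase: set to 0 to silence that phase. */
#define DEBUG_SCANNER 1
#define DEBUG_PARSER  0

/* Hypothetical counter, present only so the sketch is easy to check. */
static int debug_lines_printed = 0;

/* Hypothetical shared helper: writes one tagged line to stderr. */
static void debug_print(const char *phase, const char *mesg) {
    assert(phase != NULL && mesg != NULL);
    fprintf(stderr, "[%s] %s\n", phase, mesg);
    debug_lines_printed++;
}

void assert_scanner(const char *mesg) {
#if DEBUG_SCANNER
    debug_print("scanner", mesg);
#else
    (void)mesg; /* debugging disabled: do nothing */
#endif
}

void assert_parser(const char *mesg) {
#if DEBUG_PARSER
    debug_print("parser", mesg);
#else
    (void)mesg; /* debugging disabled: do nothing */
#endif
}
```

With the flags as set here, `assert_scanner("read token")` writes a line to stderr while `assert_parser("enter rule")` is a no-op. Either way, callers keep compiling and linking because the functions themselves always exist.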
Feel free to create issues and make pull requests :)