Translator

Java translator for code written in an arbitrary 'P' language following a LL(1) defined grammar.

Project description

The aim of the project is to develop a Java translator for code written in an arbitrary 'P' language.

Components needed to build a translator:

Lexical scanner

The job of this component is to read a text sequence and produce the associated list of token; a token correspond to a lexical unit like a number, an ID, a relational operator, keyword etc. Token of the language are shown in the following table:

Token	Pattern	Name
Numbers	Numeric constant	256
Identifier	Letter followed by letters and numbers	257
Relop	Relational operator (<,>,<=,>=,==,<>)	258
Assignment	assign	259
To	to	260
If	if	261
Else	else	262
While	while	263
Begin	begin	264
End	end	265
Print	print	266
Read	read	267
Disjunction	//	268
Conjunction	&&	269
Negation	!	33
Parentheses left	(	40
Parentheses right	)	41
Brace left	{	123
Brace right	}	125
Addition	+	43
Subtraction	-	45
Multiplication	*	42
Division	/	47
Semicolon	;	59
Colon	,	44
EOF	end of input	-1

Identifiers correspond to the following regular expression:

(a + ... + z + A + ... + Z)(a + ... + z + A + ... + Z + 0 + ... + 9)*

while numbers correspond to the following regular expression:

0 + (1 + ... + 9)(0 + ... + 9)*

The lexical scanner must ignore all white space characters, single and multiple line Java comments, but should report illegal characters like '#' or '@'. The output of the lexical scanner must be in the form of ... .

For example, for input "assign 300 to d" the lexical scanner output is:

<259,assign> <256,300> <260,to> <257,d> <59> <-1>

The lexical scanner is not able to recognize the structure of commands, like 5+;) or (34+26( - (2+15-( 27. These once will be accepted by the lexical scanner.

Parser top down

This component is the deterministic "recognizer" of strings generated by the given grammar. It is based on the following productions set:

Legend

P stands for < prog >
S stands for < statlist >
S' stands for < statlistp >
S" stands for < stat >
S"' stands for < statp >
I stands for < idlist >
I' stands for < idlistp >
B stands for < bexpr >
E stands for < expr >
E' stands for < exprlist >
E" stands for < exprlistp >

Productions

P --> S EOF
S --> S" S'
S' --> ; S" S' | ε
S" --> assign E to I
S" --> print ( E' )
S" --> read ( I )
S" --> while ( B ) S"
S" --> if ( B ) S" S"'
S" --> { S }
S"' --> end | else S" end
I --> ID I'
I' --> , ID I' | ε
B --> RELOP E E
E --> + ( E' ) | - E E
E --> * ( E' ) | / E E
E --> NUM | ID
E' --> E E"
E" --> , E E" | ε

Valutator

This component is based on the SDT built following the LL(1) grammar.

Translator

This component translate programs written in a simple programming language (called P language). Programs written in the P language have the .lft extension.

The translator must generate bytecode that can be executed directly by the JVM. Generating bytecode that can be directly executed by the JVM is not a simple operation, due to the complexity of the .class file (binary format). For this reason the bytecode will be generated through the use of a mnemonic language (https://en.wikipedia.org/wiki/List_of_Java_bytecode_instructions) that refers to the JVM assembly instructions and, subsequently, it can be translated into the .class format by the assembler program.

The assembler program does a 1:1 translation of the mnemonic instructions into the corresponding JVM instructions (opcode). The assembler program used in this project is called Jasmin (http://jasmin.sourceforge.net/). The translation scheme is the following:

The source code is translated by the compiler into the JVM assembler language (.j extension); after that it is translated into the .class file by Jasmin assembler program. The compiling process generates an Output.j file and the following command is used to convert it in the Output.class file:

java -jar jasmin.jar Output.j

that can be executed with the following command:

java Output

The final output should be an Output.j file, containing header, mnemonic instruction list and footer. The output is generated according to the test file given read as input in the main of the Traslator class. As an example, if the P language code to translate is the following:

read (a);
print(+(a,1))

the correspondent Output.j file will contains the following:

invokestatic Output/read()I
istore 0
goto L1
L1:
iload 0
ldc 1
iadd
invokestatic Output/print(I)V
goto L2
L2:
goto L0
L0:

Tools and languages

Java 17
IntelliJ Idea Ultimate
Jasmin
Gradle
GitHub Actions

Contributing

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Translator

Table of Contents

Project description

Lexical scanner

Parser top down

Valutator

Translator

Tools and languages

Contributing

Files

README.md

Latest commit

History

README.md

File metadata and controls

Translator

Table of Contents

Project description

Lexical scanner

Parser top down

Valutator

Translator

Tools and languages

Contributing