A standalone program -- written in C++ -- that reads a legal LOBO-C block from a filename supplied on the command line, or from standard input, then prints, to standard output, MIPS assembly code.
Is a custom C like language which is defined by a set of recursive rewriting rules (or productions) called Grammar. (This simpler C like language is created to reduce the complexity and facilitate construction of compiler from scratch.)
Grammar consists of the following components:
- a set of terminal symbols, which are the characters of the alphabet that appear in the strings generated by the grammar.
- a set of nonterminal symbols, which are placeholders for patterns of terminal symbols that can be generated by the nonterminal symbols.
- a set of productions, which are rules for replacing (or rewriting) nonterminal symbols (on the left side of the production) in a string with other nonterminal or terminal symbols (on the right side of the production).
- a START symbol, which is a special nonterminal symbol that appears in the initial string generated by the grammar.
The Grammar for LOBO-C is defined in this repository and can be found here. Following is an example of a valid LOBO-C code block which takes sorted list of signed integers from the user and runs binary-search on it.
signed target;
signed[10] values;
signed matchIndex;
signed i;
i = 0;
while ( i < 10 ) {
loboc >> values[i];
i++;
}
loboc >> target;
matchIndex = -1;
{
signed upperIndex;
signed lowerIndex;
signed middleIndex;
lowerIndex = 0;
upperIndex = 10 - 1;
while ( lowerIndex <= upperIndex ) {
middleIndex = ( lowerIndex + upperIndex ) / 2;
if ( values[middleIndex] < target ) {
lowerIndex = middleIndex + 1;
} else {
if ( values[middleIndex] == target ) {
matchIndex = middleIndex;
lowerIndex = upperIndex + 1;
} else {
upperIndex = middleIndex - 1;
}
}
}
}
loboc << matchIndex;
As it can be seen, the defined grammar allows us to have 'while', 'if', 'else' statements and even pointers in the LOBO-C language. See the folder 'tests' for more test cases.
Given a filename or code from standard input, the program performs all of the following operations sequentially.
-
Lexical Analysis
- Initialises a string buffer to read input.
- Converts characters to words to tokens like 'Keyword', 'Literal', 'Illegal character', 'Number', etc.
- Removes comments.
-
Syntactic Analysis (Parsing)
- Reads one token at a time from the Lexer and creates a parse tree or an abstract syntax tree according to the rules of the grammar.
- Any error in syntax is detected and reported at this stage.
-
Semantic Analysis
- Creates symbol tables for every block which keeps track of scope of all the variables inside the block.
- After parse tree with symbol tables is built, the following two operations are performed to catch and report any semantic errors like 'usage before declaration' or type errors, by walking down the parse tree.
- Type Checking
- Enforces the constraints of types.
- LOBO-C is a strongly-typed language.
- Check type-rules here.
- Constant Folding
- Recognises and evaluates constant expressions at compile time instead of runtime.
-
Code Generation
- One final parse-tree traversal is done to produce MIPS assembly code.
- Code is produced via greedy approach using only up to 3 registers without considering code optimisations.
make
- compiles and builds the source code.make realclean
- removes the object files and the executable.
-
./loboc <filename>
- reads the supplied loboc code and returns MIPS assembly code as standard output. -
spim -file <filename>
- reads assembly code and runs it on a simulation of MIPS architecture, a.k.a., SPIM.Example:
$> ./loboc tests/is5_bool.loboc > is5_bool.s $> spim -file is5_bool.s
Professor Ackley designed the specification and provided some test cases.
No known bugs. Please raise an issue in case any bugs are found.