diff --git a/phase4/README.md b/phase4/README.md
index 30348e4..d9fd8fd 100644
--- a/phase4/README.md
+++ b/phase4/README.md
@@ -6,51 +6,90 @@
 
 **12111611 Ruixiang Jiang**
 
-## How to Run and Test
-1. make splc: `make`
-2. auto test: `make test`
-
-## Design
-### Language
-Use python to implement make splc with pyinstaller
-### Procedure Call
-before the function is called:
-1. Store Active Variables in the stack
-2. jal in the target function
-3. Store ra in the stack
-after the function is called:
-1. get ra in the stack
-2. the result is stored in the stack
-3. the Active Variables are restored from the stack
-### Translation processing
-check if the three address code has numbers:
+## Language
+Instead of the provided C starter code, we implement the compiler in **Python**, which is easier to develop with
+
+We use **PyInstaller** to package `splc.py` into the executable `bin/splc`; the provided judging environment already has `pyinstaller` installed
+
+```makefile
+#Makefile
+splc: splc.py
+	pyinstaller --log-level WARN --distpath bin -F splc.py
+```
+
+
+
+## Translation
+
+`translate()` is the core function that translates TAC into MIPS32 assembly code; it uses regular expressions to identify the pattern of each TAC instruction
+
+```python
+def translate(tac: str) -> "list[str]":
+    id = '[^\\d#*]\\w*'
+    num = '#\\d+'
+    if re.fullmatch(f'{id} := {num}', tac): # x := #k
+        x, k = tac.split(' := #')
+        command.append(f'li {reg(x)}, {k}')
+    if re.fullmatch(f'{id} := {id}', tac): # x := y
+        x, y = tac.split(' := ')
+        command.append(f'move {reg(x)}, {reg(y)}')
+    ...
+```
+
+Besides the basic translation scheme, we also do the following work:
+
+Handle three-address code that contains numeric constants:
 
 1. Support the return value and input as numbers
 2. Support multiplication and division containing numbers
-3. The positions of registers and numbers can be switched arbitrarily (x := y - #k or x := #y - k is OK)
+3. The positions of registers and numbers can be switched arbitrarily (`x := y - #k` or `x := #y - k` is OK)
 
-More efficient:
+**Efficiency**:
 
-1. If there are no unknowns in addition, subtraction, multiplication, and division (that is, all numbers), the result is directly assigned to the target register
-2. For commands that use a stack Variable, the variable in the stack is moved to the register using `lw`. Similarly, the variable in the stack can be identified by `sw` to the corresponding position
+- If both operands of an addition, subtraction, multiplication, or division are numbers (no unknowns), the result is assigned directly to the target register
 
-More secure:
+- For instructions that use a stack variable, the variable is loaded from the stack into a register with `lw`.
 
-In mips, an addi number greater than 32767 or less than -32767 causes an arithmetic overflow, which we detected and addressed (see test15).
+  Similarly, the variable is written back to its stack slot with `sw`
 
-Easier to read:
+**Security**:
 
-At the beginning, jal directly jumps to main, obtains the return value of main function and then ends, without adjusting the position of the three-address code function
+If an integer is outside the signed 16-bit range $[-2^{15}, 2^{15}-1]$, it cannot be encoded directly as a MIPS32 immediate. We therefore add extra translation logic, shown below, to avoid an arithmetic overflow error in the SPIM simulator
 
-### Variables
-We use specific registers `s6` for reading and writing and a portion of the stack space for storing common variables
+```python
+def translate(tac: str) -> "list[str]":
+    ...
+    if re.fullmatch(fr'{id} := {num} \+ {id}', tac): # x := #y + k
+        x, y, k = re.split(r' := #| \+ ', tac)
+        n = int(y)
+        while n > 32767:
+            command.append(f'addi {reg(k)}, {reg(k)}, {32767}')
+            n = n - 32767
+        command.append(f'addi {reg(x)}, {reg(k)}, {n}')
+```
 
-Have tried it on multiple platforms, websites, and can run on Mars and various versions of spim
-#### Active Variables
-Before function call, we only need to store the active variables before invoking the function
+
+
+## Procedure Call
+
+Before the function is called:
+
+1. Store the active variables on the stack
+2. `jal` to the target function
+3. Store `$ra` on the stack
+
+After the function is called:
+
+1. Restore `$ra` from the stack
+2. The result is stored on the stack
+3. The active variables are restored from the stack
+
+
+
+### Active Variables
+
+Before a function invocation, we only need to store the active variables
 
 The active variables are stored in the set `active_vars`, which is easy to maintain
 
@@ -60,15 +99,64 @@ The set will be updated only in three cases:
 - `PARAM x`
 - `READ x`
 
-We only need to add `x` to `active_vars` while translating the TACs
+We only need to add `x` to `active_vars` while translating the TAC
 
 When we get TACs like `IF x [op] y GOTO label`, we **don't** need to add `x` and `y` to `active_vars`, because `x` and `y` must be already declared in the form of the above three cases
 
 Finally, clear the set after getting a new function TAC `FUNCTION f :`
 
-### Some flexibility:
-
-If you need more than one register for the output of floating point numbers, you can adjust the special register by adjusting the number of pre-stored registers, and move the registers used for all operations back (The normally used register $11 can easily be changed to $12)
+```python
+active_vars = set() # active variables of the current function; they must be saved to memory before invoking a subfunction
+def translate(tac: str) -> "list[str]":
+    if re.fullmatch(f'{id} := .+', tac): # x := ...
+        x, _ = tac.split(' := ')
+        active_vars.add(x)
+    if re.fullmatch(f'PARAM {id}', tac): # PARAM x
+        x = tac.split(' ')[1]
+        active_vars.add(x)
+    if re.fullmatch(f'READ {id}', tac): # READ x
+        x = tac.split(' ')[1]
+        active_vars.add(x)
+    if re.fullmatch(f'FUNCTION {id} :', tac): # FUNCTION f :
+        active_vars.clear() # start the new function's active variables (clear in place; rebinding would make the name local)
+```
+
+
+
+## Register Allocation
+
+For register allocation, we use a trivial strategy. If no register is available, the variable is stored on the stack
+
+```python
+# `var` can be a variable or an integer
+def reg(var: str) -> str:
+    try:
+        if var == '0':
+            return '$0'
+        int(var) # check whether `var` is an integer
+        return var # return the integer directly
+    except ValueError: # variable case
+        reg = find_reg(var)
+        if reg:
+            return reg
+        # If no register is available, the variable is stored on the stack
+        stack.append(var)
+        return f'{(stack.index(var) + max_var_num) << 2}($sp)'
+
+def find_reg(var: str) -> str:
+    if var in register_table:
+        return f'${register_table.index(var) + save_reg}'
+    elif var in stack:
+        return f'{(stack.index(var) + max_var_num) << 2}($sp)'
+    else: # No available register
+        return None
+```
+
+
+
+## Flexibility
+
+If we need more than one register for the output of floating-point numbers, we can adjust the special register by adjusting the number of pre-stored registers, and move the registers used for all operations back (the normally used register `$11` can easily be changed to `$12`)
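
A note on the **Security** section of the README in this diff: the idea of splitting an out-of-range constant into several `addi` instructions can be sketched independently of the project's code. The following is a minimal sketch under stated assumptions, not the authors' implementation. `split_immediate` and `emit_add_const` are hypothetical helper names, and the register names in the usage note are placeholders. It assumes the signed 16-bit immediate range of MIPS32 `addi`, also covers negative constants, and leaves the source register unmodified.

```python
# Signed 16-bit immediate range assumed for MIPS32 `addi`.
IMM_MIN, IMM_MAX = -(1 << 15), (1 << 15) - 1


def split_immediate(n: int) -> "list[int]":
    """Split n into chunks that each fit the immediate range and sum to n."""
    chunks = []
    while n > IMM_MAX:          # too large: peel off the largest positive chunk
        chunks.append(IMM_MAX)
        n -= IMM_MAX
    while n < IMM_MIN:          # too small: peel off the largest negative chunk
        chunks.append(IMM_MIN)
        n -= IMM_MIN
    chunks.append(n)            # the remainder now fits in one immediate
    return chunks


def emit_add_const(dst: str, src: str, n: int) -> "list[str]":
    """Emit `addi` lines computing dst = src + n without modifying src.

    Hypothetical helper: dst and src are register names given as plain strings.
    """
    out = []
    for chunk in split_immediate(n):
        out.append(f'addi {dst}, {src}, {chunk}')
        src = dst               # later chunks accumulate in dst
    return out
```

For example, `emit_add_const('$t0', '$t1', 70000)` produces three `addi` instructions whose immediates (32767, 32767, 4466) sum to 70000. Many backends instead load the constant with the `li` pseudo-instruction and use a register-register `add`; the sketch only illustrates the chunking approach.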