[Phase4] Update README.md
liqwang committed Jan 14, 2024
1 parent 148c2da commit d8d6a5b
Showing 1 changed file with 123 additions and 35 deletions.
158 changes: 123 additions & 35 deletions phase4/README.md
@@ -6,51 +6,90 @@

**12111611 Ruixiang Jiang**

## How to Run and Test
1. make splc: `make`
2. auto test: `make test`

## Design
### Language
Use python to implement make splc with pyinstaller
### Procedure Call
before the function is called:
1. Store Active Variables in the stack
2. jal in the target function
3. Store ra in the stack

after the function is called:
1. get ra in the stack
2. the result is stored in the stack
3. the Active Variables are restored from the stack

### Translation processing
check if the three address code has numbers:
## Language
Instead of using the provided C starter code, we implement the compiler in **Python**, which makes development easier.

We use **PyInstaller** to package `splc.py` into the executable `bin/splc`; the provided judging environment already has `pyinstaller` installed.

```makefile
# Makefile
splc: splc.py
	pyinstaller --log-level WARN --distpath bin -F splc.py
```



## Translation

`translate()` is the core function that translates TAC into MIPS32 assembly code; it uses regular expressions to identify the pattern of each TAC

```python
def translate(tac: str) -> "list[str]":
    id = '[^\\d#*]\\w*'
    num = '#\\d+'
    if re.fullmatch(f'{id} := {num}', tac):  # x := #k
        x, k = tac.split(' := #')
        command.append(f'li {reg(x)}, {k}')
    if re.fullmatch(f'{id} := {id}', tac):  # x := y
        x, y = tac.split(' := ')
        command.append(f'move {reg(x)}, {reg(y)}')
    ...
```

Besides the basic translation scheme, we also do the following work:

Check if the three address code has numbers:
1. Support the return value and input as numbers
2. Support multiplication and division containing numbers
3. The positions of registers and numbers can be switched arbitrarily (x := y - #k or x := #y - k is OK)
3. The positions of registers and numbers can be switched arbitrarily (`x := y - #k` or `x := #y - k` is OK)
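As an illustration of operand positions being interchangeable, here is a minimal sketch for subtraction only. It is not taken from `splc.py`: the function name `translate_sub`, the scratch register `$t9`, the stand-in `placeholder_reg`, and the optional minus sign in the number pattern are all assumptions made for the example.

```python
import re

IDENT = r'[^\d#*]\w*'   # same identifier pattern as used in translate()
NUM = r'#-?\d+'         # assumption: constants may carry a minus sign
SCRATCH = '$t9'         # hypothetical scratch register for this sketch


def placeholder_reg(var: str) -> str:
    """Stand-in for the real reg(): just prefixes the name with '$'."""
    return f'${var}'


def translate_sub(tac: str, reg=placeholder_reg) -> "list[str]":
    """Translate `x := a - b`, where a or b may be a constant."""
    m = re.fullmatch(rf'({IDENT}) := ({IDENT}|{NUM}) - ({IDENT}|{NUM})', tac)
    if m is None:
        return []
    x, a, b = m.groups()
    a_is_num, b_is_num = a.startswith('#'), b.startswith('#')
    if not a_is_num and not b_is_num:      # x := y - z
        return [f'sub {reg(x)}, {reg(a)}, {reg(b)}']
    if b_is_num and not a_is_num:          # x := y - #k (k assumed to fit 16 bits)
        return [f'addi {reg(x)}, {reg(a)}, {-int(b[1:])}']
    if a_is_num and not b_is_num:          # x := #k - y
        return [f'li {SCRATCH}, {int(a[1:])}',
                f'sub {reg(x)}, {SCRATCH}, {reg(b)}']
    return []                              # both constants: not handled here
```

Subtraction is the interesting case because it is not commutative: a constant on the right can be folded into a negated `addi` immediate, while a constant on the left has to be loaded into a register first.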

More efficient:
**Efficiency**:

1. If there are no unknowns in addition, subtraction, multiplication, and division (that is, all numbers), the result is directly assigned to the target register
2. For commands that use a stack Variable, the variable in the stack is moved to the register using `lw`. Similarly, the variable in the stack can be identified by `sw` to the corresponding position
- If every operand of an addition, subtraction, multiplication, or division is a constant, the result is computed at compile time and assigned directly to the target register

More secure:
- For commands that use a stack variable, the variable in the stack is moved to the register using `lw`.

In mips, an addi number greater than 32767 or less than -32767 causes an arithmetic overflow, which we detected and addressed (see test15).
Similarly, a value is written back to its position in the stack using `sw`
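The all-constant case can be sketched as follows. This is an illustrative example rather than the code in `splc.py`: the name `fold_constants`, the `$`-prefixed placeholder register, the acceptance of negative constants, and C-style truncating division are assumptions.

```python
import re


def fold_constants(tac: str) -> "list[str] | None":
    """Fold `x := #a op #b` into a single `li`; return None if it does not apply."""
    m = re.fullmatch(r'(\w+) := #(-?\d+) ([-+*/]) #(-?\d+)', tac)
    if m is None:
        return None
    x, op = m.group(1), m.group(3)
    a, b = int(m.group(2)), int(m.group(4))
    if op == '+':
        value = a + b
    elif op == '-':
        value = a - b
    elif op == '*':
        value = a * b
    else:
        if b == 0:
            return None  # leave division by zero to the generated code
        # Assumption: truncate toward zero, unlike Python's floor division.
        q = abs(a) // abs(b)
        value = q if (a < 0) == (b < 0) else -q
    return [f'li ${x}, {value}']  # `$x` is a placeholder register name
```

For example, `fold_constants('x := #3 + #4')` yields a single `li $x, 7` instead of two loads and an `add`.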

Easier to read:
**Security**:

At the beginning, jal directly jumps to main, obtains the return value of main function and then ends, without adjusting the position of the three-address code function
If an integer is outside the range $[-2^{15}, 2^{15}-1]$, it cannot be encoded directly as a 16-bit MIPS32 immediate. We therefore add translation logic like the following to avoid an Arithmetic Overflow error in the SPIM simulator

### Variables
We use specific registers `s6` for reading and writing and a portion of the stack space for storing common variables
```python
def translate(tac: str) -> "list[str]":
    ...
    if re.fullmatch(fr'{id} := {num} \+ {id}', tac):  # x := #y + k
        x, y, k = re.split(r' := #| \+ ', tac)
        n = int(y)
        while n > 32767:
            command.append(f'addi {reg(k)}, {reg(k)}, {32767}')
            n = n - 32767
        command.append(f'addi {reg(x)}, {reg(k)}, {n}')
```
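A possible variant of the same idea, shown only as a sketch: it also covers constants below the lower bound and accumulates into the destination register, so the source register keeps its value. The helper name `add_immediate` and its signature are hypothetical and do not appear in `splc.py`.

```python
IMM_MAX = 32767    # 2**15 - 1, largest signed 16-bit immediate
IMM_MIN = -32768   # -2**15, smallest signed 16-bit immediate


def add_immediate(dst: str, src: str, n: int) -> "list[str]":
    """Emit `dst = src + n` using only immediates that fit in 16 bits."""
    out = []
    cur = src                      # register holding the running sum
    while n > IMM_MAX:
        out.append(f'addi {dst}, {cur}, {IMM_MAX}')
        n -= IMM_MAX
        cur = dst                  # later steps accumulate in dst
    while n < IMM_MIN:
        out.append(f'addi {dst}, {cur}, {IMM_MIN}')
        n -= IMM_MIN
        cur = dst
    out.append(f'addi {dst}, {cur}, {n}')
    return out
```

For instance, `add_immediate('$t0', '$t1', 70000)` emits two `addi` steps of 32767 followed by one of 4466. The number of instructions grows with the size of the constant, so for very large values loading the constant with `li` and using `add` may be preferable.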

We have tested the generated code on several platforms and online simulators; it runs on MARS and on various versions of SPIM

#### Active Variables

Before function call, we only need to store the active variables before invoking the function
## Procedure Call

Before the function is called:

1. Store Active Variables in the stack
2. `jal` to the target function
3. Store `$ra` in the stack

After the function is called:

1. Load `$ra` from the stack
2. The result is stored in the stack
3. The Active Variables are restored from the stack
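The save/restore idea can be sketched as a caller-side sequence. This is a simplified illustration, not the layout used by `splc.py`: the function name `emit_call`, the frame layout, saving `$ra` before the `jal` rather than after it, and returning the result in `$v0` instead of on the stack are all assumptions.

```python
def emit_call(callee: str, active_regs: "list[str]", result_reg: str) -> "list[str]":
    """Emit MIPS that saves $ra and the active registers, calls, then restores."""
    frame = 4 * (len(active_regs) + 1)           # one extra word for $ra
    out = [f'addi $sp, $sp, {-frame}',           # grow the stack
           'sw $ra, 0($sp)']                     # save the return address
    for i, r in enumerate(active_regs):
        out.append(f'sw {r}, {4 * (i + 1)}($sp)')   # save each active variable
    out.append(f'jal {callee}')
    for i, r in enumerate(active_regs):
        out.append(f'lw {r}, {4 * (i + 1)}($sp)')   # restore active variables
    out += ['lw $ra, 0($sp)',                    # restore the return address
            f'addi $sp, $sp, {frame}',           # shrink the stack
            f'move {result_reg}, $v0']           # assumption: result arrives in $v0
    return out
```

Copying the result out of `$v0` after the restore loop means the destination keeps the returned value even when it is itself one of the saved registers.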



### Active Variables

Before a function invocation, we only need to store the active variables

The active variables are stored in the set `active_vars`, which is easy to maintain

@@ -60,15 +99,64 @@ The set will be updated only in three cases:
- `PARAM x`
- `READ x`

We only need to add `x` to `active_vars` while translating the TACs
We only need to add `x` to `active_vars` while translating the TAC

When we get TACs like `IF x [op] y GOTO label`, we **don't** need to add `x` and `y` to `active_vars`, because `x` and `y` must be already declared in the form of the above three cases

Finally, clear the set after getting a new function TAC `FUNCTION f :`

### Some flexibility:

If you need more than one register for the output of floating point numbers, you can adjust the special register by adjusting the number of pre-stored registers, and move the registers used for all operations back (The normally used register $11 can easily be changed to $12)
```python
active_vars = set()  # active variables of the current function; they must be stored to memory before invoking a subfunction

def translate(tac: str) -> "list[str]":
    if re.fullmatch(f'{id} := .+', tac):  # x := ...
        x, _ = tac.split(' := ')
        active_vars.add(x)
    if re.fullmatch(f'PARAM {id}', tac):  # PARAM x
        x = tac.split(' ')[1]
        active_vars.add(x)
    if re.fullmatch(f'READ {id}', tac):  # READ x
        x = tac.split(' ')[1]
        active_vars.add(x)
    if re.fullmatch(f'FUNCTION {id} :', tac):  # FUNCTION f :
        active_vars.clear()  # start the new function's active variables (cleared in place; rebinding would make the name local)
```



## Register Allocation

For register allocation, we use a trivial strategy: if no register is available, the variable is stored on the stack

```python
# `var` can be a variable or an integer
def reg(var: str) -> str:
    try:
        if var == '0':
            return '$0'
        int(var)  # check whether `var` is an integer
        return var  # return the integer directly
    except ValueError:  # variable case
        loc = find_reg(var)
        if loc:
            return loc
        # If no register is available, the variable is stored on the stack
        stack.append(var)
        return f'{(stack.index(var) + max_var_num) << 2}($sp)'

def find_reg(var: str) -> "str | None":
    if var in register_table:
        return f'${register_table.index(var) + save_reg}'
    elif var in stack:
        return f'{(stack.index(var) + max_var_num) << 2}($sp)'
    else:  # no available register
        return None
```



## Flexibility

If we need more than one register for the output of floating point numbers, we can adjust the special register by adjusting the number of pre-stored registers, and move the registers used for all operations back (the normally used register `$11` can easily be changed to `$12`)


