Ci flwadd #514
base: ci_bigblade
* Move wormhole concentrator outside pod ruche array
* merge north and south vcaches traffic extend cid to 2-bit

* clean up branch trace
* remote interrupt interface
* npc_r
* add mret encoding
* fcsr load goes to rs2_val
* interrupt csr test
* add mcsr
* interrupt trace test
* set pc_init_val by _start;
* interrupt remote test
* remote interrupt icache miss
* trace interrupt icache miss test
* CR
* change remote interrupt eva
* interrupt trace countdown
* printing interrupt taken
* interrupt trace float
* mret display; debug_p
* more test; remote interrupt ptr
* interrupt trace jump loop
* interrupt trace jump loop icache
* interrupt dual source
* interrupt trace branch loop icache
* interrupt_trace_branch_mispredict_loop
* remote load loop with interrupt

* io rtr tag client
* set width_p for io rtr tag clients

* wh return fifo
* fix

* Added reset_done port to dpi interface
* Adds missing inputs to endpoint_to_fifos
* Removed pod x/y
* Adds CUDA kernel for DMA test
* Adds missing file

Co-authored-by: Dustin Richmond <[email protected]>

* temp commit
* fix regression
* offset dmem start addr by 8 bytes for interrupt handler
* CR
* cr
* adding .dmem.interrupt section; declaring _interrupt_arr in crt.S; update hello to use interrupt arr using extern
* add comment

* Adding trace interrupt test for idiv, fdiv and imul
* Adding remote interrupt tests with multiplier and divider
* Testing multiple remote interrupts, icache misses in handler and different handler return strategies
* Adding an interrupt test regression
* Deleting duplicates within the interrupt regression suite without contaminating author history
* Using a macro for the start code
* Makefile cleanup
* More regression makefile cleanup
* Modifying macro name
* Adding npc mret test
* Deleting duplicate files and noting original author in specific files
* Adding interrupt tests to the no recurse list
* Adding a mini threading test
* Directed test exposing the need for PR #463
* Adding a readme
* Minor makefile modifications
* Adding missing test to regression

* Rewrite Victim Cache Profiler Parser (#428)
* Fixed vcache profiler bugs with re-write
* fix tile parser to get correct min cycle no. for absolute total cycles calculation when using multiple tags (#480)
* Fix header_print_p (for bigblade)

Bugs:
* Order of operations: Python binds or higher than >
* Counting of repeated tags. Previously, only the last tag was counted
* Iteration-order aware tag window

New Features:
* Atomic Misses
* Response Stalls
* More documentation
* Remove deprecated code
* Small field name modifications
* Fixed issue with mismatched tags error
* Fixed issue where TG origin/dim were confused for Device origin/dim

Co-authored-by: Emily Furst <[email protected]>

* imul mux order
* fp_exe flush minimize toggles
* add comment
* add comment 2

refactor dram hash func

* io rtr
* pod row refactor
* move bsg_tag out of pod
* remove unused port
* clean up
* move out wh ruche buffers
* move out west ruche buffers

* num_clk_ports_p for subarray
* fix x -> c
* fix
* add param in pod row

// Machine Format:
// rs1 rs2 rd opcode
// 0000000_?????_?????_111_?????_0000100
`define RV32_FLWADD_OP 7'b0000100
// Pulpino supports a post-increment instruction in their compiler
// here: https://github.com/taylor-bsg/pulp-riscv-gcc/blame/master/gcc/config/riscv/riscv.md#L3663
// this can be used as a reference for adding compiler support
// for instructions with side effects on addresses.

// Load & Store
logic is_load_op; // Op loads data from memory
logic is_store_op; // Op stores data to memory
logic is_load_op; // Op is lw or flw
should now add flwadd?
| (instruction_i.funct7 == `RV32_FCVT_S_F2I_FUN7)); // FCVT.W.S, FCVT.WU.S
end
`RV32_FLWADD_OP: begin
  decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
Presumably part of this is redundant with the fact that we are in the RV32_FLWADD_OP case statement?
Probably can just add FLWADD_OP to standard set of cases at the top of the casez statement?
  decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
end
`RV32_SYSTEM: begin
  decode_o.write_rd = (instruction_i.rd != '0); // CSRRW, CSRRS
combine with case above?
always_comb begin
  unique casez (instruction_i.op)
    `RV32_BRANCH, `RV32_STORE, `RV32_OP: begin
      decode_o.read_rs2 = 1'b1;
    end
    `RV32_FLWADD_OP: begin
combine with above?
decode_o.read_frs2 = 1'b0;
decode_o.read_frs3 = 1'b0;
decode_o.write_frd = 1'b1;
decode_o.is_fp_op = 1'b0;
comment why FLWADD is not an fp op
@@ -97,7 +97,7 @@ module lsu

   assign dmem_v_o = is_local_dmem_addr &
     (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
-    exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);
+    exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op | exe_decode_i.is_flwadd_op);
now is_load_op includes is_flwadd_op right?
@@ -152,7 +152,7 @@ module lsu

   assign remote_req_v_o = icache_miss_i |
-    ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op) & ~is_local_dmem_addr);
+    ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op | exe_decode_i.is_flwadd_op) & ~is_local_dmem_addr);
same comment as above?
 wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_rd;
-wire float_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_frd;
+wire float_remote_load_in_exe = remote_req_in_exe & (exe_r.decode.is_load_op | exe_r.decode.is_flwadd_op) & exe_r.decode.write_frd;
is_load_op now includes is_flwadd_op?
wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & ~exe_r.decode.is_flwadd_op & exe_r.decode.write_rd;
((id_r.decode.read_frs1 & (id_rs1 == exe_r.instruction.rd) & exe_r.decode.write_frd)
|(id_r.decode.read_frs2 & (id_rs2 == exe_r.instruction.rd) & exe_r.decode.write_frd)
|(id_r.decode.read_frs3 & (id_rs3 == exe_r.instruction.rd) & exe_r.decode.write_frd));
seems like the above is redundant if we include flwadd in local_load_in_exe
High-level feedback: the code is simpler if is_load includes flwadd (which is currently the case, but the code does not reflect this).
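To make the feedback above concrete, here is a minimal sketch of how the load/store expressions quoted in this review might look if the decoder sets is_load_op for flwadd. It is an illustration, not the PR's code. The struct decode_sketch_s, the module lsu_flwadd_sketch, and the _i/_o port names are hypothetical, and the signals are gathered into one module only for brevity. The field names and the int/float split are taken from the hunks and comments above.

```systemverilog
// Hypothetical, cut-down decode struct: only the fields named in this review.
typedef struct packed {
  logic is_load_op;    // ASSUMPTION: decoder sets this for lw, flw and flwadd
  logic is_store_op;
  logic is_lr_op;
  logic is_lr_aq_op;
  logic is_amo_op;
  logic is_flwadd_op;  // still needed to tell flwadd apart from a plain load
  logic write_rd;
  logic write_frd;
} decode_sketch_s;

// Hypothetical module that collects the expressions discussed above.
module lsu_flwadd_sketch
  (input  decode_sketch_s exe_decode_i
   , input  logic is_local_dmem_addr_i
   , input  logic icache_miss_i
   , input  logic remote_req_in_exe_i
   , output logic dmem_v_o
   , output logic remote_req_v_o
   , output logic int_remote_load_in_exe_o
   , output logic float_remote_load_in_exe_o
  );

  // No explicit is_flwadd_op term: is_load_op already covers it.
  assign dmem_v_o = is_local_dmem_addr_i &
    (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
     exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);

  assign remote_req_v_o = icache_miss_i |
    ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op)
     & ~is_local_dmem_addr_i);

  // flwadd writes both rd and frd, so the integer path excludes it explicitly,
  // as suggested in the review comment above.
  assign int_remote_load_in_exe_o = remote_req_in_exe_i
    & exe_decode_i.is_load_op & ~exe_decode_i.is_flwadd_op & exe_decode_i.write_rd;

  assign float_remote_load_in_exe_o = remote_req_in_exe_i
    & exe_decode_i.is_load_op & exe_decode_i.write_frd;

endmodule
```

Under that assumption, the only place that still needs is_flwadd_op is the integer remote-load check, where it keeps the rd write of flwadd from being treated as an integer load result.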