Skip to content

Commit

Permalink
fix(LSQ): modify misaligned forward fault detection (#4038)
Browse files Browse the repository at this point in the history
Previously, I used an inappropriate way for another misalign to trigger
a `forward fault`:


https://github.com/OpenXiangShan/XiangShan/blob/38d0d7c5a34a23dfdb58a3cb2737c3cfddb3ec9d/src/main/scala/xiangshan/mem/lsqueue/StoreQueue.scala#L684-L711

This would cause the `BlockSqIdx` passed to `LoadQueueReplay` to use the
`sqIdx` from `uop` instead of the `sqIdx` with the unalign flag bit:

https://github.com/OpenXiangShan/XiangShan/blob/38d0d7c5a34a23dfdb58a3cb2737c3cfddb3ec9d/src/main/scala/xiangshan/mem/lsqueue/StoreQueue.scala#L776-L782

**This leads to a possible stuck in `LoadQueueReplay`.**

And to resolve the stuck, we incorrectly introduced this
Commit(af757d1).

This Commit(af757d1) causes
`BlockSqIdx` to unblock without `DataValid`.
This leads to certain performance issues.

This revision fixes the inappropriate `forward fault` triggering method
and reverses the Commit(af757d1).

**This should bring performance back up again.**
### Apologies for my mistake.
  • Loading branch information
Anzooooo authored Dec 16, 2024
1 parent 4fb7cc1 commit 909ea13
Show file tree
Hide file tree
Showing 2 changed files with 8 additions and 11 deletions.
4 changes: 2 additions & 2 deletions src/main/scala/xiangshan/mem/lsqueue/LoadQueueReplay.scala
Original file line number Diff line number Diff line change
Expand Up @@ -305,8 +305,8 @@ class LoadQueueReplay(implicit p: Parameters) extends XSModule
for (i <- 0 until LoadQueueReplaySize) {
// dequeue
// FIXME: store*Ptr is not accurate
dataNotBlockVec(i) := isNotBefore(io.stDataReadySqPtr, blockSqIdx(i)) || stDataReadyVec(blockSqIdx(i).value) || io.sqEmpty // for better timing
addrNotBlockVec(i) := isNotBefore(io.stAddrReadySqPtr, blockSqIdx(i)) || !strict(i) && stAddrReadyVec(blockSqIdx(i).value) || io.sqEmpty // for better timing
dataNotBlockVec(i) := isAfter(io.stDataReadySqPtr, blockSqIdx(i)) || stDataReadyVec(blockSqIdx(i).value) || io.sqEmpty // for better timing
addrNotBlockVec(i) := isAfter(io.stAddrReadySqPtr, blockSqIdx(i)) || !strict(i) && stAddrReadyVec(blockSqIdx(i).value) || io.sqEmpty // for better timing
// store address execute
storeAddrInSameCycleVec(i) := VecInit((0 until StorePipelineWidth).map(w => {
io.storeAddrIn(w).valid &&
Expand Down
15 changes: 6 additions & 9 deletions src/main/scala/xiangshan/mem/lsqueue/StoreQueue.scala
Original file line number Diff line number Diff line change
Expand Up @@ -674,19 +674,16 @@ class StoreQueue(implicit p: Parameters) extends XSModule
// Forward result will be generated 1 cycle later (load_s2)
io.forward(i).forwardMask := dataModule.io.forwardMask(i)
io.forward(i).forwardData := dataModule.io.forwardData(i)

//TODO If the previous store appears out of alignment, then simply FF, this is a very unreasonable way to do it.
//TODO But for the time being, this is the way to ensure correctness. Such a suitable opportunity to support unaligned forward.
// If addr match, data not ready, mark it as dataInvalid
// load_s1: generate dataInvalid in load_s1 to set fastUop
val dataInvalidMask1 = (addrValidVec.asUInt & ~dataValidVec.asUInt & vaddrModule.io.forwardMmask(i).asUInt & forwardMask1.asUInt)
val dataInvalidMask2 = (addrValidVec.asUInt & ~dataValidVec.asUInt & vaddrModule.io.forwardMmask(i).asUInt & forwardMask2.asUInt)
val dataInvalidMask1 = ((addrValidVec.asUInt & ~dataValidVec.asUInt & vaddrModule.io.forwardMmask(i).asUInt) | unaligned.asUInt & allocated.asUInt) & forwardMask1.asUInt
val dataInvalidMask2 = ((addrValidVec.asUInt & ~dataValidVec.asUInt & vaddrModule.io.forwardMmask(i).asUInt) | unaligned.asUInt & allocated.asUInt) & forwardMask2.asUInt
val dataInvalidMask = dataInvalidMask1 | dataInvalidMask2
io.forward(i).dataInvalidFast := dataInvalidMask.orR

//TODO If the previous store appears out of alignment, then simply FF, this is a very unreasonable way to do it.
//TODO But for the time being, this is the way to ensure correctness. Such a suitable opportunity to support unaligned forward.
val unalignedMask1 = unaligned.asUInt & forwardMask1.asUInt & allocated.asUInt
val unalignedMask2 = unaligned.asUInt & forwardMask2.asUInt & allocated.asUInt
val forwardPreWithUnaligned = (unalignedMask1 | unalignedMask2).asUInt.orR

// make chisel happy
val dataInvalidMask1Reg = Wire(UInt(StoreQueueSize.W))
dataInvalidMask1Reg := RegNext(dataInvalidMask1)
Expand All @@ -708,7 +705,7 @@ class StoreQueue(implicit p: Parameters) extends XSModule
val addrInvalidMaskReg = addrInvalidMask1Reg | addrInvalidMask2Reg

// load_s2
io.forward(i).dataInvalid := RegNext(io.forward(i).dataInvalidFast) || RegNext(forwardPreWithUnaligned)
io.forward(i).dataInvalid := RegNext(io.forward(i).dataInvalidFast)
// check if vaddr forward mismatched
io.forward(i).matchInvalid := vaddrMatchFailed

Expand Down

0 comments on commit 909ea13

Please sign in to comment.