Context
Currently fsqrt.d is one of the slowest instructions that can be called from userspace: it is roughly 3x slower than other floating-point instructions, so there is room to improve it. For dapps running untrusted RISC-V code, this instruction could be abused to intentionally slow down dapp validation.
Measurements
I added the speed of other instructions as a reference. fsqrt.d also seems to take a large number of microarchitecture cycles.
EDIT: It seems fdiv.d runs a loop of 128 iterations in uarch due to our 128-bit implementation; we could optimize that as well.
Possible solutions
Our current implementation uses Newton's method to find the square root, with many iterations. "Berkeley SoftFloat" seems to get away without for loops by using a fast inverse square root, possibly inspired by Quake's famous fast inverse square root. We could investigate how this is done; removing for loops would be ideal for running in the microarchitecture.
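To make the idea concrete, here is a minimal sketch of a loop-free square root built on a Quake-style inverse square root estimate. It is not the Berkeley SoftFloat code (which, as far as I can tell, works on integers with a lookup table) and not our implementation; the function name `approx_sqrt`, the choice of four refinement steps, and the use of host floating point are assumptions made only for illustration. It handles positive, normal, finite inputs only.

```python
import struct

# Widely cited magic constant for the 64-bit fast inverse square root trick.
MAGIC_F64 = 0x5FE6EB50C7B537A9


def approx_sqrt(x: float) -> float:
    """Loop-free sqrt(x) for positive, normal, finite x (illustrative only)."""
    # Reinterpret the double as a 64-bit unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    # Initial estimate of 1/sqrt(x): halve and negate the exponent via integer math.
    guess_bits = MAGIC_F64 - (bits >> 1)
    y = struct.unpack("<d", struct.pack("<Q", guess_bits))[0]
    # Fixed number of Newton-Raphson steps for 1/sqrt(x), unrolled by hand.
    # Each step roughly squares the relative error, so four steps suffice here.
    half_x = 0.5 * x
    y = y * (1.5 - half_x * y * y)
    y = y * (1.5 - half_x * y * y)
    y = y * (1.5 - half_x * y * y)
    y = y * (1.5 - half_x * y * y)
    # sqrt(x) = x * (1/sqrt(x))
    return x * y
```

The result is close to the true square root but not guaranteed to be correctly rounded, which IEEE 754 requires for square root; a real replacement for fsqrt.d would still need a final correction step, plus handling for zero, subnormals, negative inputs, infinities and NaN.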