-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WASM float to int performance regression since 1.53.0 #87643
Comments
Though looking at the LLVM IR it seems like the difference is that 1.52.0 is our own rustc saturation implementation and since 1.53 it's LLVM 12's saturation implementation, which seems to be worse? |
Thanks for the heads up, this is behaving as "expected" although the expectation wasn't really thoroughly evaluated by me. As you've discovered the main difference (which you can see in that diff view with In that sense codegen is behaving as expected, and I believe at the time I diff'd the two (rustc's old codegen and LLVM's built-in intrinsic codegen) and saw they were different but assume that the difference was negligible. Have you measured the LLVM-intrinsic-generated code to have worse performance? (I see it has a few extra instructions but I'd be curious to put concrete numbers on it if possible) If LLVM has worse performance I think it'd be good to open an issue upstream with them and see if we can improve it upstream, but if it's critical and it's too difficult to land upstream then we can perhaps re-land the wasm-specific bits for rustc. (it's also be best if Safari implemented the nontrapping-fptoint extensions so we could consider turning that on by default...) |
I didn't do any benchmarks, but it seems like it suffers from the same problem as our original codegen. This is roughly the new WASM translated into Rusty pseudo code: // General ifs to do the saturation
if x.is_nan() {
0
} else if x >= 0x1.fffffep30 {
2147483647
} else if x >= -0x1p31 {
// Protection against trapping
if x.abs() < 0x1p31 {
(int)x
} else {
-2147483648
}
} else {
-2147483648
} This protection against trapping shouldn't be there, the saturation code already checked for all the edge cases (though technically a lot of these are selects, which they can't be if you remove the protection code, which may have some performance implications?). This is probably because this lowering of the saturation casts is backend independent and yet again the WASM backend doesn't know anything about it, so it still protects itself from the trapping. So yeah this definitely can be improved. I'll look into raising an upstream issue I guess. It's not critical at all, it's just something that we stumbled upon in some Twitter discussion. |
Don't you need to opt-in on the wasm side with
(of course it doesn't change that without this feature, there's a regression) |
Yeah this issue is only concerned with not having the WASM feature active. It should still produce reasonably good code then. Atm it's not as good as before 1.53. |
Both Rust and WASM introduced saturating float to int casts. However WASM originally only had trapping float to int casts. LLVMs internal float casts are speculatable, i.e. it can execute them early, assuming no trap ever happens. This however means LLVM needs to protect itself from WASM's trapping casts by emitting some more code around it. Once Rust introduced saturating float to int casts, rustc itself started emitting a bunch of code around the casts to saturate the values. This then led to both rustc and LLVM emitting this guard code around each cast. However since rustc already protected itself some dangerous values, LLVM didn't need to emit any of these additional instructions. This was eventually implemented. Check this previous issues and related PRs: #73591
However with the switch to LLVM 12, it was possible to throw out a lot of the manual codegen in rustc:
#84339 which acknowledges a regression in those casts
and then a follow up PR:
#84654 which supposedly fixes the regression
However it seems like there's still a regression: https://rust.godbolt.org/z/W18vGcv9T
cc @alexcrichton
The text was updated successfully, but these errors were encountered: