-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AArch64 SVE2 Vector Length Agnostic builders #317
Conversation
The first 2 commits of this are #311. Please review only the last commit here. @maxim-kuvyrkov please confirm that this is exactly what was requested. |
682c0c8
to
bc0a174
Compare
I've also confirmed that at least right now, these configurations are green. They will be silent to begin with anyway, so that we can work out any capacity issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the parallelism comment -- looks great. Thanks!
This looks good to me overall. just one minor nit inline. |
These generate code that uses the SVE2 extension, as opposed to our existing builders that only use SVE. Therefore they will run on our Graviton 4 workers only, which are Neoverse v2 cores with SVE and SVE2. The optimisation guide: https://github.com/aws/aws-graviton-getting-started/blob/main/c-c++.md Suggests -mcpu=neoverse-v2 for best performance, but here I'm using it just so we get the SVE2 feature enabled. If we used the balanced -mcpu=neoverse-512tvb, we wouldn't get SVE2 without fiddly extra flags. ``` $ ~/clang+llvm-18.1.8-aarch64-linux-gnu/bin/clang /tmp/test.c -o /dev/null -mcpu=neoverse-v2 -### <...> "-target-feature" "+sve" "-target-feature" "+sve2-bitperm" "-target-feature" "+sve2" <...> ``` I have not added vector length settings as these builders will generate scalable ("length agnostic" - LA) code. Otherwise, the builders have identical setups to the existing SVE VLA builders.
26e7ee1
to
f1a9b25
Compare
f1a9b25
to
8be6daa
Compare
CI failure is due to an unrelated change, merging. |
These generate code that uses the SVE2 extension, as opposed to our existing builders that only use SVE. Therefore they will run on our Graviton 4 workers only, which are Neoverse v2 cores with SVE and SVE2.
The optimisation guide:
https://github.com/aws/aws-graviton-getting-started/blob/main/c-c++.md
Suggests -mcpu=neoverse-v2 for best performance, but here I'm using it just so we get the SVE2 feature enabled. If we used the balanced -mcpu=neoverse-512tvb, we wouldn't get SVE2 without fiddly extra flags.
I have not added vector length settings as these builders will generate scalable ("length agnostic" - LA) code.
Otherwise, the builders have identical setups to the existing SVE VLA builders.