Decompose large simdblockwrite to smaller simdblockwrites #2227

Merged
etiotto merged 14 commits into llvm-target from whitneywhtsang/splitsimdblockwrite
Sep 13, 2024

Conversation

whitneywhtsang
Contributor

Similar to #2193, but for simdblockwrite.
Decompose a simdblockwrite whose vector size is greater than 8 into multiple simdblockwrites of vector size 8.
For example, a <64xi16> simdblockwrite is not supported by the OpenCL C builtins, so it is decomposed into 8 x <8xi16>.
Also restrict TritonGEN::SIMDBlockWriteOp to accept only vector types that are allowed by the OpenCL C builtins.
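
The splitting arithmetic described above can be sketched as follows. This is only an illustration, not the code from this PR: the function name `decompose_block_write`, the constant `MAX_VEC`, and the requirement that a large vector be a multiple of 8 are assumptions made for the example. It models only how the value vector is sliced, not how the destination pointer is advanced for each smaller write.

```python
from typing import List, Tuple

# Assumption taken from the PR description: 8 is the largest vector size
# that is written directly; anything larger gets split.
MAX_VEC = 8


def decompose_block_write(num_elems: int, max_vec: int = MAX_VEC) -> List[Tuple[int, int]]:
    """Return (element_offset, width) slices that cover a vector of num_elems.

    Hypothetical helper: each returned slice stands for one smaller
    simdblockwrite of `width` elements starting at `element_offset`
    within the original value vector.
    """
    if num_elems <= 0:
        raise ValueError("vector size must be positive")
    if num_elems <= max_vec:
        # Small enough already: a single write, no decomposition.
        return [(0, num_elems)]
    if num_elems % max_vec != 0:
        # Assumption for this sketch: only exact multiples are split.
        raise ValueError("vector size must be a multiple of max_vec")
    return [(offset, max_vec) for offset in range(0, num_elems, max_vec)]


# The <64xi16> case from the description becomes 8 slices of 8 elements.
chunks = decompose_block_write(64)
```

For the `<64xi16>` example, `chunks` holds eight slices of width 8 starting at element offsets 0, 8, 16, and so on up to 56.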

@whitneywhtsang whitneywhtsang self-assigned this Sep 13, 2024
@whitneywhtsang whitneywhtsang linked an issue Sep 13, 2024 that may be closed by this pull request
@whitneywhtsang
Contributor Author

whitneywhtsang commented Sep 13, 2024

CIs:

=> Results within the margin of error.

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/splitsimdblockwrite branch 4 times, most recently from beed1bd to 128861d Compare September 13, 2024 04:10
@whitneywhtsang whitneywhtsang marked this pull request as ready for review September 13, 2024 04:11
Contributor

@victor-eds victor-eds left a comment

Looks good! Just a minor nit.

Base automatically changed from whitneywhtsang/splitsimdblock to llvm-target September 13, 2024 16:14
@etiotto etiotto merged commit 3fc58df into llvm-target Sep 13, 2024
4 checks passed
@etiotto etiotto deleted the whitneywhtsang/splitsimdblockwrite branch September 13, 2024 17:31
Development

Successfully merging this pull request may close these issues.

Decompose large simdblockwrite to smaller simdblockwrites
3 participants