Add tile size abstraction in tensor ptr TTGIR to LLVM lowering #3008

Open
alexbaden opened this issue Dec 14, 2024 · 0 comments
Assignees
Labels
code quality enhancement New feature or request

Comments

@alexbaden
Contributor

The code that lowers tensor pointer loads to TritonGen 2D block loads takes a parameterization of the 2D block tile size and permutes it based on properties of the DPAS layout and the inputs. From that it generates the final set of 2D block loads, shuffles the outputs, and packs/unpacks the LLVM registers appropriately for the subsequent users. #3000 introduces some code duplication, because the 2D block load for the DPAS layout shares much of this logic but differs in the details. To resolve this duplication and make the code easier to read, I am introducing a struct to keep track of the 2D block load parameters. The existing code in LoadStoreOpToLLVM.cpp will permute the struct instead of directly modifying the MLIR values. This will further reduce duplication between the DPAS layout and DotDpas layout, and it will let us easily dump debug information about the loads being generated. That last point matters because the TritonGen loads are immediately lowered to SPIR-V function calls and are never written out in any intermediate IR.
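As a rough illustration of the idea, here is a minimal standalone sketch of what such a parameter struct could look like. Everything in it is an assumption made for illustration: the struct name `BlockLoadParams`, its fields, and the helpers `transposed`, `bytesPerLoad`, and `describe` are hypothetical and are not taken from the backend or from #3000. It only shows the pattern of permuting plain values and then printing them, before any MLIR values are built.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical bundle of 2D block load parameters. Field names are
// illustrative only and do not mirror the actual lowering code.
struct BlockLoadParams {
  unsigned tileHeight;     // rows in one block
  unsigned tileWidth;      // columns in one block, in elements
  unsigned numBlocks;      // blocks fetched by a single load
  unsigned elemSizeInBits; // width of one element
  bool transpose;          // whether the load transposes the tile
};

// Permute the parameters instead of touching IR values: swap the tile
// dimensions and flip the transpose flag, returning a new struct.
inline BlockLoadParams transposed(BlockLoadParams p) {
  unsigned oldHeight = p.tileHeight;
  p.tileHeight = p.tileWidth;
  p.tileWidth = oldHeight;
  p.transpose = !p.transpose;
  return p;
}

// Total payload of one load in bytes; assumes byte-aligned element sizes.
inline unsigned bytesPerLoad(const BlockLoadParams &p) {
  assert(p.elemSizeInBits % 8 == 0 && "sketch assumes whole-byte elements");
  return p.tileHeight * p.tileWidth * p.numBlocks * (p.elemSizeInBits / 8);
}

// Render the parameters as text so they can be dumped for debugging.
inline std::string describe(const BlockLoadParams &p) {
  std::ostringstream os;
  os << "tile=" << p.tileHeight << "x" << p.tileWidth
     << " blocks=" << p.numBlocks
     << " elemBits=" << p.elemSizeInBits
     << " transpose=" << (p.transpose ? "true" : "false");
  return os.str();
}
```

With a layout like this, each layout-specific path would only need to decide which permutations to apply, and the textual dump would stand in for the intermediate IR that the TritonGen loads never produce.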

@alexbaden alexbaden self-assigned this Dec 14, 2024
@vlad-penkin vlad-penkin added the enhancement New feature or request label Dec 16, 2024

2 participants