The code that lowers a tensor pointer load to TritonGen 2D block loads starts from a parameterization of the 2D block tile size and permutes it according to properties of the DPAS layout and the inputs. From the permuted parameters it generates the final set of 2D block loads, shuffles the outputs, and packs/unpacks the LLVM registers as required by the subsequent users. #3000 introduces some code duplication, because the 2D block load for the DPAS layout is similar to the existing lowering in many ways but also differs from it substantially.

To resolve this duplication, I am introducing a struct that keeps track of the 2D block load parameters. The existing code in `LoadStoreOpToLLVM.cpp` will permute the struct instead of directly modifying the MLIR values. This should make the code easier to read and further reduce duplication between the DPAS layout and the DotDpas layout. It will also let us easily dump debug information about the loads being generated, which is otherwise hard to inspect because the TritonGen loads are immediately lowered to SPIR-V function calls and never written to any intermediate IR.
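As a rough illustration of the idea, here is a minimal sketch of such a parameter struct. It assumes the load can be described by a tile shape, an element width, a block count, and a transpose flag. The struct name `BlockLoad2DParams`, its fields, and its helper methods are all hypothetical and are not taken from `LoadStoreOpToLLVM.cpp`. Each permutation returns a modified copy, and `str()` gives a one-line dump for debugging.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical parameter bundle for one 2D block load. Field and method
// names are illustrative only; they are not the names used in the real code.
struct BlockLoad2DParams {
  unsigned tileHeight;     // rows per loaded tile
  unsigned tileWidth;      // columns per loaded tile, in elements
  unsigned elemSizeInBits; // element width in bits
  unsigned vBlocks;        // number of tiles fetched by a single load
  bool transpose;          // whether the load transposes the tile

  // Return a copy with the tile dimensions swapped and the flag toggled.
  BlockLoad2DParams transposed() const {
    BlockLoad2DParams p = *this;
    std::swap(p.tileHeight, p.tileWidth);
    p.transpose = !transpose;
    return p;
  }

  // Return a copy whose tile height is capped at maxHeight.
  BlockLoad2DParams withTileHeightLimit(unsigned maxHeight) const {
    assert(maxHeight > 0 && "tile height limit must be positive");
    BlockLoad2DParams p = *this;
    p.tileHeight = std::min(tileHeight, maxHeight);
    return p;
  }

  // One-line textual dump of the parameters, for debug output.
  std::string str() const {
    std::ostringstream os;
    os << "tile=" << tileHeight << "x" << tileWidth
       << " elemBits=" << elemSizeInBits << " vBlocks=" << vBlocks
       << " transpose=" << (transpose ? "true" : "false");
    return os.str();
  }
};
```

With this shape, the lowering code would derive a final `BlockLoad2DParams` through a chain of such permutations and only then materialize MLIR values from it, so the parameters can be printed at any step.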