Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[XPU][TritonGPUToLLVM] Avoid bank conflicts in sub-group transposes #2769

Merged

Conversation

victor-eds
Copy link
Contributor

  • Store the whole matrix using SIMD block stores for each row leaving
    a single garbage item at the end of the row so each row has
    sub_group_size + 1 elements
  • Load each row with vector loads

By introducing this garbage item at the end of each row, we ensure matrix
loading avoid bank conflicts as the offset between the position loaded by
work-item i and i+j is N * (sub_group_size + 1) (assuming sub_group_size
banks).

Closes #2751

@chengjunlu
Copy link
Contributor

In my mind, the bank number of SLM is 65 on PVC. I think we need to make sure the bank number of the SLM.

- Store the whole matrix using SIMD block stores for each row leaving
  a single garbage item at the end of the row so each row has
  `sub_group_size + 1` elements
- Load each row with vector loads

By introducing this garbage item at the end of each row, we ensure matrix
loading avoid bank conflicts as the offset between the position loaded by
work-item `i` and `i+j` is `N * (sub_group_size + 1)` (assuming `sub_group_size`
banks).

Signed-off-by: victor-eds <[email protected]>
@victor-eds victor-eds force-pushed the reduction-opt-avoid-bank-conflicts branch from a1c5725 to b6cd04c Compare November 20, 2024 11:52
@victor-eds
Copy link
Contributor Author

In my mind, the bank number of SLM is 65 on PVC. I think we need to make sure the bank number of the SLM.

PVC has 64 8B banks. This PR indeed helps avoid bank conflicts. It's also true we could make this more optimal, but I'd go with this for now and fine-tune in a followup PR. WDYT?

Copy link
Contributor

@whitneywhtsang whitneywhtsang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any performance impact?

@chengjunlu
Copy link
Contributor

In my mind, the bank number of SLM is 65 on PVC. I think we need to make sure the bank number of the SLM.

PVC has 64 8B banks. This PR indeed helps avoid bank conflicts. It's also true we could make this more optimal, but I'd go with this for now and fine-tune in a followup PR. WDYT?

LGTM.

@victor-eds
Copy link
Contributor Author

Any performance impact?

Good impact on the victor/perf-attn. No effect on regular path as this is a pattern introduced by reduction opt.

@victor-eds victor-eds merged commit af1f272 into intel:main Nov 21, 2024
5 checks passed
@victor-eds victor-eds deleted the reduction-opt-avoid-bank-conflicts branch November 21, 2024 11:15
@victor-eds
Copy link
Contributor Author

@chengjunlu I created #2797

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid bank conflicts on sub-group transposes
4 participants