
Merge OpenAI Triton commit 6f5baf6 #2990

Merged · 8 commits into main · Dec 11, 2024
Conversation

whitneywhtsang (Contributor)
@whitneywhtsang whitneywhtsang commented Dec 11, 2024

This PR changes the Triton base from 4d2e9e5 to 6f5baf6 (Dec 10).
Pass rate: 99.82% -> 99.81% (#2991)

Please do not squash and merge this PR.

ravil-mobile and others added 6 commits December 9, 2024 22:37
This PR refactors the instruction scheduling enums; they are now
implemented in MLIR.
This PR implements a specialized codegen for `tt.gather` when it
satisfies the conditions of being "warp local": it is possible to
compute the output tensor without data movement across warps.
`isWarpLocal` is a new function that checks this condition and places
additional restrictions to simplify codegen and to separate concerns
from `ttg.convert_layout`.

This enables `tt.gather` to generate better code when the layout is
suitable. In a subsequent PR, a special pattern will be added to
generate optimized layouts for `tt.gather` when possible/profitable to
enable the lowering.
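The "warp local" condition described above can be illustrated with a minimal Python sketch. This is a hypothetical model, not the actual `isWarpLocal` implementation: it assumes a simple blocked layout in which each warp owns a contiguous run of rows, and checks that every gathered source row already lives in the warp that produces the corresponding output row, so no cross-warp data movement is needed.

```python
# Hypothetical model of the "warp local" gather condition under an
# assumed blocked layout: warp w owns rows [w*rows_per_warp, (w+1)*rows_per_warp).

def owning_warp(row: int, rows_per_warp: int) -> int:
    """Which warp holds a given row under the assumed blocked layout."""
    return row // rows_per_warp


def is_warp_local(num_rows: int, num_warps: int, gather_rows: list[int]) -> bool:
    """True if every output row gathers from a row its own warp already holds."""
    rows_per_warp = num_rows // num_warps
    # Output row `dst` is produced by warp owning_warp(dst); the gather is
    # warp-local iff the source row it reads belongs to that same warp.
    return all(
        owning_warp(src, rows_per_warp) == owning_warp(dst, rows_per_warp)
        for dst, src in enumerate(gather_rows)
    )
```

For example, with 8 rows split across 2 warps, a permutation that stays within each warp's half of the rows is warp-local, while an index that crosses the halves is not.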
### Commits in this PR
1. [Pipeliner] Multi-buffer TMA descriptors
2. Add tests for pipelined descriptor creation
3. Be more conservative about number of TMA buffers to allocate
4. Update golden samples
5. Use correct modulus for tma updates
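The commits above describe multi-buffering TMA descriptors in the software pipeliner, with one fix for the modulus used when cycling through buffers. As a hedged sketch (the buffer count and function name here are assumptions, not the actual pipeliner code), each in-flight iteration writes into slot `iteration % num_buffers`, so the modulus must match the number of buffers actually allocated or slots would be reclaimed while still in use:

```python
# Hypothetical ring-buffer indexing for a multi-buffered pipeline.
# NUM_BUFFERS is an assumed pipeline depth, not a value from this PR.

NUM_BUFFERS = 3


def buffer_slot(iteration: int, num_buffers: int = NUM_BUFFERS) -> int:
    """Slot written by a given pipeline iteration."""
    return iteration % num_buffers


# With 3 buffers, iterations 0..5 cycle through slots 0,1,2,0,1,2.
slots = [buffer_slot(i) for i in range(6)]
```

Using the wrong modulus (say, a stage count that differs from the allocated buffer count) would map a still-in-flight iteration onto a live slot, which is the kind of bug the "correct modulus" commit addresses.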
@lezcano pointed out in another PR that the order is confusing, because
we typically list the lane ID, warp ID, and block ID in that order.
The AMD runner persists changes to the file system between jobs, so the
caches need to be manually cleaned up.

Closes #5384
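A cleanup step like the following could implement the manual cache removal described above. This is a sketch, not the actual CI change: it assumes Triton's default cache location (`~/.triton/cache`, overridable via the `TRITON_CACHE_DIR` environment variable).

```python
# Hypothetical cleanup for a self-hosted runner whose filesystem
# persists between CI jobs. The cache path is an assumption based on
# Triton's default cache directory.
import os
import shutil


def clean_triton_cache() -> None:
    cache_dir = os.environ.get(
        "TRITON_CACHE_DIR",
        os.path.expanduser("~/.triton/cache"),
    )
    # Remove the whole cache tree; ignore_errors covers a missing dir,
    # so the step is safe to run on a fresh runner as well.
    shutil.rmtree(cache_dir, ignore_errors=True)
```

Running this at the start of each job guarantees that stale compilation artifacts from a previous job cannot leak into the current one.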
This relands triton-lang/triton#5392
to enable the new arch target now that backend support has been
added. It does not depend on the reverted LLVM upgrade in
triton-lang/triton#5341; the necessary
enablement is already included in the LLVM version we are currently
using.
Enable the TRITON_KERNEL_OVERRIDE feature to work on AMD assembly and
binary. Currently it only works for the Nvidia backend (`ptx` and
`cubin`).
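The override workflow is typically driven through environment variables. The sketch below shows one plausible sequence; the `TRITON_DUMP_DIR` and `TRITON_OVERRIDE_DIR` variable names are assumptions based on Triton's debugging environment variables, so check the backend documentation before relying on them.

```python
# Hedged sketch of the kernel-override workflow:
# (1) dump compiled stages, (2) hand-edit the dumped ptx/cubin files
# (or, with this change, AMD assembly/binary), (3) re-run with
# overrides enabled so the edited artifacts replace the compiled ones.
import os

# Step 1: dump each compilation stage to a directory.
os.environ["TRITON_KERNEL_DUMP"] = "1"
os.environ["TRITON_DUMP_DIR"] = "/tmp/triton_dump"      # assumed variable name

# Step 3: after editing the dumped files, load them back in place
# of the freshly compiled artifacts on the next run.
os.environ["TRITON_KERNEL_OVERRIDE"] = "1"
os.environ["TRITON_OVERRIDE_DIR"] = "/tmp/triton_dump"  # assumed variable name
```

Extending this feature to AMD means the same edit-and-reload loop applies to the backend's assembly and binary artifacts, not just Nvidia's.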

---------

Co-authored-by: Yuanwei Fang <[email protected]>
@whitneywhtsang whitneywhtsang marked this pull request as ready for review December 11, 2024 03:29
@whitneywhtsang whitneywhtsang changed the title Merge OpenAI Triton commit f257479 Merge OpenAI Triton commit 6f5baf6 Dec 11, 2024
@whitneywhtsang whitneywhtsang merged commit e302ae6 into main Dec 11, 2024
5 checks passed
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/merge branch December 11, 2024 04:09
7 participants