transformations: Split varith into neighbour and own data across csl_stencil regions #3307
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff (main vs. #3307):

| Metric   | main   | #3307  | +/-    |
|----------|--------|--------|--------|
| Coverage | 90.06% | 90.04% | -0.02% |
| Files    | 446    | 446    |        |
| Lines    | 56382  | 56401  | +19    |
| Branches | 5409   | 5417   | +8     |
| Hits     | 50779  | 50788  | +9     |
| Misses   | 4177   | 4183   | +6     |
| Partials | 1426   | 1430   | +4     |
A few minor comments
Looks good to go 👍
…stencil regions (xdslproject#3307)

Background: The `csl_stencil.apply` op does communicate-and-compute on a given stencil buffer. It holds two regions: one for processing chunks of neighbour data (of this one buffer only), and one for processing everything else after the exchange is done. The `convert-stencil-to-csl-stencil` pass splits the computation of the `stencil.apply` op across these two regions.

The split was done in two steps: first re-ordering arith ops in the `RestructureSymmetricReductionPattern`, and then calling the `get_ops_split` function on the re-shuffled arith ops. Intuitively, the re-order pass would identify chained reductions (`arith.addf`, `arith.mulf`) and restructure them such that all neighbour data which should end up in the first region is consumed first, and the chained arith ops become easily splittable.

This PR replaces this logic by converting arith to varith, splitting the varith op into neighbour/other data in the `SplitVarithOpPattern` rewrite, and then proceeding with `get_ops_split` and everything else as before. At the end, varith is converted back to arith.

Minor improvements:

* Constants are now always duplicated and appear in both regions, which `dce` can clean up

---------

Co-authored-by: n-io <[email protected]>
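The arith-to-varith-to-arith round trip described above can be illustrated with a small toy model. This is a sketch only: it does not use the xDSL API, and `Leaf`, `Add`, `VarAdd`, `to_variadic`, `split_variadic`, `to_binary` and `evaluate` are hypothetical names invented for this example. It models additions only (not `arith.mulf`) and assumes that each operand can be tagged as neighbour data or not.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """A value; is_neighbour marks data received from a neighbour (assumption)."""
    name: str
    is_neighbour: bool = False


@dataclass(frozen=True)
class Add:
    """Binary addition, standing in for a chained arith op."""
    lhs: Expr
    rhs: Expr


@dataclass(frozen=True)
class VarAdd:
    """Variadic addition, standing in for a varith op."""
    operands: tuple[Expr, ...]


Expr = Union[Leaf, Add, VarAdd]


def to_variadic(expr: Expr) -> Expr:
    """Collapse a chain of binary adds into a single variadic add."""
    if not isinstance(expr, Add):
        return expr
    operands: list[Expr] = []
    for side in (expr.lhs, expr.rhs):
        flat = to_variadic(side)
        operands.extend(flat.operands if isinstance(flat, VarAdd) else [flat])
    return VarAdd(tuple(operands))


def split_variadic(op: VarAdd) -> VarAdd:
    """Group neighbour operands into their own variadic add, consumed first."""
    def is_neighbour(o: Expr) -> bool:
        return isinstance(o, Leaf) and o.is_neighbour

    neighbour = tuple(o for o in op.operands if is_neighbour(o))
    other = tuple(o for o in op.operands if not is_neighbour(o))
    if not neighbour or not other:
        return op  # nothing to split
    return VarAdd((VarAdd(neighbour), *other))


def to_binary(expr: Expr) -> Expr:
    """Convert variadic adds back into left-nested binary adds."""
    if not isinstance(expr, VarAdd):
        return expr
    operands = [to_binary(o) for o in expr.operands]
    result = operands[0]
    for o in operands[1:]:
        result = Add(result, o)
    return result


def evaluate(expr: Expr, env: dict[str, int]) -> int:
    """Evaluate an expression tree, used here to check the rewrite is value-preserving."""
    if isinstance(expr, Leaf):
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.lhs, env) + evaluate(expr.rhs, env)
    return sum(evaluate(o, env) for o in expr.operands)


west, east = Leaf("west", True), Leaf("east", True)
own, const = Leaf("own"), Leaf("const")

# Neighbour and own data are interleaved in the original chain.
chain = Add(Add(Add(west, own), east), const)
rebuilt = to_binary(split_variadic(to_variadic(chain)))
```

In this model, `rebuilt` adds `west` and `east` first and only then folds in `own` and `const`, so the innermost sub-expression touches neighbour data only and could be placed in the first region, with the remainder going to the second.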
Note: This PR currently has a fix from "transformations: (arith-to-varith) Support more cases" #3330 merged. The changes to `convert-arith-to-varith.mlir` and `varith_transformations.py` will be reverted, as they are not intended to be merged from this PR.