forked from Rust-GCC/gccrs
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This patch adds support to vectorize sum of abslolute differences (SA…
…D_EXPR) using SVE. Given this input code: int sum_abs (uint8_t *restrict x, uint8_t *restrict y, int n) { int sum = 0; for (int i = 0; i < n; i++) { sum += __builtin_abs (x[i] - y[i]); } return sum; } The resulting SVE code is: 0000000000000000 <sum_abs>: 0: 7100005f cmp w2, #0x0 4: 5400026d b.le 50 <sum_abs+0x50> 8: d2800003 mov x3, #0x0 // #0 c: 93407c42 sxtw x2, w2 10: 2538c002 mov z2.b, #0 14: 25221fe0 whilelo p0.b, xzr, x2 18: 2538c023 mov z3.b, #1 1c: 2518e3e1 ptrue p1.b 20: a4034000 ld1b {z0.b}, p0/z, [x0, x3] 24: a4034021 ld1b {z1.b}, p0/z, [x1, x3] 28: 0430e3e3 incb x3 2c: 0520c021 sel z1.b, p0, z1.b, z0.b 30: 25221c60 whilelo p0.b, x3, x2 34: 040d0420 uabd z0.b, p1/m, z0.b, z1.b 38: 44830402 udot z2.s, z0.b, z3.b 3c: 54ffff21 b.ne 20 <sum_abs+0x20> // b.any 40: 2598e3e0 ptrue p0.s 44: 04812042 uaddv d2, p0, z2.s 48: 1e260040 fmov w0, s2 4c: d65f03c0 ret 50: 1e2703e2 fmov s2, wzr 54: 1e260040 fmov w0, s2 58: d65f03c0 ret Notice how udot is used inside a fully masked loop. gcc/Changelog: 2019-05-07 Alejandro Martinez <[email protected]> * config/aarch64/aarch64-sve.md (<su>abd<mode>_3): New define_expand. (aarch64_<su>abd<mode>_3): Likewise. (*aarch64_<su>abd<mode>_3): New define_insn. (<sur>sad<vsi2qi>): New define_expand. * config/aarch64/iterators.md: Added MAX_OPP attribute. * tree-vect-loop.c (use_mask_by_cond_expr_p): Add SAD_EXPR. (build_vect_cond_expr): Likewise. gcc/testsuite/Changelog: 2019-05-07 Alejandro Martinez <[email protected]> * gcc.target/aarch64/sve/sad_1.c: New test for sum of absolute differences. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@270975 138bc75d-0d04-0410-961f-82ee72b054a4
- Loading branch information
alejandro
committed
May 7, 2019
1 parent
b16ca97
commit 2cbc1ad
Showing
6 changed files
with
119 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,13 @@ | ||
2019-05-07 Alejandro Martinez <[email protected]> | ||
|
||
* config/aarch64/aarch64-sve.md (<su>abd<mode>_3): New define_expand. | ||
(aarch64_<su>abd<mode>_3): Likewise. | ||
(*aarch64_<su>abd<mode>_3): New define_insn. | ||
(<sur>sad<vsi2qi>): New define_expand. | ||
* config/aarch64/iterators.md: Added MAX_OPP attribute. | ||
* tree-vect-loop.c (use_mask_by_cond_expr_p): Add SAD_EXPR. | ||
(build_vect_cond_expr): Likewise. | ||
|
||
2019-05-07 Uroš Bizjak <[email protected]> | ||
|
||
* cfgexpand.c (asm_clobber_reg_is_valid): Reject | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,8 @@ | ||
2019-05-07 Alejandro Martinez <[email protected]> | ||
|
||
* gcc.target/aarch64/sve/sad_1.c: New test for sum of absolute | ||
differences. | ||
|
||
2019-05-07 Uroš Bizjak <[email protected]> | ||
|
||
* gcc.target/i386/asm-7.c: New test. | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
/* { dg-do compile } */ | ||
/* { dg-options "-O2 -ftree-vectorize" } */ | ||
|
||
#include <stdint.h> | ||
|
||
#define DEF_SAD(TYPE1, TYPE2) \ | ||
TYPE1 __attribute__ ((noinline, noclone)) \ | ||
sum_abs_##TYPE1##_##TYPE2 (TYPE2 *restrict x, TYPE2 *restrict y, int n) \ | ||
{ \ | ||
TYPE1 sum = 0; \ | ||
for (int i = 0; i < n; i++) \ | ||
{ \ | ||
sum += __builtin_abs (x[i] - y[i]); \ | ||
} \ | ||
return sum; \ | ||
} | ||
|
||
DEF_SAD(int32_t, uint8_t) | ||
DEF_SAD(int32_t, int8_t) | ||
DEF_SAD(int64_t, uint16_t) | ||
DEF_SAD(int64_t, int16_t) | ||
|
||
/* { dg-final { scan-assembler-times {\tuabd\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ | ||
/* { dg-final { scan-assembler-times {\tsabd\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ | ||
/* { dg-final { scan-assembler-times {\tudot\tz[0-9]+\.s, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */ | ||
/* { dg-final { scan-assembler-times {\tuabd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ | ||
/* { dg-final { scan-assembler-times {\tsabd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ | ||
/* { dg-final { scan-assembler-times {\tudot\tz[0-9]+\.d, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters