[alpaka] Add support for the SYCL back-end #407

AuroraPerego · 2023-09-04T17:31:45Z

With #1845 and subsequent contributions, alpaka now also supports SYCL/oneAPI as a back-end targeting CPUs, Intel GPUs and FPGAs.
This PR extends that SYCL support to pixeltrack.
Some comments:

- the new back-ends available are `--syclcpu` and `--syclgpu`
- in the Makefile, the libraries and the final target must be linked with the Intel compiler (when available)
- the develop branch of alpaka is checked out at a newer commit
- there are some workarounds for bugs and unsupported features in SYCL (`all_of_group` / `any_of_group` on CPU, SYCL kernels do not support zero-length arrays, device global variables are not yet supported in the SYCL back-end)
- a common interface for math functions has been added (SYCL wants math functions in the `sycl` namespace)
- a trait for the warp size has been added to force it to 32 when needed

In addition, the vendor-specific RNG support in alpaka has been disabled.

The application is compiled once for the CPU(s) and once for the Intel GPU(s). The flags for AOT compilation have been added and icpx is used as the default compiler for the SYCL backend. The linking step is also performed with icpx when the SYCL backend is enabled.

- added the allocator policy: Caching or Synchronous
- added allocCachedBuf (TODO: implement pitch in SYCL)
- fixed cout because SYCL events in alpaka are not shared pointers
- do not cache SYCL events
- changed the size of CountersOnly to 1 because SYCL kernels do not support zero-length arrays
- implemented HostOnlyTask using `alpaka/core/CallbackThread`
- adapted `prefixScan` and `radixSort` to SYCL: work around function pointers not being supported in SYCL kernels
- implemented the work division
- added the `--syclcpu` and `--syclgpu` options for the backends
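The zero-length array workaround from the list above can be sketched as follows. This is only an illustration: `storageExtent` and `Bins` are hypothetical names, not the actual pixeltrack types, and the real change simply sets the size of CountersOnly to 1.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: clamp a requested extent so that it is never zero,
// because SYCL kernels do not support zero-length arrays.
constexpr std::size_t storageExtent(std::size_t requested) {
  return requested == 0 ? 1 : requested;
}

// Hypothetical container with a fixed set of counters and optional content.
template <typename T, std::size_t N>
struct Bins {
  T counters[4];
  T content[storageExtent(N)];  // at least one element, even when N == 0

  // Logical capacity: still zero for the counters-only case.
  static constexpr std::size_t capacity() { return N; }
};

// Counters-only variant: one dummy element is allocated but never used.
using CountersOnly = Bins<std::uint32_t, 0>;
```

The cost is one unused element per instance, which keeps the type usable inside SYCL kernels.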

fwyzard and others added 13 commits August 28, 2023 02:46

Fix warnings about unused variables

552f0e0

Disable vendor-specific RNG support

54ee48f

[sycl] add path to debugger in Makefile

2be85e8

Additional scripts must be sourced to use gdb-oneapi. The same is true for the other tools (advisor, inspector, vtune...), but since they are not useful here there is no point in sourcing them as well.

[sycl] Install Intel OpenCL CPU runtime 2022.14.8.0.04 as an external

e8fdc61

[alpaka] checkout a newer alpaka version

5994097

commit 819974ddc5b2eb4b33e709bd317701793cdb7d15
Author: Jan Stephan <[email protected]>
Date: Thu Aug 3 14:20:59 2023 +0200

    Always use std::size_t for CUDA pitch calculations

[alpaka] put static after ALPAKA_FN_INLINE

4452287

Changed the order to make the ONEAPI compiler happy

[alpaka] define CONSTANT_VAR as constexpr for SYCL

fe69537

For the other backends CONSTANT_VAR is mapped to ALPAKA_STATIC_ACC_MEM_CONSTANT, but global variables are not yet supported in the SYCL backend. In this case, however, a `constexpr` is enough to obtain the same result.
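A minimal sketch of how such a macro could be selected is shown below. The final `#else` branch and the `kExampleCuts` table are assumptions added so that the snippet compiles without alpaka; they are not taken from the PR.

```cpp
// Sketch only: choose how constant data is declared for each back-end.
#if defined(ALPAKA_ACC_SYCL_ENABLED)
// SYCL back-end: device global variables are not supported yet,
// so fall back to a compile-time constant.
#define CONSTANT_VAR constexpr
#elif defined(ALPAKA_STATIC_ACC_MEM_CONSTANT)
// Other alpaka back-ends: place the variable in constant device memory.
#define CONSTANT_VAR ALPAKA_STATIC_ACC_MEM_CONSTANT
#else
// Assumption: stand-alone fallback so this sketch builds without alpaka.
#define CONSTANT_VAR constexpr
#endif

// Hypothetical table of values, used only to illustrate the macro.
CONSTANT_VAR const float kExampleCuts[3] = {0.5f, 1.0f, 2.0f};
```

This works only for data whose values are known at compile time, which is why a `constexpr` is sufficient here.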

[alpaka] define a common interface for math functions

9377ce5

Math functions are defined in the `math` namespace and are taken from the `sycl` namespace for the SYCL backend, from the global namespace for the CUDA and HIP backends, and from the `std` namespace in every other case.
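The dispatch described above might look roughly like this for a single function. It is a sketch under assumptions, not the code from the commit: only `sqrt` for `float` is shown, and the host/device function annotations needed in real kernels are omitted.

```cpp
#include <cmath>
#if defined(ALPAKA_ACC_SYCL_ENABLED)
#include <sycl/sycl.hpp>
#endif

namespace math {
#if defined(ALPAKA_ACC_SYCL_ENABLED)
  // SYCL back-end: math functions come from the sycl namespace.
  inline float sqrt(float x) { return sycl::sqrt(x); }
#elif defined(ALPAKA_ACC_GPU_CUDA_ENABLED) || defined(ALPAKA_ACC_GPU_HIP_ENABLED)
  // CUDA and HIP back-ends: math functions come from the global namespace.
  inline float sqrt(float x) { return ::sqrtf(x); }
#else
  // Every other case: math functions come from the std namespace.
  inline float sqrt(float x) { return std::sqrt(x); }
#endif
}  // namespace math
```

Kernel code can then call `math::sqrt(x)` without knowing which back-end is active.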

[alpaka] add support for SYCL in the plugins and in the tests

37aa0cb

added the ifdef for SYCL

[alpaka] add warpSize trait

1013547

Add a trait to set the warp size to 32 for the kernels that require it. It is implemented in alpaka only for the SYCL backend at the moment and does nothing for the other backends.
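The idea of a per-kernel warp size trait can be illustrated with the sketch below. The names `RequiredWarpSize`, `SortKernel` and `PlainKernel` are hypothetical and do not reproduce the trait added in alpaka. A value of 0 is used here to mean "no requirement".

```cpp
#include <cstdint>

// Hypothetical trait: by default a kernel does not request a warp size.
template <typename TKernel>
struct RequiredWarpSize {
  static constexpr std::uint32_t value = 0;  // 0 = let the back-end decide
};

// Hypothetical kernels used only for this illustration.
struct SortKernel {};
struct PlainKernel {};

// A kernel whose algorithm assumes 32 threads per warp opts in
// by specializing the trait.
template <>
struct RequiredWarpSize<SortKernel> {
  static constexpr std::uint32_t value = 32;
};

// A back-end that supports several sub-group sizes could query the trait
// at compile time; back-ends with a fixed warp size can ignore it.
template <typename TKernel>
constexpr bool hasWarpSizeRequirement() {
  return RequiredWarpSize<TKernel>::value != 0;
}
```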

[alpaka] fix for the SYCL CPU backend

1774c68

There is a bug with `any_of_group` / `all_of_group` in the OpenCL runtime that can be worked around by setting the sub-group size equal to the block size. The bug has been fixed in the latest runtimes, but the application hangs with them.
