[Proposal] Deprecate SYCL Offset Accessors #569
Comments
I vote for 2/; simplifying the accessor is a good thing, IMO. I guess if one day we mandate that buffers be backed by USM, people will be able to write
Great, @hdelan. Agree with @TApplencourt. Anyway, any idea why this was included in the spec in the first place? What was the intended usage justifying the potential overhead?
In theory, I can create an accessor on only
Yes, that's what I mean. If that's the case, a sub-buffer sounds like a better approach due to the issue Hugh points out here.
Not necessarily. Couldn't the implementation pass a base pointer as the kernel argument, where the base pointer already had the offset added? Things probably get more complicated for multi-dimensional accessors, though.
Sub-buffers are more limited though, as they can only be created on a contiguous region of the parent buffer. One application that comes to mind for ranged accessors:
This is a good point. In theory this should work, although DPC++ would need to change some internals to make this happen. Thanks for the suggestions and discussion. I'll keep this open for the moment in case anyone else wants to chime in, but I will make a ticket on our side to change the impl so that only already-offset accessor ptrs are passed to the device.
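For readers following along, the "already offset base pointer" idea from the comments above can be sketched in plain C++. This is only an illustrative 1D model: `RawAccessorArgs`, `PreOffsetArgs`, `fold_offset` and `index_into` are hypothetical names, not DPC++ internals or SYCL API. As noted above, multi-dimensional accessors would likely need more than a single pointer adjustment.

```cpp
#include <cstddef>

// Hypothetical stand-ins for what a kernel receives for a 1D accessor.
// These are illustrative only and do not mirror any real implementation.
struct RawAccessorArgs {   // base pointer plus a separate offset argument
    int* base;
    std::size_t offset;
};

struct PreOffsetArgs {     // offset already folded into the pointer
    int* base;
};

// Host-side step: apply the offset once, before the kernel is launched.
inline PreOffsetArgs fold_offset(RawAccessorArgs a) {
    return PreOffsetArgs{a.base + a.offset};
}

// Kernel-side indexing for each variant.
inline int& index_into(RawAccessorArgs a, std::size_t i) {
    return a.base[a.offset + i];   // add performed on each access
}

inline int& index_into(PreOffsetArgs a, std::size_t i) {
    return a.base[i];              // no offset argument, no add
}
```

Both variants address the same element for a given index; the second simply carries one argument fewer into the kernel.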
AdaptiveCpp has recently solved this issue on the compiler side; however, we already have an extension that basically does your point 1. This is possible in a fairly seamless way when CTAD is used: https://github.com/AdaptiveCpp/AdaptiveCpp/blob/develop/doc/accessor-variants.md
The SYCL spec allows for an extra offset argument to be used when constructing an accessor. This seems like a good idea to allow the programmer to offset into a mem arg, such that `acc[0]` refers not to the base of some native allocation.

However, this means that when calculating `acc[arbitrary_idx]`, internally the SYCL implementation must add `arbitrary_idx + accessOffset` in order to generate the required index.

In 99% (guess) of cases, `accessOffset` is zero. However, most compilers (DPC++, for one) lack the host-device optimizations necessary to propagate the zero value to the device compilation pass. This means that the access offset becomes a kernel argument, and the zero values are loaded at runtime and then added to each accessor base pointer, at least once at accessor initialization. This results in more kernel args, clock cycles wasted adding zero to ptrs, and increased register usage within the kernel.

As a quick demonstration:

A simple SYCL kernel just doing `acc[0] = 1`:

vs a simple SYCL kernel doing `acc.get_multi_ptr()[0]`

Note that when `get_multi_ptr()` is called, the compiler is able to see that the `accessOffset` member is not used in the kernel and it can remove a kernel argument.

This can be solved in two ways in the SYCL specification:
`accessor` class, which would describe whether the accessor was offset or not. This would allow the use of `if constexpr` in the accessor setup, which would use the `accessOffset` or not.
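
To make the compile-time idea concrete, here is a minimal sketch in plain C++17 rather than SYCL. It assumes the "offset or not" property is carried as a boolean template parameter; `mock_accessor`, `offset_storage` and `IsOffset` are hypothetical names, and this is not the SYCL `accessor` class or any implementation's internals. The non-offset variant stores no offset at all, which is roughly the effect one would hope for on the kernel-argument side.

```cpp
#include <cstddef>

// Hypothetical storage helper: only the offset variant holds an offset.
template <bool IsOffset>
struct offset_storage {
    std::size_t value = 0;
};

template <>
struct offset_storage<false> {};   // empty: nothing to pass or load

// Hypothetical, simplified 1D accessor model (not the SYCL accessor).
template <typename T, bool IsOffset>
class mock_accessor : private offset_storage<IsOffset> {
public:
    explicit mock_accessor(T* base) : base_(base) {}

    mock_accessor(T* base, std::size_t offset) : base_(base) {
        static_assert(IsOffset, "offset constructor requires IsOffset == true");
        this->value = offset;
    }

    T& operator[](std::size_t i) const {
        if constexpr (IsOffset) {
            return base_[this->value + i];   // runtime add, offset variant only
        } else {
            return base_[i];                 // offset is never read
        }
    }

private:
    T* base_;
};

// Deduction guides choose the variant from the constructor arguments.
template <typename T>
mock_accessor(T*) -> mock_accessor<T, false>;

template <typename T>
mock_accessor(T*, std::size_t) -> mock_accessor<T, true>;
```

With the deduction guides, `mock_accessor plain(ptr);` selects the non-offset variant and `mock_accessor shifted(ptr, std::size_t{2});` selects the offset one, so user code would not need to spell out the extra parameter.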