-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Doc] Add spec to get device image backend content #14811
base: sycl
Are you sure you want to change the base?
Conversation
Add a proposed specification for an extension that returns the content of a device image.
|
||
:ref2: ../proposed/sycl_ext_oneapi_free_function_kernels.asciidoc | ||
|
||
A kernel bundle can contain multiple device images with different |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to get information about device for a given device image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly. You can indirectly ask if a device image is compatible with a device via device_image::has_kernel(kernel_id, device)
. There are a couple reasons why it's hard to provide a direct query:
-
For device images that are not in
executable
state, the device image could be compatible with many different devices. For example, a SPIR-V module can be compiled for many different devices, including future devices that don't even exist yet. -
Even for
executable
device images, the image could be compatible with many different devices. For example, I think this is the case for Nvidia CUBIN modules.
I think we could add a query like device_image::is_compatible(device)
if that would help. This would be similar to device_image::has_kernel(kernel_id, device)
, except it asks about the device image a whole, rather than a specific kernel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there is device_image::has_kernel(kernel_id, device), then I think it is fine.
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {iota}); | ||
|
||
std::vector<std::byte> bytes; | ||
for (auto& img: bundle) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my education, are there any cases where there will be more than 1 image returned for this case? Except the multi-device case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question! For DPC++, this will be the case (at least for now).
@AerialMantis proposed the initial draft for the kernel bundle APIs, and he originally thought there could be multiple device images for a single device. I think the idea was that the compiler could create multiple versions that are optimized for different scenarios. I think this was just a hypothetical use case, though. DPC++ does not currently do this. If we did add this in the future, we would need some sort of query to distinguish between them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait. Are you asking if a kernel_bundle
could contain more than one device image when the system has just one device? Yes, that can definitely happen. Consider, for example, a case where you ask for a bundle that contains two kernels:
sycl::kernel_id foo = syclexp::get_kernel_id<foo>();
sycl::kernel_id bar = syclexp::get_kernel_id<bar>();
auto exe_bndl =
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {foo, bar});
There is no guarantee that foo
and bar
are in the same device image, so the kernel bundle might contain two device images in this case. The compiler decides how to pack kernels together into device images depending on the compilation flags (e.g. -fsycl-device-code-split
) and also some internal algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {iota}); | ||
|
||
std::vector<std::byte> bytes; | ||
for (auto& img: bundle) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait. Are you asking if a kernel_bundle
could contain more than one device image when the system has just one device? Yes, that can definitely happen. Consider, for example, a case where you ask for a bundle that contains two kernels:
sycl::kernel_id foo = syclexp::get_kernel_id<foo>();
sycl::kernel_id bar = syclexp::get_kernel_id<bar>();
auto exe_bndl =
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {foo, bar});
There is no guarantee that foo
and bar
are in the same device image, so the kernel bundle might contain two device images in this case. The compiler decides how to pack kernels together into device images depending on the compilation flags (e.g. -fsycl-device-code-split
) and also some internal algorithms.
* `bundle_state::input`: The device image content is a PTX module representing | ||
one or more kernels. | ||
|
||
* `bundle_state::object`: ??? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AerialMantis: Can you help with this part? What is the device image content for an object
state bundle on CUDA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no meaningful state for the cuda and HIP backend, they each ignore transition to this state and jump to the last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds the same as as with Level Zero. Can we say:
The same format as
bundle_state::input
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to restrict this extension to only bundle_state::executable
, so I think the issue here is now moot.
A couple changes to simplify and clarify this extension: * Rename the functions to make it clear they get the "backend" content of the device image. This will avoid confusion with a new member function `kernel_bundle::ext_oneapi_get_content` that we plan to add in a separate extension. * Constrain the functions, so they are only available when the kernel bundle is `bundle_state::executable`. This is the only case we need now, and it avoids the need to define what is returned in the other states.
@Pennycook I added you as a reviewer since you volunteered in chat to look at this. Thanks! |
sycl/doc/extensions/proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc
Show resolved
Hide resolved
---- | ||
!==== | ||
|
||
Available only when the compiler is {cpp}20 or higher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should drop this, and rely on the ifdef
in the synopsis. This is really saying "Available only when the library provides an implementation of std::span
", and that's implied by usage of the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to agree on standard wording for cases like this because I think it will come up more and more.
I feel like we should say somewhere that this API is tied to C++20. How do you feel about the wording I added in d288119?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have said this is good enough, but if you'd like to settle on standard wording I'd like to talk about it a bit more.
I think it would be more accurate to say that the implementation defines the macro. ISO C++ says the implementation only defines the macro if you include <version>
or <span>
.
Given the above, do you want to say anything about who is responsible for including <span>
somewhere? It probably shouldn't be part of the function synopsis, but we should make it clear that if this extension is implemented then sycl.hpp
is expected to include <span>
if it's available (see __has_include
), and that this isn't a user's responsibility. For the actual SYCL specification, I imagine we'd have a single dedicated section to explain how compilation with different versions of C++ works, and we could cover this there instead of in each extension.
Finally, I think we could improve the formatting. We should add a title to the paragraph (like we do with Constraints, etc) and consider rephrasing the "Available only when" (to further differentiate from constraints). How about something like:
Required C++ Features: The implementation must define the
__cpp_lib_span
feature-test macro (which is defined in C++20 and higher).
I deliberately put "C++" in the title in an attempt to be future-proof. Even if ISO C++ decided to introduce a similar concept with the same name, it would probably just be called "Required Features" (since the C++ would be implied).
sycl/doc/extensions/proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc
Show resolved
Hide resolved
Co-authored-by: John Pennycook <[email protected]>
Co-authored-by: John Pennycook <[email protected]>
Two changes that make it easier to use `device_image` by code that does not also have the `kernel_bundle` object: * Add an API to `device_image` that returns the backend. * Change the lifetime of the content view to be the lifetime of the `device_image` object instead of the containing `kernel_bundle` object.
Add a proposed specification for an extension that returns the backend content of a device image.