Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Lift restrictions on free-function kernels when compiling at runtime #15892

Merged

Conversation

sommerlukas
Copy link
Contributor

In order to be able to generate correct and complete information for the integration header, the current implementation places some restrictions on free-function kernels and their parameters. For example, parameters of free function kernels need to be forward-declarable.

However, when compiling SYCL code at runtime (RTC), e.g., through the kernel_compiler extension, host code is typically not relevant, so the integration header is not as relevant and some restrictions on free-function kernels can be lifted.

This PR introduces a -fsycl-rtc-mode flag (and it's negative equivalent) to deactivate some restrictions on free-function kernels and omit some information for free-function kernels from the integration header.

Add a flag to compile in "RTC mode", in which we do not care about
complete information in the integration header.

If the RTC mode is enabled, omit free-function kernels fom the
integration header to lift restrictions regarding the ability to forward
declare free-function kernel parameters.

Signed-off-by: Lukas Sommer <[email protected]>
Signed-off-by: Lukas Sommer <[email protected]>
Signed-off-by: Lukas Sommer <[email protected]>
Signed-off-by: Lukas Sommer <[email protected]>
Copy link
Contributor

@Naghasan Naghasan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

host code is typically not relevant, so the integration header is not as relevant and some restrictions on free-function kernels can be lifted.

Can you link the extension spec which explains this? What do you mean by not as relevant? Is it ever relevant in this mode? Should we be removing integration header functionality entirely in this mode?

clang/lib/Sema/SemaSYCL.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaSYCL.cpp Outdated Show resolved Hide resolved
@sommerlukas
Copy link
Contributor Author

host code is typically not relevant, so the integration header is not as relevant and some restrictions on free-function kernels can be lifted.

Can you link the extension spec which explains this? What do you mean by not as relevant? Is it ever relevant in this mode?

@elizabethandrews Thanks for you feedback!

The extension specification for the kernel_compiler with SYCL as source language is in #11985.

The extension defines an interface to compile kernels through the SYCL kernel bundle interface, but does not allow to compile additional host code.

The current implementation is in kernel_compiler_sycl.cpp. As -fsycl-device-only/-c and -fsycl-dump-device-code in combination do not produce finalized SPIR-V, the current implementation adds a dummy main for the host code and performs host and device compilation. The host code is however never used after the compilation, only the SPIR-V for the device is needed (see line 138,

Should we be removing integration header functionality entirely in this mode?

This would also be an option, yes. For RTC, it would be useful to have a mode/flag to only compile device code, yielding finalized SPIR-V. No integration header would be needed in this mode and restrictions for free-function kernels would not need to be in place.

Removing the integration header entirely seemed quite invasive to me, that's why I chose this less-invasive route to address an immediate need for one of our projects. But I'm happy to have a conversation how we can improve things further.

Signed-off-by: Lukas Sommer <[email protected]>
Signed-off-by: Lukas Sommer <[email protected]>
@elizabethandrews
Copy link
Contributor

Should we be removing integration header functionality entirely in this mode?

This would also be an option, yes. For RTC, it would be useful to have a mode/flag to only compile device code, yielding finalized SPIR-V. No integration header would be needed in this mode and restrictions for free-function kernels would not need to be in place.

I assume you mean restrictions added to free-function kernels - due to the integration header requirement - will no longer be needed. Other restrictions will still be required.

Removing the integration header entirely seemed quite invasive to me, that's why I chose this less-invasive route to address an immediate need for one of our projects. But I'm happy to have a conversation how we can improve things further.

I don't think it is invasive if it is not required. If RTC does not require integration header, I would lean towards removing the functionality for this mode (not necessarily in this PR) but I would like @premanandrao and @tahonermann to weigh in here.

@sommerlukas
Copy link
Contributor Author

Should we be removing integration header functionality entirely in this mode?

This would also be an option, yes. For RTC, it would be useful to have a mode/flag to only compile device code, yielding finalized SPIR-V. No integration header would be needed in this mode and restrictions for free-function kernels would not need to be in place.

I assume you mean restrictions added to free-function kernels - due to the integration header requirement - will no longer be needed. Other restrictions will still be required.

Yes, this PR was specifically for restrictions stemming from the integration header.

Removing the integration header entirely seemed quite invasive to me, that's why I chose this less-invasive route to address an immediate need for one of our projects. But I'm happy to have a conversation how we can improve things further.

I don't think it is invasive if it is not required. If RTC does not require integration header, I would lean towards removing the functionality for this mode (not necessarily in this PR) but I would like @premanandrao and @tahonermann to weigh in here.

IIUC, there is also some ongoing/planned effort to replace the integration header with another mechanism. Potentially, we could land the change in this PR as a workaround for now and remove the hidden flag again once the integration header mechanism has been replaced.

@elizabethandrews
Copy link
Contributor

Should we be removing integration header functionality entirely in this mode?

This would also be an option, yes. For RTC, it would be useful to have a mode/flag to only compile device code, yielding finalized SPIR-V. No integration header would be needed in this mode and restrictions for free-function kernels would not need to be in place.

I assume you mean restrictions added to free-function kernels - due to the integration header requirement - will no longer be needed. Other restrictions will still be required.

Yes, this PR was specifically for restrictions stemming from the integration header.

Removing the integration header entirely seemed quite invasive to me, that's why I chose this less-invasive route to address an immediate need for one of our projects. But I'm happy to have a conversation how we can improve things further.

I don't think it is invasive if it is not required. If RTC does not require integration header, I would lean towards removing the functionality for this mode (not necessarily in this PR) but I would like @premanandrao and @tahonermann to weigh in here.

IIUC, there is also some ongoing/planned effort to replace the integration header with another mechanism. Potentially, we could land the change in this PR as a workaround for now and remove the hidden flag again once the integration header mechanism has been replaced.

I'm not opposed to landing this PR for now but I am not sure if we should delay removing integration header functionality in this mode till we implement the alternate functionality for integration header. We are unnecessarily generating code and files in a case where it is not required now. I assume it is better to stop doing that sooner rather than later. Off the top of my head, I do not know what this entails. We may just need to avoid calling the header handlers, but I am not sure. I suspect there is code sprinkled throughout we may need to conditionalize. At a certain point, sure it might make sense to not do this now but without some sort of cost-benefit-timeline analysis, I am not sure if we should just dismiss this for now.

@tahonermann
Copy link
Contributor

IIUC, there is also some ongoing/planned effort to replace the integration header with another mechanism.

That mechanism will only work with SYCL-aware host compilers. The integration header approach will still be needed to support third party host compilers (if such support is required).

@gmlueck
Copy link
Contributor

gmlueck commented Oct 31, 2024

It seems like this option is only being added as a short-term hack because the kernel compiler implementation currently invokes the host compilation pass even though there is no need to generate host code. Once that is fixed, I think we will no longer need this option.

Given that it is just a short-term hack, is this the right way to define the option? Should we instead pass the option via -Xclang? Would that make it easier for us to remove the option later?

@elizabethandrews
Copy link
Contributor

It seems like this option is only being added as a short-term hack because the kernel compiler implementation currently invokes the host compilation pass even though there is no need to generate host code. Once that is fixed, I think we will no longer need this option.

Integration header generation happens during device compilation. So these diagnostics will fire even if host compilation is disabled. So I do not think this option is temporary unless we come up with another way to tell the compiler that we do not need to generate the integration header.

Given that it is just a short-term hack, is this the right way to define the option? Should we instead pass the option via -Xclang? Would that make it easier for us to remove the option later?

From an implementation perspective I don't think it matters.

@gmlueck
Copy link
Contributor

gmlueck commented Oct 31, 2024

Integration header generation happens during device compilation. So these diagnostics will fire even if host compilation is disabled. So I do not think this option is temporary unless we come up with another way to tell the compiler that we do not need to generate the integration header.

I'm pretty sure we will not expose the option to the user with this name. For example, if we use "-fsyclbin" to compile SYCLBIN files as I proposed on Wednesday, we would tie this feature to the "-fsyclbin" option. However, even that option name is not decided.

From an implementation perspective I don't think it matters.

I meant more from an API stability point of view. If we document "-fsycl-rtc-mode" are we committing to support it forever? I think we want to avoid any commitment to support this option.

@sommerlukas
Copy link
Contributor Author

From an implementation perspective I don't think it matters.

I meant more from an API stability point of view. If we document "-fsycl-rtc-mode" are we committing to support it forever? I think we want to avoid any commitment to support this option.

The option is hidden, i.e., it does not appear in clang++ --help and I intentionally did not add it to the user manual, so we don't really expose it ATM. IMHO, this would not imply a commitment to support this option in the future.

Signed-off-by: Lukas Sommer <[email protected]>
@elizabethandrews
Copy link
Contributor

I'm pretty sure we will not expose the option to the user with this name. For example, if we use "-fsyclbin" to compile SYCLBIN files as I proposed on Wednesday, we would tie this feature to the "-fsyclbin" option. However, even that option name is not decided.

I understand. My point was that conditionalizing the diagnostics (irrespective of what the option is called) is not temporary. This is a permanent change for this mode. And while it is ok as a quick workaround, eventually we need to decouple the generation of the integration from this mode.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this for now but please file an issue to remove remove the generation of Integration header in this mode.

@sommerlukas
Copy link
Contributor Author

I'm approving this for now but please file an issue to remove remove the generation of Integration header in this mode.

Thanks, I've filed #15996.

@sommerlukas sommerlukas merged commit 5208484 into intel:sycl Nov 6, 2024
13 checks passed
@sommerlukas sommerlukas deleted the skip-integration-header-ff-kernels branch November 6, 2024 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants