Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Fix uncaught exceptions and null dereference #15173

Merged
merged 12 commits into from
Sep 4, 2024
25 changes: 17 additions & 8 deletions sycl/source/detail/device_image_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -95,10 +95,14 @@ class device_image_impl {

// Otherwise, if the device candidate is a sub-device it is also valid if
// its parent is valid.
if (!getSyclObjImpl(DeviceCand)->isRootDevice())
return has_kernel(KernelIDCand,
DeviceCand.get_info<info::device::parent_device>());

if (!getSyclObjImpl(DeviceCand)->isRootDevice()) {
try {
return has_kernel(KernelIDCand,
DeviceCand.get_info<info::device::parent_device>());
} catch (std::exception &e) {
dm-vodopyanov marked this conversation as resolved.
Show resolved Hide resolved
__SYCL_REPORT_EXCEPTION_TO_STREAM("exception in has_kernel", e);
bso-intel marked this conversation as resolved.
Show resolved Hide resolved
}
}
return false;
}

Expand Down Expand Up @@ -270,10 +274,15 @@ class device_image_impl {
// TODO consider changing the lifetime of device_image_impl instead
ur_buffer_properties_t Properties = {UR_STRUCTURE_TYPE_BUFFER_PROPERTIES,
nullptr, MSpecConstsBlob.data()};
memBufferCreateHelper(
Plugin, detail::getSyclObjImpl(MContext)->getHandleRef(),
UR_MEM_FLAG_READ_WRITE | UR_MEM_FLAG_ALLOC_COPY_HOST_POINTER,
MSpecConstsBlob.size(), &MSpecConstsBuffer, &Properties);
try {
memBufferCreateHelper(
Plugin, detail::getSyclObjImpl(MContext)->getHandleRef(),
UR_MEM_FLAG_READ_WRITE | UR_MEM_FLAG_ALLOC_COPY_HOST_POINTER,
MSpecConstsBlob.size(), &MSpecConstsBuffer, &Properties);
} catch (std::exception &e) {
__SYCL_REPORT_EXCEPTION_TO_STREAM(
"exception in get_spec_const_buffer_ref", e);
bso-intel marked this conversation as resolved.
Show resolved Hide resolved
}
}
return MSpecConstsBuffer;
}
Expand Down
13 changes: 10 additions & 3 deletions sycl/source/detail/global_handler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -355,8 +355,14 @@ void shutdown_late() {
extern "C" __SYCL_EXPORT BOOL WINAPI DllMain(HINSTANCE hinstDLL,
DWORD fdwReason,
LPVOID lpReserved) {
bool PrintUrTrace =
sycl::detail::ur::trace(sycl::detail::ur::TraceLevel::TRACE_CALLS);
bool PrintUrTrace = false;
try {
bso-intel marked this conversation as resolved.
Show resolved Hide resolved
PrintUrTrace =
sycl::detail::ur::trace(sycl::detail::ur::TraceLevel::TRACE_CALLS);
} catch (std::exception &e) {
__SYCL_REPORT_EXCEPTION_TO_STREAM("exception in DllMain", e);
return FALSE;
}

// Perform actions based on the reason for calling.
switch (fdwReason) {
Expand All @@ -367,7 +373,8 @@ extern "C" __SYCL_EXPORT BOOL WINAPI DllMain(HINSTANCE hinstDLL,
#ifdef XPTI_ENABLE_INSTRUMENTATION
if (xptiTraceEnabled())
return TRUE; // When doing xpti tracing, we can't safely call shutdown.
// TODO: figure out what XPTI is doing that prevents release.
// TODO: figure out what XPTI is doing that prevents
// release.
#endif

shutdown_win();
Expand Down
7 changes: 4 additions & 3 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1618,16 +1618,17 @@ void modifiable_command_graph::end_recording() {

void modifiable_command_graph::end_recording(queue &RecordingQueue) {
auto QueueImpl = sycl::detail::getSyclObjImpl(RecordingQueue);
if (QueueImpl && QueueImpl->getCommandGraph() == impl) {
if (!QueueImpl)
return;
if (QueueImpl->getCommandGraph() == impl) {
QueueImpl->setCommandGraph(nullptr);
graph_impl::WriteLock Lock(impl->MMutex);
impl->removeQueue(QueueImpl);
}
if (QueueImpl->getCommandGraph() != nullptr) {
if (QueueImpl->getCommandGraph() != nullptr)
throw sycl::exception(sycl::make_error_code(errc::invalid),
"end_recording called for a queue which is recording "
"to a different graph.");
}
}

void modifiable_command_graph::end_recording(
Expand Down
7 changes: 6 additions & 1 deletion sycl/source/detail/image_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -238,7 +238,12 @@ class image_impl final : public SYCLMemObjT {

// Returns the total number of elements in the image
size_t get_count() const { return size(); }
size_t size() const noexcept { return MRange.size(); }
size_t size() const noexcept try {
return MRange.size();
} catch (std::exception &e) {
__SYCL_REPORT_EXCEPTION_TO_STREAM("exception in size", e);
std::abort();
}
Comment on lines +241 to +246
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes like this look weird to me and I think that we may need some spec changes here to avoid them. Tagging @gmlueck here for awareness/feedback.

The only meaningful exception I would expect to see from range::size() is something like overflow_error, but at the same time we expect this type to be used on device (as a return type of various nd_item/group-related APIs) and exceptions are not supported in there.

Changing definition of the method between host and device compilation seems hacky, so I assume that the SYCL spec can only have one definition. And I think that that definition should be noexcept.

Reason behind our implementation throwing from range::size():

  • range inherits from detail::common_array to store data
  • range::size uses detail::common_array::operator[] to access that data
  • detail::common_array::operator[] contains a call to check_dimensions which throws if index is out of range [0..2], but does that only on host

I would also explore if we should turn runtime check in detail::common_array::check_dimensions into an assert. Even though values we pass there could pass directly from user (via id::operator[] for example), the SYCL spec is not clear what happens if we use an invalid dimension here. KhronosGroup/SYCL-Docs#551 is related, but there is no conclusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here seems unnecessary. Even without this change, if an exception is thrown in the body of size, the application will terminate. That's the semantics of the noexcept keyword. See cppreference:

Non-throwing functions are permitted to call potentially-throwing functions. Whenever an exception is thrown and the search for a handler encounters the outermost block of a non-throwing function, the function std::terminate is called

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here seems unnecessary. Even without this change, if an exception is thrown in the body of size, the application will terminate.

My understanding is that one of our code coverage tools complains about throw in noexcept function and we are unable to ignore that by some reason. It is probably a good notification for us to review associated code to see if we can actually make it noexcept, but otherwise I don't think that its the end of the world. All those catch block should almost always be unreachable as I understand it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've submitted KhronosGroup/SYCL-Docs#626 to see if we want to mark all methods of range/id/other similar classes as noexcept in the SYCL spec so that we can clean up our code here.


void *allocateMem(ContextImplPtr Context, bool InitFromUserData,
void *HostPtr, ur_event_handle_t &OutEventToWait) override;
Expand Down
7 changes: 6 additions & 1 deletion sycl/source/event.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,12 @@ event::get_profiling_info() const {

#undef __SYCL_PARAM_TRAITS_SPEC

backend event::get_backend() const noexcept { return getImplBackend(impl); }
backend event::get_backend() const noexcept try {
return getImplBackend(impl);
} catch (std::exception &e) {
__SYCL_REPORT_EXCEPTION_TO_STREAM("exception in get_backend", e);
std::abort();
}

ur_native_handle_t event::getNative() const { return impl->getNative(); }

Expand Down
Loading