Kernels read keys and values out of bounds #4
Comments
Thank you for reporting this. I'll file a ticket internally and we'll get it fixed in the next release.
I just realized this was reported on the old Parallel Sort sample. A fix will be pushed with the next version of the FidelityFX SDK (https://github.com/GPUOpen-LibrariesAndSDKs/FidelityFX-SDK), which is how we are pushing out most updates to our older features now. Also, to keep the GPU code as fast as possible, the fix will likely be a check on the NumKeys value at CPU time, with an error code returned in the data setup stage.
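For anyone who needs a stopgap before that release, here is a minimal sketch of what a CPU-side check could look like. It is not the SDK's actual fix. The names `ValidateSortSetup`, `SortSetupResult` and `PaddedKeyCount` are hypothetical, the thread group size of 128 is an assumed stand-in for PARALLELSORT_THREADGROUP_SIZE, and the check assumes the kernels may touch every element of the last partially filled block.

```cpp
#include <cstdint>

// Assumed stand-ins for the shader constants; verify against your copy of the header.
constexpr std::uint32_t kThreadgroupSize = 128;  // PARALLELSORT_THREADGROUP_SIZE (assumed value)
constexpr std::uint32_t kElementsPerThread = 4;  // the "* 4" from the issue
constexpr std::uint32_t kBlockSize = kThreadgroupSize * kElementsPerThread;

// Hypothetical error codes returned during data setup.
enum class SortSetupResult { Ok, ErrorNoKeys, ErrorBufferTooSmall };

// Rounds numKeys up to the next multiple of the block size.
// Uses 64-bit math so the rounding cannot overflow for large key counts.
constexpr std::uint64_t PaddedKeyCount(std::uint32_t numKeys) {
    return (static_cast<std::uint64_t>(numKeys) + kBlockSize - 1) / kBlockSize * kBlockSize;
}

// Rejects a sort whose buffers are too small to cover the padded key count,
// so the unguarded GPU loads never leave the allocation.
SortSetupResult ValidateSortSetup(std::uint32_t numKeys, std::uint64_t bufferCapacityInKeys) {
    if (numKeys == 0) {
        return SortSetupResult::ErrorNoKeys;
    }
    if (bufferCapacityInKeys < PaddedKeyCount(numKeys)) {
        return SortSetupResult::ErrorBufferTooSmall;
    }
    return SortSetupResult::Ok;
}
```

With a check like this the shaders stay untouched, and the caller either pads the allocation or gets an error before anything is dispatched.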
Fwiw this sort implementation has been very helpful. Small feature request: it would be great to also have a separate, dedicated prefix sum/scan and a parallel device selection/compaction. Both of these are used internally by the radix sorter, but would also be helpful standalone.
I'll add it to the list of planned improvements to existing samples. Cheers!
Hello,
I've recently discovered that if the number of elements in the key/value buffers used in the sort is not a multiple of PARALLELSORT_THREADGROUP_SIZE * 4, the buffers are read out of bounds and undefined behavior can occur.
See these lines below:
FidelityFX-ParallelSort/ffx-parallelsort/FFX_ParallelSort.h
Lines 133 to 137 in 0c53994
No bounds checks are applied to the SrcBuffer reads here, which causes GPU instability when the reads go out of bounds.
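To give a sense of the scale of the over-read, here is a rough CPU-side sketch that counts how many indices one thread group would touch past the end of the data. It assumes a thread group size of 128 and a strided access pattern of `blockBase + thread + i * threadgroupSize`. Both are assumptions for illustration rather than a copy of the shader, and `CountOutOfBoundsReads` is a hypothetical helper.

```cpp
#include <cstdint>

// Assumed stand-ins for the shader constants.
constexpr std::uint32_t kThreadgroupSize = 128;  // PARALLELSORT_THREADGROUP_SIZE (assumed value)
constexpr std::uint32_t kElementsPerThread = 4;  // the "* 4" from the issue

// Emulates the unguarded loads of one thread group for the block starting at
// blockBase and counts how many of the touched indices are >= numKeys.
std::uint32_t CountOutOfBoundsReads(std::uint32_t blockBase, std::uint32_t numKeys) {
    std::uint32_t outOfBounds = 0;
    for (std::uint32_t thread = 0; thread < kThreadgroupSize; ++thread) {
        for (std::uint32_t i = 0; i < kElementsPerThread; ++i) {
            // Assumed strided pattern: each thread reads one element per stride.
            const std::uint32_t index = blockBase + thread + i * kThreadgroupSize;
            if (index >= numKeys) {
                ++outOfBounds;
            }
        }
    }
    return outOfBounds;
}
```

Under these assumptions, sorting 100 keys means 412 of the 512 loads in the first block fall outside the data.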
Also an issue here:
FidelityFX-ParallelSort/ffx-parallelsort/FFX_ParallelSort.h
Lines 348 to 360 in 0c53994
Later on, the number of keys is checked, but by that point it's too late:
FidelityFX-ParallelSort/ffx-parallelsort/FFX_ParallelSort.h
Lines 369 to 371 in 0c53994
I suspect the fix would be to just check the number of keys before pre-loading the key/value pairs.
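A sketch of that idea, written as CPU-side C++ that emulates one thread's pre-load rather than as the actual shader code. The function `LoadKeysGuarded`, the padding value `kPaddingKey`, the thread group size of 128 and the strided index pattern are all assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Assumed stand-ins for the shader constants.
constexpr std::uint32_t kThreadgroupSize = 128;       // PARALLELSORT_THREADGROUP_SIZE (assumed value)
constexpr std::uint32_t kElementsPerThread = 4;       // the "* 4" from the issue
constexpr std::uint32_t kPaddingKey = 0xFFFFFFFFu;    // hypothetical filler for skipped loads

// Emulates one thread pre-loading its keys, checking each index against
// numKeys before touching the source buffer.
std::array<std::uint32_t, kElementsPerThread> LoadKeysGuarded(
    const std::vector<std::uint32_t>& src,
    std::uint32_t numKeys,
    std::uint32_t blockBase,
    std::uint32_t thread) {
    std::array<std::uint32_t, kElementsPerThread> keys{};
    for (std::uint32_t i = 0; i < kElementsPerThread; ++i) {
        const std::uint32_t index = blockBase + thread + i * kThreadgroupSize;
        // at() throws on an out-of-range index, so a broken guard would be caught here.
        keys[i] = (index < numKeys) ? src.at(index) : kPaddingKey;
    }
    return keys;
}
```

The padding value is only a placeholder for the lanes that skip the load. Whether those lanes are later ignored depends on the existing key-count check mentioned above.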
Reproducing is simple enough: run the sort on a data set smaller than PARALLELSORT_THREADGROUP_SIZE * 4 elements, with GPU-assisted validation enabled so that out-of-bounds descriptor reads are reported.