Out of bounds memory access in cuSIFT_D.cu ExtractSiftDescriptors_D function #46

Open
adickin opened this issue Jul 6, 2016 · 0 comments

adickin commented Jul 6, 2016

Running cuSift through NVIDIA's cuda-memcheck reveals an out-of-bounds memory access. I have double-checked that my older version of ExtractSiftDescriptors is identical to the renamed one on master.

safeCall() Runtime API error in file <../../cuSift/cudaSiftH.cu>, line 273 : unspecified launch failure.
Test::canDetectImageFeatures() safeCall() Runtime API error in file <../../cuSift/cudaSiftH.cu>, line 273 : unspecified launch failure.

========= Invalid shared read of size 4
========= at 0x00000128 in d:cuda\include/device_functions.hpp:1562:INTERNAL_44_tmpxft_0000932c_00000000_8_cudaSiftH_cpp1_ii_35e564fe::atomicAdd(float, float)
========= by thread (14,7,0) in block (15,0,0)
========= Address 0x00000444 is out of bounds
========= Device Frame:D:\work\cusift/cudaSiftD.cu:177:ExtractSiftDescriptors(_int64, SiftPoint, int, float) (ExtractSiftDescriptors(__int64, SiftPoint*, int, float) : 0x6598)
========= Saved host backtrace up to driver entry point at kernel launch time

The out-of-bounds accesses seem to happen only in the "Upper right" and "Lower right" branches of the extract function:

if (tx<=14)
{
  float grad1 = horf*grad;
  if (y>=2)
  { // Upper right
    float grad2 = iverf*grad1;
    atomicAdd(buffer + p1 + 8, iangf*grad2); // out of bounds can happen here
    atomicAdd(buffer + p2 + 8, angf*grad2);  // out of bounds can happen here
  }
  if (y<=13)
  { // Lower right
    float grad2 = verf*grad1;
    atomicAdd(buffer + p1 + 40, iangf*grad2); // out of bounds can happen here
    atomicAdd(buffer + p2 + 40, angf*grad2);  // out of bounds can happen here
  }
}

A quick workaround is either to double the size of the shared buffer to 256 or to add an out-of-bounds check before each atomicAdd. I cannot determine whether there is a bug in the calculation of p1 or p2.
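
For reference, the bounds-check variant of the workaround could look roughly like the sketch below. It is a toy kernel, not the cuSift code: `checkedAtomicAdd`, `checkedAddDemo`, `runCheckedAddDemo` and `kBufferLen` are hypothetical names, and the buffer length of 128 floats is an assumption inferred from "double the size of the shared buffer to 256".

```cuda
#include <cuda_runtime.h>

// Assumption: the original shared buffer holds 128 floats (inferred from the
// "double ... to 256" workaround; not verified against the cuSift source).
constexpr int kBufferLen = 128;

// Hypothetical helper, not part of cuSift: performs the atomic add only when
// idx lies inside [0, kBufferLen), and silently skips it otherwise.
__device__ inline void checkedAtomicAdd(float *buffer, int idx, float value)
{
  if (idx >= 0 && idx < kBufferLen)
    atomicAdd(buffer + idx, value);
}

// Toy kernel: thread t adds 1.0f at index t + 40, mimicking the "+ 40" offset
// of the "Lower right" branch. Indices 128 and above are skipped.
__global__ void checkedAddDemo(float *out)
{
  __shared__ float buffer[kBufferLen];
  for (int i = threadIdx.x; i < kBufferLen; i += blockDim.x)
    buffer[i] = 0.0f;
  __syncthreads();

  checkedAtomicAdd(buffer, threadIdx.x + 40, 1.0f);
  __syncthreads();

  for (int i = threadIdx.x; i < kBufferLen; i += blockDim.x)
    out[i] = buffer[i];
}

// Host driver: launches one block of kBufferLen threads and returns the sum
// of the copied-back buffer, or -1.0f if a CUDA call fails.
float runCheckedAddDemo()
{
  float *d_out = nullptr;
  float h_out[kBufferLen];
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return -1.0f;

  checkedAddDemo<<<1, kBufferLen>>>(d_out);
  // cudaMemcpy waits for the kernel and also reports a failed launch.
  cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  if (err != cudaSuccess) return -1.0f;

  float sum = 0.0f;
  for (int i = 0; i < kBufferLen; ++i) sum += h_out[i];
  return sum;
}
```

In this toy setup only threads 0 to 87 land inside the buffer, so `runCheckedAddDemo()` should return 88 and cuda-memcheck should stay quiet. Note that a check like this drops the out-of-range contributions rather than explaining them, so if p1 or p2 is actually miscalculated the descriptor values would still be wrong.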
