Running cuSift through NVIDIA's cuda-memcheck reveals an out-of-bounds memory access. I have double-checked that my older version of ExtractSiftDescriptors is identical to the renamed one on master.
safeCall() Runtime API error in file <../../cuSift/cudaSiftH.cu>, line 273 : unspecified launch failure.
Test::canDetectImageFeatures() safeCall() Runtime API error in file <../../cuSift/cudaSiftH.cu>, line 273 : unspecified launch failure.
========= Invalid shared read of size 4
========= at 0x00000128 in d:cuda\include/device_functions.hpp:1562:INTERNAL_44_tmpxft_0000932c_00000000_8_cudaSiftH_cpp1_ii_35e564fe::atomicAdd(float, float)
========= by thread (14,7,0) in block (15,0,0)
========= Address 0x00000444 is out of bounds
========= Device Frame:D:\work\cusift/cudaSiftD.cu:177:ExtractSiftDescriptors(_int64, SiftPoint, int, float) (ExtractSiftDescriptors(__int64, SiftPoint*, int, float) : 0x6598)
========= Saved host backtrace up to driver entry point at kernel launch time
The out-of-bounds accesses seem to happen only in the "Upper right" and "Lower right" branches of the extract function:
    if (tx<=14)
    {
      float grad1 = horf*grad;
      if (y>=2)
      { // Upper right
        float grad2 = iverf*grad1;
        atomicAdd(buffer + p1 + 8, iangf*grad2); // out of bounds can happen here
        atomicAdd(buffer + p2 + 8, angf*grad2);  // out of bounds can happen here
      }
      if (y<=13)
      { // Lower right
        float grad2 = verf*grad1;
        atomicAdd(buffer + p1 + 40, iangf*grad2); // out of bounds can happen here
        atomicAdd(buffer + p2 + 40, angf*grad2);  // out of bounds can happen here
      }
    }
A quick workaround is to double the size of the shared buffer to 256 or to add an out-of-bounds check before each write. I cannot determine whether there is a bug in the calculation of p1 or p2.
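For reference, the out-of-bounds check could look roughly like the sketch below. It is not code from cuSift: `BUFFER_SIZE`, `inBounds` and `guardedAtomicAdd` are hypothetical names, and the size of 128 floats is only inferred from "double the size of the shared buffer to 256". The index test is written so that it also compiles as plain C++ and can be checked on the host.

```cpp
// Sketch only: BUFFER_SIZE, inBounds and guardedAtomicAdd are illustrative
// names, not identifiers from cuSift.
#ifndef __CUDACC__
// Allow the index check to compile with a plain C++ compiler as well.
#define __host__
#define __device__
#endif

// Assumed current size of the shared buffer, in floats (half of 256).
constexpr int BUFFER_SIZE = 128;

// True when idx addresses a valid element of a buffer holding `size` floats.
__host__ __device__ inline bool inBounds(int idx, int size = BUFFER_SIZE)
{
  return idx >= 0 && idx < size;
}

#ifdef __CUDACC__
// Skips the write instead of touching shared memory outside the buffer.
__device__ inline void guardedAtomicAdd(float *buffer, int idx, float val)
{
  if (inBounds(idx))
    atomicAdd(buffer + idx, val);
}
#endif
```

With this in place, a call such as `atomicAdd(buffer + p1 + 8, iangf*grad2)` would become `guardedAtomicAdd(buffer, p1 + 8, iangf*grad2)`. Note that this only hides the symptom: the skipped contributions are silently dropped, so if p1 or p2 is computed incorrectly the descriptors will still be wrong.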