[BUG] Cascaded algorithm correctness depends on compiler flag #107

PuPuHX · 2024-08-16T07:54:03Z

Describe the bug
On branch-2.2, compiling the Cascaded algorithm with the -G flag changes the static shared memory alignment behavior, causing misalignment errors.

Steps/Code to reproduce bug

Modify CMakeLists.txt:

Comment out line CMakeLists.txt:33

set(CMAKE_CUDA_FLAGS_DEBUG "${CMAKE_CUDA_FLAGS_DEBUG};-g")

Uncomment line CMakeLists.txt:32

set(CMAKE_CUDA_FLAGS_DEBUG "${CMAKE_CUDA_FLAGS_DEBUG};-G")

Run tests/test_cascaded.cpp with the following test case:

TEST_CASE("comp/decomp cascaded-small-uint64", "[nvcomp][small]")

To compile successfully, I reduced default_chunk_size from 4096 to 2048.

Expected behavior
The test should pass without errors.

Environment details (please complete the following information):

Environment location:
Ubuntu-22.04
Driver Version: 555.99
CUDA Version: 12.5
NVIDIA GeForce RTX 3080
Method of nvCOMP install: branch-2.2 source code

Additional context

After debugging, I explicitly declared alignment for shared memory allocation in the following files:

CascadedHlifKernels.cu:122

__shared__ __align__(sizeof(data_type)) uint8_t shmem[shmem_size];

CascadedKernels.cuh:801

__shared__ __align__(sizeof(data_type)) uint32_t chunk_metadata[max_chunk_metadata_size / sizeof(uint32_t)];

After making these changes, all tests in test_cascaded.cpp passed. I believe this dependency on compiler optimization for correctness is a bug.
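For readers who want to see the underlying rule in isolation, here is a minimal host-side C++ sketch. It is not nvCOMP code. It assumes data_type is uint64_t, and the struct names and the is_aligned_for helper are hypothetical. A uint8_t array only guarantees 1-byte alignment, so reinterpreting it as a wider type is safe only if the compiler happens to place it on a suitable boundary. An explicit alignment specifier (alignas here, __align__ in CUDA) removes that dependence on placement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for illustration: stands in for the kernel's data_type.
using data_type = std::uint64_t;

// Hypothetical helper: true when p satisfies T's alignment requirement.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// A plain byte array carries no more than 1-byte alignment.
static_assert(alignof(std::uint8_t[64]) == 1,
              "byte arrays are only 1-byte aligned by default");

// Without a specifier, the buffer may start at any byte offset.
struct Unaligned {
    std::uint8_t pad;
    std::uint8_t buf[64];
};

// With a specifier, the buffer is forced onto a data_type-sized boundary.
struct Aligned {
    std::uint8_t pad;
    alignas(sizeof(data_type)) std::uint8_t buf[64];
};

static_assert(offsetof(Unaligned, buf) == 1,
              "unspecified buffer lands on an odd offset here");
static_assert(offsetof(Aligned, buf) == sizeof(data_type),
              "alignas pushes the buffer to the next aligned offset");
```

In this sketch, the buffer in Unaligned starts at offset 1, so accessing it as a data_type would be misaligned, while the buffer in Aligned is always safe to access that way.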

github-actions · 2024-09-15T08:01:21Z

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

ndickson-nvidia · 2024-11-04T16:55:48Z

Thanks for the bug report. As far as I can tell, this bug was most likely fixed in nvCOMP 2.6.0 or 2.6.1, in January 2023.

github-actions · 2024-12-04T17:01:21Z

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

PuPuHX added the "? - Needs Triage" and "bug" labels Aug 16, 2024

github-actions bot added the inactive-30d label Sep 15, 2024

naveenaero removed the "? - Needs Triage" and "inactive-30d" labels Nov 4, 2024

naveenaero self-assigned this Nov 4, 2024

github-actions bot added the inactive-30d label Dec 4, 2024
