[BUG] Cascaded algorithm correctness depends on compiler flag #107

Open
PuPuHX opened this issue Aug 16, 2024 · 3 comments
Labels: bug, inactive-30d

Comments

PuPuHX commented Aug 16, 2024

Describe the bug
On branch-2.2, compiling the Cascaded algorithm with the -G flag (device debug, which disables device code optimizations) changes how static shared memory is aligned, which causes misalignment errors.

Steps/Code to reproduce bug

  1. Modify CMakeLists.txt:
set(CMAKE_CUDA_FLAGS_DEBUG "${CMAKE_CUDA_FLAGS_DEBUG};-g")
set(CMAKE_CUDA_FLAGS_DEBUG "${CMAKE_CUDA_FLAGS_DEBUG};-G")
  2. Run tests/test_cascaded.cpp with the following test case:
TEST_CASE("comp/decomp cascaded-small-uint64", "[nvcomp][small]")

To compile successfully with these flags, I reduced default_chunk_size from 4096 to 2048.

Expected behavior
The test should pass without errors.

Environment details (please complete the following information):

  • Environment location:

  • Ubuntu-22.04

  • Driver Version: 555.99

  • CUDA Version: 12.5

  • NVIDIA GeForce RTX 3080

  • Method of nvCOMP install: branch-2.2 source code

Additional context

After debugging, I explicitly declared alignment for shared memory allocation in the following files:

CascadedHlifKernels.cu:122

__shared__ __align__(sizeof(data_type)) uint8_t shmem[shmem_size];

CascadedKernels.cuh:801

__shared__ __align__(sizeof(data_type)) uint32_t chunk_metadata[max_chunk_metadata_size / sizeof(uint32_t)];

After making these changes, all tests in test_cascaded.cpp passed. I believe this dependency on compiler optimization for correctness is a bug.
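For readers who want to see the pattern in isolation, here is a minimal sketch of the same idea outside nvCOMP. It is not nvCOMP code: the kernel stage_through_shared and the host helper roundtrip_ok are hypothetical names made up for illustration, and the 2048-byte buffer size is an arbitrary choice. The sketch declares a uint8_t array in static shared memory, reinterprets it as data_type, and uses __align__ so that the typed accesses do not rely on where the compiler happens to place the array.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical demo kernel (not nvCOMP code): stage values through a byte
// buffer in static shared memory and read them back as data_type.
template <typename data_type, int shmem_size>
__global__ void stage_through_shared(const data_type* in, data_type* out, int n)
{
  // A plain uint8_t array is only guaranteed 1-byte alignment. The explicit
  // __align__ makes the reinterpret_cast below safe regardless of how the
  // compiler lays out shared memory.
  __shared__ __align__(sizeof(data_type)) uint8_t shmem[shmem_size];
  data_type* typed = reinterpret_cast<data_type*>(shmem);

  const int capacity = shmem_size / static_cast<int>(sizeof(data_type));
  const int count = n < capacity ? n : capacity;
  const int i = threadIdx.x;

  if (i < count) {
    typed[i] = in[i];
  }
  __syncthreads();  // every thread reaches this barrier
  if (i < count) {
    out[i] = typed[count - 1 - i];  // read an element written by another thread
  }
}

// Hypothetical host helper: returns true if the kernel finished without a
// CUDA error and the output is the input in reverse order.
bool roundtrip_ok()
{
  constexpr int n = 64;
  uint64_t h_in[n];
  uint64_t h_out[n];
  for (int i = 0; i < n; ++i) {
    h_in[i] = 1000u + static_cast<uint64_t>(i);
    h_out[i] = 0;
  }

  uint64_t* d_in = nullptr;
  uint64_t* d_out = nullptr;
  if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) {
    return false;
  }
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
    cudaFree(d_in);
    return false;
  }
  cudaError_t err = cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

  if (err == cudaSuccess) {
    stage_through_shared<uint64_t, 2048><<<1, n>>>(d_in, d_out, n);
    err = cudaGetLastError();         // launch configuration errors
  }
  if (err == cudaSuccess) {
    err = cudaDeviceSynchronize();    // execution errors such as misaligned address
  }
  if (err == cudaSuccess) {
    err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  }
  cudaFree(d_in);
  cudaFree(d_out);
  if (err != cudaSuccess) {
    return false;
  }

  for (int i = 0; i < n; ++i) {
    if (h_out[i] != h_in[n - 1 - i]) {
      return false;
    }
  }
  return true;
}
```

Calling roundtrip_ok() from a small host main and building once with -G and once without is one way to check that the result does not depend on the optimization level. Whether removing __align__ actually triggers a fault depends on the compiler's shared memory layout, so this sketch shows the fix, not a guaranteed reproduction of the failure.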

PuPuHX added the "? - Needs Triage" and "bug" labels on Aug 16, 2024

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

ndickson-nvidia (Contributor) commented

Thanks for the bug report. As far as I can tell, this bug was most likely fixed in nvCOMP 2.6.0 or 2.6.1, in January 2023.

github-actions bot commented Dec 4, 2024

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
