
Failed to launch kernels (error code invalid argument) #293

Closed

MehmetMHY opened this issue May 18, 2024 · 3 comments

Comments

@MehmetMHY

Describe the bug
I ran into a bug while trying to use Bend. When running Bend in "cuda mode" I get the following error:

Error reading result from hvm. Output :
Failed to launch kernels (error code invalid argument)!
exit status: 1

I was going to open an issue for this in the Bend repo, but based on the error message I think it's an HVM bug.

To Reproduce
Steps to reproduce the behavior:

  1. Install Bend. This is how I installed it.

  2. Grab the fib.bend example from the Bend repo HERE. Here is the code:

add = λa λb (+ a b)
fib = λx switch x {
  0: 1
  _: let p = x-1; switch p {
    0: 1
    _: (+ (fib p) (fib p-1))
  }
}
main = (fib 30)
  3. Run the following Bend command:
bend run-cu fib.bend

Expected behavior
If this error/bug did not happen, the output would be the following (and ideally it would be fast):

Result: 1346269

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.6 LTS x86_64
  • CPU: AMD Ryzen 5 3600 (12) @ 3.600GHz
  • GPU: NVIDIA GeForce RTX 2070 SUPER
  • CUDA Version: 12.3, V12.3.52

Additional context
n/a

@janschiefer

Just added a little debug helper function and manually compiled with:

bend gen-cu test.bend > test.cu 
nvcc -I/usr/local/cuda/include -L/usr/local/cuda/lib -O0 -g test.cu -o test_cuda
./test_cuda

#include <iostream>

#define CHECK_CUDA_ERROR(val) check((val), #val, __FILE__, __LINE__)
void check(cudaError_t err, const char* const func, const char* const file,
           const int line)
{
    if (err != cudaSuccess)
    {
        std::cerr << "CUDA Runtime Error at: " << file << ":" << line
                  << std::endl;
        std::cerr << cudaGetErrorString(err) << " " << func << std::endl;
        // We don't exit when we encounter CUDA errors in this example.
        // std::exit(EXIT_FAILURE);
    }
}
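
In the generated test.cu, the failing call was presumably wrapped with this helper, roughly like so (a sketch inferred from the output below; the exact surrounding code in test.cu is not reproduced here):

CHECK_CUDA_ERROR(cudaFuncSetAttribute(evaluator, cudaFuncAttributeMaxDynamicSharedMemorySize, sizeof(LNet)));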

Output:

CUDA Runtime Error at: test.cu:2597
invalid argument cudaFuncSetAttribute(evaluator, cudaFuncAttributeMaxDynamicSharedMemorySize, sizeof(LNet))
Failed to launch kernels (error code invalid argument)!

cudaFuncSetAttribute seems to choke on evaluator or sizeof(LNet)

@FreezePhoenix

On some devices, it seems that sizeof(LNet) exceeds the value of cudaDevAttrMaxSharedMemoryPerBlockOptin.

For my device, that maximum is 65536 bytes, versus the 98304 bytes required for LNet.
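
To check the limit on your own GPU, a minimal standalone query sketch (not part of the generated test.cu; compile with nvcc) could look like this:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device = 0;
    int max_optin = 0;
    // Maximum shared memory per block a kernel can opt in to, in bytes.
    cudaDeviceGetAttribute(&max_optin, cudaDevAttrMaxSharedMemoryPerBlockOptin, device);
    printf("cudaDevAttrMaxSharedMemoryPerBlockOptin: %d bytes\n", max_optin);
    return 0;
}

If this prints 65536 (or less) while sizeof(LNet) is 98304, the cudaFuncSetAttribute call above will fail with invalid argument.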

@kings177
Member

This problem stems from the fact that your GPU does not support >=96KB of shared memory per block, which is what is currently hardcoded in HVM. We plan to release dynamic memory allocation soon; for now, to fix this, please reduce the shared memory from 96KB:

const u32 L_NODE_LEN = 0x2000;
const u32 L_VARS_LEN = 0x2000;

to 48KB:

const u32 L_NODE_LEN = 0x1000;
const u32 L_VARS_LEN = 0x1000;

For your GPU (the 2070), which has Compute Capability 7.5, the max shared memory is 64KB, so I guess you could use 0x1500.
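
As a rough sanity check on those numbers, assuming LNet is essentially L_NODE_LEN 8-byte node pairs plus L_VARS_LEN 4-byte variable ports (an assumption about HVM's layout, but it matches the 98304 bytes reported above for the 0x2000 defaults):

#include <cstdio>

int main() {
    // Approximate sizeof(LNet) for a few candidate lengths, assuming
    // 8 bytes per node pair and 4 bytes per variable port.
    const unsigned lens[] = {0x2000, 0x1000, 0x1500};
    for (unsigned len : lens) {
        unsigned bytes = len * 8 + len * 4;
        printf("L_NODE_LEN = L_VARS_LEN = 0x%X -> %u bytes (%.1f KB)\n",
               len, bytes, bytes / 1024.0);
    }
    return 0;
}

This prints 98304 bytes (96 KB) for 0x2000, 49152 bytes (48 KB) for 0x1000, and 64512 bytes (63 KB) for 0x1500, which stays just under the 64KB opt-in limit.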

Closing this since it's a duplicate of #283.

kings177 closed this as not planned (duplicate) on May 24, 2024