Failed to launch kernels (error code invalid argument) #293
Comments
Just added a little debug helper function and manually compiled with:
Output:
cudaFuncSetAttribute seems to choke on evaluator or sizeof(LNet)
For some devices, it seems that For my device, I have a maximum value of this of 65536, versus the required size of 98304 for
This problem stems from the fact that your GPU does not support >=96KB of shared mem per block, which is what is currently hardcoded in HVM. We plan to release dynamic mem allocation soon. For now, to fix this, please change the shared mem from 96KB:
to 48KB:
For your GPU (2070), since it has Compute Capability 7.5, you have 64KB max shared mem, so I guess you could use 0x1500. Closing this since it is a duplicate of #283.
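To make the sizes in this thread concrete, here is a minimal C++ sketch of the check that fails. It is not HVM code: the constant names and `fits_shared_mem` are hypothetical, and the numbers are simply the ones quoted above (96KB requested, 64KB device limit, 48KB workaround). On a real machine the device limit would presumably come from a CUDA query such as `cudaDeviceGetAttribute` with `cudaDevAttrMaxSharedMemoryPerBlockOptin`, which this sketch does not call.

```cpp
#include <cstddef>

// Sizes quoted in this thread, in bytes (names are hypothetical, not from HVM).
constexpr std::size_t kRequiredBytes  = 96 * 1024; // 98304, the hardcoded request
constexpr std::size_t kDeviceMaxBytes = 64 * 1024; // 65536, limit reported for the 2070
constexpr std::size_t kFallbackBytes  = 48 * 1024; // 49152, the suggested workaround

// True if a kernel asking for `requested` bytes of shared memory per block
// stays within the device's per-block limit.
constexpr bool fits_shared_mem(std::size_t requested, std::size_t device_max) {
  return requested <= device_max;
}

// Compile-time checks using the thread's numbers.
static_assert(!fits_shared_mem(kRequiredBytes, kDeviceMaxBytes),
              "96KB exceeds a 64KB per-block limit");
static_assert(fits_shared_mem(kFallbackBytes, kDeviceMaxBytes),
              "48KB fits within a 64KB per-block limit");
```

If the request is larger than the limit, as with 98304 versus 65536 here, it would be consistent with the "invalid argument" error in the title; lowering the request to 48KB brings it back under the limit.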
Describe the bug
I ran into a bug while trying to use Bend. While trying to run Bend in "cuda mode" I get the following error:
I was gonna make an issue for this in Bend but, based on the error message, I think it's a HVM bug.
To Reproduce
Steps to reproduce the behavior:
Install Bend. This is how I installed it.
Grab the fib.bend example from the Bend repo HERE. Here is the code:
Expected behavior
If this error/bug did not happen, the output would be this (ideally fast):
Desktop (please complete the following information):
Additional context
n/a