Failed on running Sudoku example with CUDA #9
Comments
It seems that the CUDA module is not compiled. What version of PyTorch are you using? |
This error occurs for me too. I'm running PyTorch 1.1.0. EDIT: Is there perhaps a specific version of CUDA that is needed? |
The following piece of code is causing the CUDA extension to not be compiled. In my case, CUDA_HOME is indeed None, so this piece is skipped:

    if torch.cuda.is_available() and CUDA_HOME is not None:
        extension = CUDAExtension(
            name = 'satnet._cuda',
            include_dirs = ['./src'],
            sources = [
                'src/satnet.cpp',
                'src/satnet_cuda.cu',
            ],
            extra_compile_args = {
                'cxx': ['-DMIX_USE_GPU', '-g'],
                'nvcc': ['-g', '-restrict', '-maxrregcount', '32', '-lineinfo', '-Xptxas=-v']
            }
        )
        ext_modules.append(extension)

Now, I think this is because the conda / pip version of cudatoolkit is not the entire toolkit, only the parts needed for standard use of PyTorch/TF. The extra compile arguments for nvcc, for example, will cause an error too, because nvcc is not in the Conda version of cudatoolkit. @xflash96, could you confirm you are not using a conda or pip (or similar) install of cudatoolkit? |
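A quick way to check whether an environment would hit the skipped branch described above is sketched below. This is only a diagnostic sketch, not part of satnet: `will_build_cuda_extension`, `locate_nvcc` and `report` are hypothetical helper names, the `bin/nvcc` layout assumes a Linux-style toolkit install, and the only PyTorch names used are `torch.cuda.is_available()` and `torch.utils.cpp_extension.CUDA_HOME`, which also appear in the quoted guard.

```python
import os
import shutil


def will_build_cuda_extension(cuda_available, cuda_home):
    """Mirror the quoted guard: both conditions must hold to build the extension."""
    return bool(cuda_available) and cuda_home is not None


def locate_nvcc(cuda_home):
    """Hypothetical helper: look for nvcc under <cuda_home>/bin, then on PATH.

    Assumes a Linux-style layout; returns None when no nvcc is found.
    """
    if cuda_home is not None:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate):
            return candidate
    return shutil.which("nvcc")


def report():
    """Print what the guard would see in the current environment."""
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    available = torch.cuda.is_available()
    print("torch.cuda.is_available():", available)
    print("CUDA_HOME:", CUDA_HOME)
    print("nvcc:", locate_nvcc(CUDA_HOME))
    print("CUDA extension would be built:",
          will_build_cuda_extension(available, CUDA_HOME))


if __name__ == "__main__":
    report()
```

If `CUDA_HOME` prints as `None` or no `nvcc` is found, the environment most likely has only the runtime libraries and not the full toolkit, which matches the situation described in the comment above.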
Thanks for the info. I am using PyTorch 1.1.0. I found one interesting thing: running on Colab (with one GPU) is slower than running on my server (64 cores). Is this normal? |
Could you describe what you did to get your installation working? Regarding the speed: I think Colab only gives you 2 CPU cores, so that could slow things down quite a bit. |
I didn't get it to work on my server with CUDA. Since my instances are not that large, I use the CPU for training and testing. |
I'm using the cudatoolkit that comes with PyTorch's official Docker image: pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel.
The "maxrregcount" argument is needed for older GPUs because NVCC may overspill registers if it is not set properly...
I'll take a look at the newer version of the toolkit to see if the argument can be removed.
On Fri, Nov 6, 2020 at 6:50 AM JellePiepenbrock wrote: (comment quoted in full above)
|
Confirmed. NVCC is required for custom CUDA extensions, and the "maxrregcount" flag is also needed to work on Colab. |
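Since the thread raises the question of whether the "maxrregcount" argument could be dropped on newer toolkits, one possible way to make it optional is sketched below. This is an illustration only, not the project's setup.py: `nvcc_flags` and its `max_registers` parameter are hypothetical names, and the default simply reproduces the flag list quoted earlier in this thread.

```python
def nvcc_flags(max_registers=32):
    """Build the nvcc argument list, optionally omitting -maxrregcount.

    Hypothetical helper: with the default of 32 it returns the same flags,
    in the same order, as the list quoted in this thread. Passing None
    leaves the register cap out entirely.
    """
    flags = ["-g", "-restrict"]
    if max_registers is not None:
        flags += ["-maxrregcount", str(max_registers)]
    flags += ["-lineinfo", "-Xptxas=-v"]
    return flags


if __name__ == "__main__":
    print(nvcc_flags())      # flags with the register cap
    print(nvcc_flags(None))  # flags without the register cap
```

The returned list could then be passed as the 'nvcc' entry of extra_compile_args. Whether the cap can safely be omitted on a given GPU is exactly the open question in the comments above, so the default keeps it.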
Thanks for this project, it is awesome! I installed satnet through pip, but when I run the visual-sudoku examples, this error occurs: