Failed on running Sudoku example with CUDA #9

Open
bywbilly opened this issue Oct 30, 2020 · 8 comments

Comments

@bywbilly

Thanks for this project, it is awesome! I installed satnet through pip, but when I run the visual-sudoku example, this error occurs:
[image: screenshot of the error]

@xflash96
Member

It seems that the CUDA module is not compiled. What version of PyTorch are you using?
SATNet currently only supports pytorch==1.1.0. The CPP API changed in a later version, and I haven't fixed it yet.

@JellePiepenbrock

JellePiepenbrock commented Nov 5, 2020

This error occurs for me too. I'm running PyTorch 1.1.0.

EDIT: Is there any specific version of CUDA perhaps that is needed?

@JellePiepenbrock

The following piece of code is causing the CUDA extension to not be compiled:

In my case, CUDA_HOME is indeed None, so this piece is skipped:

if torch.cuda.is_available() and CUDA_HOME is not None:
    extension = CUDAExtension(
        name = 'satnet._cuda',
        include_dirs = ['./src'],
        sources = [
            'src/satnet.cpp',
            'src/satnet_cuda.cu',
        ],
        extra_compile_args = {
            'cxx': ['-DMIX_USE_GPU', '-g'],
            'nvcc': ['-g', '-restrict', '-maxrregcount', '32', '-lineinfo', '-Xptxas=-v']
        }
    )
    ext_modules.append(extension)

Now, I think this is because the conda / pip version of cudatoolkit is not the entire toolkit, only the parts needed for standard use of PyTorch/TF. The extra compile arguments for nvcc for example will cause an error too, because nvcc is not in the Conda version of cudatoolkit.
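A quick way to check this on a given machine is sketched below. It is a minimal diagnostic, not part of SATNet: the function names `cuda_build_diagnostics` and `would_build_cuda_extension` are made up for illustration, and the second one only mirrors the condition in the snippet above. It assumes `CUDA_HOME` can be imported from `torch.utils.cpp_extension`, and it still runs (with less detail) if PyTorch is not installed.

```python
import os
import shutil


def would_build_cuda_extension(cuda_available, cuda_home):
    """Mirror of the setup.py condition quoted above (hypothetical helper)."""
    return bool(cuda_available) and cuda_home is not None


def cuda_build_diagnostics():
    """Collect the facts that decide whether the CUDA extension gets compiled."""
    report = {
        # Full path to nvcc, or None if the compiler is not on PATH.
        "nvcc_on_path": shutil.which("nvcc"),
        # Environment variable that can point at a CUDA install.
        "CUDA_HOME_env": os.environ.get("CUDA_HOME"),
    }
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        # PyTorch is missing, so the torch-specific checks are skipped.
        report["torch_version"] = None
        return report

    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    report["torch_CUDA_HOME"] = CUDA_HOME
    report["would_build"] = would_build_cuda_extension(
        report["cuda_available"], CUDA_HOME
    )
    return report


if __name__ == "__main__":
    for key, value in cuda_build_diagnostics().items():
        print(f"{key}: {value}")
```

If `nvcc_on_path` and `torch_CUDA_HOME` both come back as `None` while `cuda_available` is `True`, that would match the situation described here: the runtime libraries are present but the compiler toolchain is not.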

@xflash96 , could you confirm you are not using a conda or pip (or similar) install of cudatoolkit?

@bywbilly
Author

bywbilly commented Nov 6, 2020

Thanks for the info. I am using PyTorch 1.1.0. I noticed one interesting thing: running on Colab (with one GPU) is slower than running on my server (64 cores). Is this normal?

@JellePiepenbrock

Could you describe what you did to get your installation working?

Regarding the speed; I think Colab only gives you 2 CPU cores, so that could slow things down quite a bit.
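To compare the two machines, a small sketch like the following reports how many CPU cores the current process can actually use. The helper name `available_cpus` is hypothetical; it prefers `os.sched_getaffinity`, which exists only on some platforms such as Linux, and falls back to `os.cpu_count()` elsewhere.

```python
import os


def available_cpus():
    """Return the number of CPU cores usable by this process."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU limits imposed on the process (Linux and similar).
        return len(os.sched_getaffinity(0))
    # Fallback: total logical cores, or 1 if the count is unknown.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("usable CPU cores:", available_cpus())
```

Running this both on Colab and on the server would show whether the core count alone could plausibly explain the difference.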

@bywbilly
Author

bywbilly commented Nov 6, 2020

> Could you describe what you did to get your installation working?
>
> Regarding the speed; I think Colab only gives you 2 CPU cores, so that could slow things down quite a bit.

I didn't get it working on my server with CUDA. Since my instances are not that large, I used the CPU for training and testing.

@xflash96
Member

xflash96 commented Nov 6, 2020 via email

@xflash96
Member

Confirmed. NVCC is required for custom CUDA extensions, and the "maxrregcount" flag is also needed for it to work on Colab.
NVCC can be installed via $conda install -c conda-forge cudatoolkit-dev
I've added the instructions to the README.md.
(BTW, I've also updated the APIs to match pytorch==1.7.0.)
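After installing NVCC and reinstalling the package, one way to confirm that the extension was actually compiled is to look for the `satnet._cuda` module named in the setup.py snippet quoted earlier in this thread. The sketch below is only an illustration: `has_module` is a hypothetical helper, and it assumes the compiled extension is importable under that name.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located by the import system."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # which raises ImportError if the parent is missing or broken.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        return False


if __name__ == "__main__":
    if has_module("satnet._cuda"):
        print("satnet CUDA extension found")
    else:
        print("satnet CUDA extension not found; it may not have been compiled")
```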
