NVBLAS #23

Open
maleadt opened this issue Feb 16, 2021 · 5 comments

Comments

@maleadt
Contributor

maleadt commented Feb 16, 2021

Might be interesting to experiment with NVBLAS: https://docs.nvidia.com/cuda/nvblas/index.html

The NVBLAS Library is a GPU-accelerated library that implements BLAS (Basic Linear Algebra Subprograms). It can accelerate most BLAS Level-3 routines by dynamically routing BLAS calls to one or more NVIDIA GPUs present in the system, when the characteristics of the call make it speed up on a GPU.

Part of CUDA_jll: https://github.com/JuliaBinaryWrappers/CUDA_jll.jl/blob/44445f650547dd14db177336e488460e56d4f354/src/wrappers/x86_64-linux-gnu.jl#L164-L168

@ViralBShah
Collaborator

@staticfloat I suppose this is going to be the same as MKL: forwarding 64_-suffixed BLAS functions to the non-suffixed ones.

@maleadt Are there any NVBLAS-specific init/threading APIs that need calling? Those will need to be added here, like we did for MKL in #19.

@maleadt
Contributor Author

maleadt commented Feb 17, 2021

No specific APIs to call. One problem is that this BLAS only supports a limited number of functions, and itself forwards to another BLAS (configurable via environment variables and a configuration file):

000000000000bfb0 g    DF .text  0000000000000282  libnvblas.so.11 chemm_
000000000000ca10 g    DF .text  0000000000000282  libnvblas.so.11 csyr2k_
0000000000009670 g    DF .text  00000000000002bd  libnvblas.so.11 cgemm_
00000000000090f0 g    DF .text  00000000000002bd  libnvblas.so.11 sgemm_
000000000000cf70 g    DF .text  0000000000000282  libnvblas.so.11 cher2k_
000000000000afb0 g    DF .text  000000000000029c  libnvblas.so.11 ctrsm_
000000000000aa70 g    DF .text  000000000000029c  libnvblas.so.11 strsm_
000000000000a320 g    DF .text  0000000000000250  libnvblas.so.11 zsyrk_
0000000000009e80 g    DF .text  0000000000000250  libnvblas.so.11 dsyrk_
000000000000c240 g    DF .text  0000000000000282  libnvblas.so.11 zhemm_
0000000000009930 g    DF .text  00000000000002bd  libnvblas.so.11 zgemm_
000000000000c4f0 g    DF .text  0000000000000282  libnvblas.so.11 ssyr2k_
000000000000a5b0 g    DF .text  0000000000000250  libnvblas.so.11 cherk_
00000000000093b0 g    DF .text  00000000000002bd  libnvblas.so.11 dgemm_
000000000000b250 g    DF .text  000000000000029c  libnvblas.so.11 ztrsm_
000000000000ad10 g    DF .text  000000000000029c  libnvblas.so.11 dtrsm_
000000000000b530 g    DF .text  0000000000000282  libnvblas.so.11 ssymm_
000000000000ba50 g    DF .text  0000000000000282  libnvblas.so.11 csymm_
000000000000da10 g    DF .text  00000000000002ac  libnvblas.so.11 ctrmm_
000000000000d4b0 g    DF .text  00000000000002ac  libnvblas.so.11 strmm_
000000000000c780 g    DF .text  0000000000000282  libnvblas.so.11 dsyr2k_
000000000000a800 g    DF .text  0000000000000250  libnvblas.so.11 zherk_
000000000000bce0 g    DF .text  0000000000000282  libnvblas.so.11 zsymm_
000000000000dcc0 g    DF .text  00000000000002ac  libnvblas.so.11 ztrmm_
000000000000b7c0 g    DF .text  0000000000000282  libnvblas.so.11 dsymm_
000000000000d760 g    DF .text  00000000000002ac  libnvblas.so.11 dtrmm_
000000000000a0d0 g    DF .text  0000000000000250  libnvblas.so.11 csyrk_
000000000000cca0 g    DF .text  0000000000000282  libnvblas.so.11 zsyr2k_
0000000000009c30 g    DF .text  0000000000000250  libnvblas.so.11 ssyrk_
000000000000d200 g    DF .text  0000000000000282  libnvblas.so.11 zher2k_

This breaks autodetection. Adding one of these symbols to the list works for suffix detection, but that doesn't scale for interface detection.

[NVBLAS] NVBLAS_CONFIG_FILE environment variable is NOT set : relying on default config filename 'nvblas.conf'
[NVBLAS] Cannot open default config file 'nvblas.conf'
[NVBLAS] Config parsed
[NVBLAS] CPU Blas library need to be provided
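
To illustrate why the limited export list trips up suffix detection, here is a minimal Python sketch. It is not libblastrampoline's actual code: the probe symbol (`isamax_`), the suffix candidates, and the function name `detect_suffix` are assumptions chosen for illustration. Only the exported names are taken from the listing above.

```python
# Hypothetical probe list and suffix candidates (assumptions, not LBT's real ones).
PROBE_SYMBOLS = ["isamax_"]
SUFFIXES = ["", "64_"]


def detect_suffix(exported, probes=PROBE_SYMBOLS, suffixes=SUFFIXES):
    """Return the first suffix for which a probe symbol is exported, else None."""
    for suffix in suffixes:
        for name in probes:
            if name + suffix in exported:
                return suffix
    return None


# A subset of the symbols libnvblas.so.11 exports, per the listing above.
nvblas_exports = {"sgemm_", "dgemm_", "cgemm_", "zgemm_", "strsm_", "dtrsm_"}

# The Level-1 probe is not exported, so detection finds nothing.
print(repr(detect_suffix(nvblas_exports)))

# Adding a Level-3 symbol to the probe list makes suffix detection succeed.
print(repr(detect_suffix(nvblas_exports, probes=["isamax_", "dgemm_"])))
```

The second call only succeeds because a symbol NVBLAS happens to export was added to the probes. Interface detection would need a similar special case, which is the part that doesn't scale.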

@ViralBShah
Collaborator

We can make nvblas.conf or the environment variable point to the Julia-provided OpenBLAS.
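
A rough sketch of what that could look like, written in Python only to keep it self-contained. `NVBLAS_CPU_BLAS_LIB`, `NVBLAS_GPU_LIST`, and `NVBLAS_CONFIG_FILE` are the keywords from the NVBLAS documentation. The OpenBLAS path is a placeholder, and the helper name `write_nvblas_conf` is made up for this example.

```python
import os
import tempfile
from pathlib import Path


def write_nvblas_conf(cpu_blas_lib, conf_path):
    """Write a minimal NVBLAS config naming the CPU BLAS fallback; return its absolute path."""
    conf = Path(conf_path).resolve()
    conf.write_text(
        "# CPU BLAS that NVBLAS forwards to for calls it does not offload\n"
        f"NVBLAS_CPU_BLAS_LIB {cpu_blas_lib}\n"
        "# Use every GPU visible to the process\n"
        "NVBLAS_GPU_LIST ALL\n"
    )
    return conf


# Placeholder path (assumption): substitute the real location of the OpenBLAS library.
cpu_blas = "/path/to/libopenblas.so"

# Write the config into a scratch directory and point NVBLAS at it.
conf = write_nvblas_conf(cpu_blas, Path(tempfile.mkdtemp()) / "nvblas.conf")
os.environ["NVBLAS_CONFIG_FILE"] = str(conf)

print(conf.read_text())
```

The environment variable has to be set before libnvblas is loaded, since the log above shows the config being read at initialization. Whether the Julia-provided OpenBLAS exports the symbol names NVBLAS expects is not something this sketch checks.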

@ViralBShah
Collaborator

@maleadt - Ideally something like this is what we need to try out NVBLAS: https://github.com/JuliaLinearAlgebra/MKL.jl/blob/master/src/MKL.jl#L38

Of course, we'll then find things that don't quite work and perhaps LBT may need to be taught about NVBLAS. I suppose CUDA_jll does not include LAPACK.

@maleadt
Contributor Author

maleadt commented Mar 4, 2021

> I suppose CUDA_jll does not include LAPACK.

Not a drop-in version like NVBLAS at least.
