Now that support is in place for rank 1 through rank 4 arrays (fp32 and fp64), it's time to look into supporting GPU acceleration for function evaluation.
To support portability between NVIDIA and AMD GPUs, I'm thinking of using AMD's HIP. Because of the current status of ROCm, with spotty support for Windows and no support for macOS, this build feature will need to be optional. Additionally, because some users may be on systems that have only the CUDA toolkit installed and not ROCm, we'll need to use preprocessing to map the procedures for GPU memory management to either their CUDA or HIP counterparts.
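As a rough illustration of the preprocessing idea, here is a minimal sketch of a mapping header. The macros `HAVE_HIP` and `HAVE_CUDA` and the `gpu*` names are hypothetical (nothing in the build defines them yet), and the host fallback exists only so the sketch compiles without either toolkit. The `hipMalloc`/`hipMemcpy`/`hipFree` and `cudaMalloc`/`cudaMemcpy`/`cudaFree` calls are the standard runtime entry points.

```cpp
// Sketch only: HAVE_HIP / HAVE_CUDA are hypothetical macros the build system
// would define; the gpu* names are illustrative, not an existing API.
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

#if defined(HAVE_HIP)
  #include <hip/hip_runtime.h>
  #define gpuMalloc(ptr, n)         hipMalloc((ptr), (n))
  #define gpuFree(ptr)              hipFree(ptr)
  #define gpuMemcpyH2D(dst, src, n) hipMemcpy((dst), (src), (n), hipMemcpyHostToDevice)
  #define gpuMemcpyD2H(dst, src, n) hipMemcpy((dst), (src), (n), hipMemcpyDeviceToHost)
#elif defined(HAVE_CUDA)
  #include <cuda_runtime.h>
  #define gpuMalloc(ptr, n)         cudaMalloc((ptr), (n))
  #define gpuFree(ptr)              cudaFree(ptr)
  #define gpuMemcpyH2D(dst, src, n) cudaMemcpy((dst), (src), (n), cudaMemcpyHostToDevice)
  #define gpuMemcpyD2H(dst, src, n) cudaMemcpy((dst), (src), (n), cudaMemcpyDeviceToHost)
#else
  // Host fallback (assumption): plain heap memory stands in for device memory.
  // Each function returns 0 on success, mirroring hipSuccess / cudaSuccess.
  inline int gpuMalloc(void** ptr, std::size_t n) {
    *ptr = std::malloc(n);
    return *ptr != nullptr ? 0 : 1;
  }
  inline int gpuFree(void* ptr) { std::free(ptr); return 0; }
  inline int gpuMemcpyH2D(void* dst, const void* src, std::size_t n) {
    std::memcpy(dst, src, n);
    return 0;
  }
  inline int gpuMemcpyD2H(void* dst, const void* src, std::size_t n) {
    std::memcpy(dst, src, n);
    return 0;
  }
#endif

// Copies n doubles to the "device" buffer and back again.
// Returns true when allocation and both copies report status 0.
inline bool roundTrip(const double* in, double* out, std::size_t n) {
  assert(in != nullptr && out != nullptr);
  void* d = nullptr;
  const std::size_t bytes = n * sizeof(double);
  if (gpuMalloc(&d, bytes) != 0) return false;
  const bool ok = gpuMemcpyH2D(d, in, bytes) == 0 &&
                  gpuMemcpyD2H(out, d, bytes) == 0;
  (void)gpuFree(d);
  return ok;
}
```

Calling code would only ever use the `gpu*` names, so the choice of backend stays a build-time decision. Whether the real mapping lives in a C/C++ header like this or in preprocessed Fortran interfaces is still open.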
To do list
Build system
[] Add option for enabling HIP
[] Add option for enabling CUDA
[] Add CUDA and HIP build options to spack package
[] Build with HIP support using fpm?
[] Build with CUDA support using fpm?
Compute Kernels
We will need the following element-wise functions/operations defined as HIP kernels that operate on device pointers to 32-bit and 64-bit data:
[] c = a+b
[] c = a-b
[] c = a*b
[] c = a/b
[] c = a^s (s is a scalar)
[] c = \abs(a)
[] c = \cos(a)
[] c = \sin(a)
[] c = \tan(a)
[] c = \acos(a)
[] c = \asin(a)
[] c = \atan(a)
[] c = \sinh(a)
[] c = \cosh(a)
[] c = \tanh(a)
[] c = \sqrt(a)
[] c = \ln(a) (natural logarithm)
[] c = \log(a) (log base-10)
[] c = -a (sign flip)
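One possible shape for these kernels, sketched under assumptions: a single templated kernel per arity, instantiated for `float` and `double`, with each operation supplied as a small functor. All names here (`AddOp`, `NegOp`, `PowScalarOp`, `binaryKernel`, `unaryKernel`, `binaryHost`, `unaryHost`) are hypothetical. The device kernels are guarded so the block also compiles as plain C++, where the host loops serve as a reference for checking results. Use of `std::pow` in device code is assumed to be accepted by the HIP and CUDA compilers.

```cpp
// Sketch only: all names below are illustrative, not part of the library.
#include <cassert>
#include <cmath>
#include <cstddef>

#if defined(__HIPCC__)
  #include <hip/hip_runtime.h>
#endif
#if defined(__HIPCC__) || defined(__CUDACC__)
  #define HOST_DEVICE __host__ __device__
#else
  #define HOST_DEVICE
#endif

// c = a + b
struct AddOp {
  template <typename T>
  HOST_DEVICE T operator()(T a, T b) const { return a + b; }
};

// c = -a (sign flip)
struct NegOp {
  template <typename T>
  HOST_DEVICE T operator()(T a) const { return -a; }
};

// c = a^s, where s is a scalar stored in the functor
template <typename T>
struct PowScalarOp {
  T s;
  HOST_DEVICE T operator()(T a) const { return std::pow(a, s); }
};

#if defined(__HIPCC__) || defined(__CUDACC__)
// One thread per element; a, b, c are device pointers.
template <typename T, typename Op>
__global__ void binaryKernel(const T* a, const T* b, T* c, std::size_t n, Op op) {
  const std::size_t i =
      blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
  if (i < n) c[i] = op(a[i], b[i]);
}

template <typename T, typename Op>
__global__ void unaryKernel(const T* a, T* c, std::size_t n, Op op) {
  const std::size_t i =
      blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
  if (i < n) c[i] = op(a[i]);
}
#endif

// Host reference loops that apply the same functors element by element.
template <typename T, typename Op>
void binaryHost(const T* a, const T* b, T* c, std::size_t n, Op op) {
  for (std::size_t i = 0; i < n; ++i) c[i] = op(a[i], b[i]);
}

template <typename T, typename Op>
void unaryHost(const T* a, T* c, std::size_t n, Op op) {
  for (std::size_t i = 0; i < n; ++i) c[i] = op(a[i]);
}
```

With this layout, each remaining entry in the list (the trigonometric, hyperbolic, square-root and logarithm functions) would be one more small functor rather than a new kernel. Hand-written kernels per operation would work just as well if templates prove awkward to bind from Fortran.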