Prior to PR #109, CombinatorialSpaces contained two copies of a kernel for each of its three wedge products: one for CPU and another for CUDA. In PR #109, these six kernel instances were reduced to two by merging them behind a single kernel abstraction: namely, `@kernel` from KernelAbstractions.jl.
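To illustrate the pattern (a hedged sketch, not the actual CombinatorialSpaces kernels): with KernelAbstractions.jl, a single `@kernel` definition can replace both a hand-written CPU loop and a CUDA.jl kernel, because the backend is chosen at call time from the array type.

```julia
using KernelAbstractions

# Illustrative only: one kernel definition that runs on any
# KernelAbstractions backend. The real wedge-product kernels
# in CombinatorialSpaces are more involved than this.
@kernel function scale_add!(y, a, @Const(x))
    i = @index(Global)
    y[i] = a * x[i] + y[i]
end

x = rand(Float32, 1024)
y = zeros(Float32, 1024)

# get_backend returns CPU() for Arrays, CUDABackend() for CuArrays,
# etc., so the same code path serves every supported backend.
backend = get_backend(y)
scale_add!(backend)(y, 2f0, x; ndrange = length(y))
KernelAbstractions.synchronize(backend)
```

Swapping backends then amounts to constructing the input arrays on a different device; no kernel code changes.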
The main qualities that inspired this original wedge product PR were:
1. Increasing maintainability (by de-duplicating code), and
2. Supporting more backends.
Subsequent technical discussions, however, made some additional qualities apparent:
3. More easily handling threading,
4. Performance improvements relating to the above,
5. Abstracting away logic related to looping over indices,
6. Further abstractions that can now bubble up the function-call hierarchy,
7. Performance improvements that kernel fusion can allow, and
8. Downstream code can now more easily swap between backends.
(Among various others.)
So, we should continue kernelizing our binary operators (such as the interior product and Lie derivative) and unary operators (which are currently computed as sparse matrix-vector multiplications, but could be made more efficient by defining their own kernels).
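As a sketch of what a dedicated unary-operator kernel might look like (assuming a CSR layout with hypothetical `rowptr`/`colval`/`nzval` arrays; this is not code from the package):

```julia
using KernelAbstractions

# Hypothetical sketch: one work-item per output row of a sparse
# operator stored in CSR form, rather than a generic SpMV call.
@kernel function csr_mul!(y, @Const(rowptr), @Const(colval), @Const(nzval), @Const(x))
    i = @index(Global)
    acc = zero(eltype(y))
    for k in rowptr[i]:(rowptr[i + 1] - 1)
        acc += nzval[k] * x[colval[k]]
    end
    y[i] = acc
end
```

Owning the kernel, rather than calling out to a library SpMV, is what would make the kernel-fusion gains listed above reachable for these operators.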
As an aside, PR #109 did not break the API around the wedge product, but a follow-up PR should perform a refactoring that is less coupled to the "old way" of caching the wedge product. Further, we should continue to examine whether having explicit extensions for different backends is necessary, or whether a more streamlined process is possible.