Automatize the reuse of the CUDA stream of an input product #305
This PR addresses #278. The basic idea is that the first `CUDAScopedContext` constructor reading a `CUDAProduct<T>` object will re-use the CUDA stream of that product, and all the subsequent ones will create a new CUDA stream.

As an example, let's say we have the following modules/products
The product `A` is produced in one CUDA stream. One of `B`, `C`, or `D` will re-use the same CUDA stream; the other two will create new ones. `E` will use the same CUDA stream as `D`.

I believe this approach is the best we can do in a simple way using only local information.
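
To illustrate the reuse rule described above, here is a minimal sketch that runs without a GPU. It is not the code in this PR: `Stream`, `makeStream`, `Product`, `ScopedContext`, and `claimStream` are hypothetical stand-ins for `cudaStream_t`, `CUDAProduct<T>`, and `CUDAScopedContext`, and the use of an atomic flag on the product is an assumption about one way "first reader wins" could be decided using only local information.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for cudaStream_t so the sketch compiles without CUDA (assumption).
struct Stream { int id; };

inline std::shared_ptr<Stream> makeStream() {
  static std::atomic<int> next{0};
  return std::make_shared<Stream>(Stream{next++});
}

// Hypothetical analogue of CUDAProduct<T>: remembers the stream it was
// produced in, plus a flag saying whether that stream is still unclaimed.
class Product {
public:
  explicit Product(std::shared_ptr<Stream> s) : stream_(std::move(s)) {}

  // Returns true for exactly one caller, even with concurrent consumers.
  bool claimStream() const { return mayReuse_.exchange(false); }
  const std::shared_ptr<Stream>& stream() const { return stream_; }

private:
  std::shared_ptr<Stream> stream_;
  mutable std::atomic<bool> mayReuse_{true};
};

// Hypothetical analogue of CUDAScopedContext.
class ScopedContext {
public:
  // A module with no input product gets a fresh stream.
  ScopedContext() : stream_(makeStream()) {}

  // The first context constructed from a product re-uses its stream;
  // every later one creates a new stream.
  explicit ScopedContext(const Product& in)
      : stream_(in.claimStream() ? in.stream() : makeStream()) {}

  const std::shared_ptr<Stream>& stream() const { return stream_; }

private:
  std::shared_ptr<Stream> stream_;
};

// Walks through the A..E example from the description.
inline void exampleGraph() {
  ScopedContext ctxA;
  Product a(ctxA.stream());

  ScopedContext ctxB(a), ctxC(a), ctxD(a);   // three consumers of A
  assert(ctxB.stream() == a.stream());       // first consumer re-uses
  assert(ctxC.stream() != a.stream());       // later consumers get new streams
  assert(ctxD.stream() != a.stream());

  Product d(ctxD.stream());
  ScopedContext ctxE(d);                     // sole consumer of D
  assert(ctxE.stream() == ctxD.stream());
}
```

In this sequential sketch `B` always wins because its context is constructed first; with modules running concurrently, whichever of `B`, `C`, or `D` reaches the constructor first would claim the stream of `A`.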