Open issues regarding the ECAL local reconstruction on GPU #32480
Update the ECAL ESProducts and move them to a more correct place
Make the same changes to the conditions that were made for HCAL in #32039. Move the conditions types used on the GPU from
and the corresponding Update them to use
|
Migrate GPU code to use common constants and functions
The functions in RecoLocalCalo/EcalRecProducers/plugins/Common.h are only used by the GPU implementation. They could be reused from DataFormats/EcalDigi/interface/EcalMGPASample.h. |
A new Issue was created by @fwyzard Andrea Bocci. @Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign reconstruction |
assign heterogeneous |
@thomreis FYI |
Add references for magic numbers
The original CPU code should be updated (for example, moving those numbers to a central place and adding comments to explain their source) and the GPU code should then be updated accordingly. |
Taking note of a few more specific todos:
|
Hi @fwyzard is the change of the types and to the cms::cuda functions supposed to work for all occurrences? For some (like the ones in EcalGainRatiosGPU.h) this compiles fine but for others (e.g. in EcalMultifitParametersGPU.h) I get compilation errors like this one:
and also this:
|
Is |
No it is |
The second error also suggests |
I think the current code is this:
struct Product {
  ~Product();
  double *amplitudeFitParametersEB, *amplitudeFitParametersEE, *timeFitParametersEB, *timeFitParametersEE;
};
and (just a fragment):
// malloc
cudaCheck(cudaMalloc((void**)&product.amplitudeFitParametersEB,
                     this->amplitudeFitParametersEB_.size() * sizeof(double)));
// transfer
cudaCheck(cudaMemcpyAsync(product.amplitudeFitParametersEB,
                          this->amplitudeFitParametersEB_.data(),
                          this->amplitudeFitParametersEB_.size() * sizeof(double),
                          cudaMemcpyHostToDevice,
                          cudaStream));
I think it should become something like
struct Product {
  edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> amplitudeFitParametersEB;
  edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> amplitudeFitParametersEE;
  edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> timeFitParametersEB;
  edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> timeFitParametersEE;
};
and:
// malloc
product.amplitudeFitParametersEB = cms::cuda::make_device_unique<double[]>(amplitudeFitParametersEB_.size(), cudaStream);
// transfer
cms::cuda::copyAsync(product.amplitudeFitParametersEB, amplitudeFitParametersEB_, cudaStream);
but I've been typing this in here without actually testing, so double-check everything! |
My branch is here: https://github.com/thomreis/cmssw/tree/ecal-local-reco-gpu-fix-issues/CondFormats/EcalObjects
In EcalMultifitParametersGPU.h:
In EcalMultifitParametersGPU.cc:
|
I've tried making these changes on top of CMSSW_11_3_0_pre3:
diff --git a/RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h b/RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h
index 56aa0579ff77..a6c0b1c81aa2 100644
--- a/RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h
+++ b/RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h
@@ -4,6 +4,8 @@
#include <array>
#include "FWCore/ParameterSet/interface/ParameterSet.h"
+#include "FWCore/Utilities/interface/propagate_const_array.h"
+#include "HeterogeneousCore/CUDAUtilities/interface/device_unique_ptr.h"
#ifndef __CUDACC__
#include "HeterogeneousCore/CUDAUtilities/interface/HostAllocator.h"
@@ -13,8 +15,10 @@
class EcalMultifitParametersGPU {
public:
struct Product {
- ~Product();
- double *amplitudeFitParametersEB, *amplitudeFitParametersEE, *timeFitParametersEB, *timeFitParametersEE;
+ edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> amplitudeFitParametersEB;
+ edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> amplitudeFitParametersEE;
+ edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> timeFitParametersEB;
+ edm::propagate_const_array<cms::cuda::device::unique_ptr<double[]>> timeFitParametersEE;
};
#ifndef __CUDACC__
@@ -29,8 +33,10 @@ public:
}
private:
- std::vector<double, cms::cuda::HostAllocator<double>> amplitudeFitParametersEB_, amplitudeFitParametersEE_,
- timeFitParametersEB_, timeFitParametersEE_;
+ std::vector<double, cms::cuda::HostAllocator<double>> amplitudeFitParametersEB_;
+ std::vector<double, cms::cuda::HostAllocator<double>> amplitudeFitParametersEE_;
+ std::vector<double, cms::cuda::HostAllocator<double>> timeFitParametersEB_;
+ std::vector<double, cms::cuda::HostAllocator<double>> timeFitParametersEE_;
cms::cuda::ESProduct<Product> product_;
#endif // __CUDACC__
diff --git a/RecoLocalCalo/EcalRecAlgos/src/EcalMultifitParametersGPU.cc b/RecoLocalCalo/EcalRecAlgos/src/EcalMultifitParametersGPU.cc
index 010da6444b61..149ba92ff170 100644
--- a/RecoLocalCalo/EcalRecAlgos/src/EcalMultifitParametersGPU.cc
+++ b/RecoLocalCalo/EcalRecAlgos/src/EcalMultifitParametersGPU.cc
@@ -1,7 +1,6 @@
-#include "RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h"
-
#include "FWCore/Utilities/interface/typelookup.h"
-#include "HeterogeneousCore/CUDAUtilities/interface/cudaCheck.h"
+#include "HeterogeneousCore/CUDAUtilities/interface/copyAsync.h"
+#include "RecoLocalCalo/EcalRecAlgos/interface/EcalMultifitParametersGPU.h"
EcalMultifitParametersGPU::EcalMultifitParametersGPU(edm::ParameterSet const& ps) {
auto const& amplitudeFitParametersEB = ps.getParameter<std::vector<double>>("EBamplitudeFitParameters");
@@ -20,45 +19,20 @@ EcalMultifitParametersGPU::EcalMultifitParametersGPU(edm::ParameterSet const& ps
std::copy(timeFitParametersEE.begin(), timeFitParametersEE.end(), timeFitParametersEE_.begin());
}
-EcalMultifitParametersGPU::Product::~Product() {
- cudaCheck(cudaFree(amplitudeFitParametersEB));
- cudaCheck(cudaFree(amplitudeFitParametersEE));
- cudaCheck(cudaFree(timeFitParametersEB));
- cudaCheck(cudaFree(timeFitParametersEE));
-}
-
EcalMultifitParametersGPU::Product const& EcalMultifitParametersGPU::getProduct(cudaStream_t cudaStream) const {
auto const& product = product_.dataForCurrentDeviceAsync(
cudaStream, [this](EcalMultifitParametersGPU::Product& product, cudaStream_t cudaStream) {
- // malloc
- cudaCheck(cudaMalloc((void**)&product.amplitudeFitParametersEB,
- this->amplitudeFitParametersEB_.size() * sizeof(double)));
- cudaCheck(cudaMalloc((void**)&product.amplitudeFitParametersEE,
- this->amplitudeFitParametersEE_.size() * sizeof(double)));
- cudaCheck(cudaMalloc((void**)&product.timeFitParametersEB, this->timeFitParametersEB_.size() * sizeof(double)));
- cudaCheck(cudaMalloc((void**)&product.timeFitParametersEE, this->timeFitParametersEE_.size() * sizeof(double)));
+ // allocate GPU memory
+ product.amplitudeFitParametersEB = cms::cuda::make_device_unique<double[]>(amplitudeFitParametersEB_.size(), cudaStream);
+ product.amplitudeFitParametersEE = cms::cuda::make_device_unique<double[]>(amplitudeFitParametersEE_.size(), cudaStream);
+ product.timeFitParametersEB = cms::cuda::make_device_unique<double[]>(timeFitParametersEB_.size(), cudaStream);
+ product.timeFitParametersEE = cms::cuda::make_device_unique<double[]>(timeFitParametersEE_.size(), cudaStream);
// transfer
- cudaCheck(cudaMemcpyAsync(product.amplitudeFitParametersEB,
- this->amplitudeFitParametersEB_.data(),
- this->amplitudeFitParametersEB_.size() * sizeof(double),
- cudaMemcpyHostToDevice,
- cudaStream));
- cudaCheck(cudaMemcpyAsync(product.amplitudeFitParametersEE,
- this->amplitudeFitParametersEE_.data(),
- this->amplitudeFitParametersEE_.size() * sizeof(double),
- cudaMemcpyHostToDevice,
- cudaStream));
- cudaCheck(cudaMemcpyAsync(product.timeFitParametersEB,
- this->timeFitParametersEB_.data(),
- this->timeFitParametersEB_.size() * sizeof(double),
- cudaMemcpyHostToDevice,
- cudaStream));
- cudaCheck(cudaMemcpyAsync(product.timeFitParametersEE,
- this->timeFitParametersEE_.data(),
- this->timeFitParametersEE_.size() * sizeof(double),
- cudaMemcpyHostToDevice,
- cudaStream));
+ cms::cuda::copyAsync(product.amplitudeFitParametersEB, amplitudeFitParametersEB_, cudaStream);
+ cms::cuda::copyAsync(product.amplitudeFitParametersEE, amplitudeFitParametersEE_, cudaStream);
+ cms::cuda::copyAsync(product.timeFitParametersEB, timeFitParametersEB_, cudaStream);
+ cms::cuda::copyAsync(product.timeFitParametersEE, timeFitParametersEE_, cudaStream);
});
return product;
}
and they seem to build fine. |
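The ownership change discussed above (dropping the hand-written `~Product()` in favour of owning smart pointers) can be illustrated with a minimal host-only sketch. Everything here is an assumption made for illustration: `FreeDeleter`, `buffer_ptr`, `make_buffer` and `fill` are hypothetical names, and `std::malloc`/`std::free` stand in for the device allocation that `cms::cuda::make_device_unique` performs in CMSSW. It is not the CMSSW implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

// Hypothetical deleter: a host-only stand-in for a deleter that would release GPU memory.
struct FreeDeleter {
  void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical alias playing the role of a device unique_ptr to an array.
template <typename T>
using buffer_ptr = std::unique_ptr<T[], FreeDeleter>;

// Hypothetical helper playing the role of a "make unique buffer" function.
template <typename T>
buffer_ptr<T> make_buffer(std::size_t n) {
  return buffer_ptr<T>(static_cast<T*>(std::malloc(n * sizeof(T))));
}

struct Product {
  // No user-declared destructor: each member releases its own buffer.
  buffer_ptr<double> amplitudeFitParametersEB;
  buffer_ptr<double> timeFitParametersEB;
};

// Allocate the buffers and copy the host values into them.
Product fill(std::vector<double> const& amplitude, std::vector<double> const& time) {
  Product product;
  product.amplitudeFitParametersEB = make_buffer<double>(amplitude.size());
  product.timeFitParametersEB = make_buffer<double>(time.size());
  std::copy(amplitude.begin(), amplitude.end(), product.amplitudeFitParametersEB.get());
  std::copy(time.begin(), time.end(), product.timeFitParametersEB.get());
  return product;
}
```

Because the members own their memory, `Product` needs no destructor and cannot leak or double-free a buffer, which is the same reason the `cudaFree` calls disappear in the diff.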
Thanks for confirming this. The only difference to my version that I can see is that I have moved the headers to |
A silly thing: is it possible you are still including the old header from |
Bingo! At least for two of the files with issues. I still get some errors on other files. Does the recipe also work for classes that use |
No, it only works for Can you change the type to |
I do not know why it was implemented like that for those classes. Maybe @amassiro does? If it can be changed to |
I don't remember, it could be that at the beginning it was the only option.
|
I do not remember if there was any reason for this (if not documented, prolly no), but just change to use that allocator accordingly (have to make copies, etc...) |
The issue seems to lie here:
CMSSW_11_3_0_pre3/src/CondFormats/EcalObjects/src/EcalIntercalibConstantsGPU.cc:7:70: error: no matching function for call to 'std::vector<float, cms::cuda::HostAllocator<float> >::vector(<brace-enclosed initializer list>)'
7 | : valuesEB_{values.barrelItems()}, valuesEE_{values.endcapItems()} {}
|
Types are different, w/ and w/o host allocator
|
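The "types are different" remark can be reproduced outside CMSSW. A `std::vector` with a non-default allocator is a distinct type from `std::vector<float>`, so it cannot be copy-constructed from one, but it can be built from an iterator range. The sketch below assumes the source container is a plain `std::vector<float>`; `TaggedAllocator`, `PinnedLikeVector` and `copyFrom` are hypothetical names, with `TaggedAllocator` only standing in for `cms::cuda::HostAllocator`.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator: behaves like the default one, but makes the vector a distinct type.
template <typename T>
struct TaggedAllocator {
  using value_type = T;
  TaggedAllocator() = default;
  template <typename U>
  TaggedAllocator(TaggedAllocator<U> const&) noexcept {}
  T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); }
  void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(TaggedAllocator<T> const&, TaggedAllocator<U> const&) noexcept {
  return true;
}
template <typename T, typename U>
bool operator!=(TaggedAllocator<T> const&, TaggedAllocator<U> const&) noexcept {
  return false;
}

using PinnedLikeVector = std::vector<float, TaggedAllocator<float>>;

// PinnedLikeVector bad{source};  // would not compile: no constructor takes a std::vector<float>

// The range constructor copies element by element and works across allocator types.
PinnedLikeVector copyFrom(std::vector<float> const& source) {
  return PinnedLikeVector(source.begin(), source.end());
}
```

In a constructor initializer list the same idea would read `valuesEB_(source.begin(), source.end())`, with parentheses rather than braces so that the range constructor is selected.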
Yes - that is on purpose.
If there are no downsides, to me it seems better to change the @makortel what do you think ? |
I agree. The only downside I can think of is an additional copy of the data, but I think the benefit of being able to copy asynchronously outweighs that. |
Is there a way to make this compile:
with
and
The current compile error is
Or is there need for separate |
No.
If you want to use a single |
Thanks. @amassiro @vkhristenko is this the preferred version? Having a single vector on the GPU? |
For me it's the same, let's decide one strategy and propagate the same format everywhere. |
I have seen both. E.g. |
👍 |
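For reference, one way the "single vector" option could look on the host side is sketched below. It is only an illustration under assumptions: the barrel and endcap values are taken to be two `std::vector<float>`, and `Concatenated` and `concatenate` are hypothetical names that do not come from the thread or from CMSSW.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout: barrel entries first, then endcap entries, in one buffer.
struct Concatenated {
  std::vector<float> values;
  std::size_t offset = 0;  // index of the first endcap entry
};

// Build the single buffer and remember where the endcap part starts.
Concatenated concatenate(std::vector<float> const& barrel, std::vector<float> const& endcap) {
  Concatenated out;
  out.values.reserve(barrel.size() + endcap.size());
  out.values.insert(out.values.end(), barrel.begin(), barrel.end());
  out.values.insert(out.values.end(), endcap.begin(), endcap.end());
  out.offset = barrel.size();
  return out;
}
```

With this layout a single allocation and a single copy cover both detector parts, at the cost of carrying the offset alongside the buffer; keeping two separate buffers avoids the offset but doubles the allocations and copies.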