Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updates to get module maps used as ES products jointly with CMSSW #355

Merged
merged 20 commits into from
Feb 2, 2024

Conversation

slava77
Copy link
Contributor

@slava77 slava77 commented Nov 14, 2023

coupled with updates in SegmentLinking/cmssw#16

  • split somewhat heavy methods related to filling module data from Module.h to ModuleMethods
  • move LST::loadMaps to a free function to avoid making an instance of LST where it's not needed
  • add LST::loadAndFillES method to be called by an CMSSW ES producer
  • const pointers in modules SoA data

@slava77
Copy link
Contributor Author

slava77 commented Nov 17, 2023

now rebased

Copy link
Contributor

@VourMa VourMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how ready this was but I took a look and have just a small question here. More questions on the CMSSW side...

SDL/LST.cc Outdated
Comment on lines 9 to 14
void SDL::LST::eventSetup() {
static std::once_flag mapsLoaded;
std::call_once(mapsLoaded, &SDL::LST::loadMaps, this);
std::call_once(mapsLoaded, &SDL::LST::loadMaps);

TString path = get_absolute_path_after_check_file_exists(
TString::Format("%s/data/centroid_CMSSW_12_2_0_pre2.txt",TrackLooperDir_.Data()).Data());
TString::Format("%s/data/centroid_CMSSW_12_2_0_pre2.txt",trackLooperDir().Data()).Data());
static std::once_flag modulesInited;
std::call_once(modulesInited, SDL::initModules, path);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be deleted, now that the maps are loaded in the ESProducer? Or has it stayed until we know that the ESRpoducer method works and will be deleted in the future?

@slava77
Copy link
Contributor Author

slava77 commented Dec 29, 2023

@ariostas
please check the linter error report.

error: invalid argument '-std=c++17' not allowed with 'C' looks like the reason for error: cannot use 'try' with exceptions disabled

I was expecting a proposed patch file with changes (what's usually done with cmssw setup).
Instead the bad formatting comes out with comments like

File SDL/Constants.h does not conform to Custom style guidelines. (lines 14, 66)

which are not very useful/practical.
I'm not sure though if the '-std=c++17' not allowed have prevented something to complete.

@slava77
Copy link
Contributor Author

slava77 commented Dec 30, 2023

error: invalid argument '-std=c++17' not allowed with 'C' looks like the reason for error: cannot use 'try' with exceptions disabled

I got rid of it by adding --language=c++

@ariostas
Copy link
Member

ariostas commented Jan 3, 2024

Hi @slava77, the error: invalid argument '-std=c++17' error is not preventing anything from running. It's an annoying thing that I explain more in a comment on your PR #360.

Regarding the patch file, I thought that it would be the same amount of work to apply the patch file vs to just run clang-format manually, so I didn't think it was worth it. But I can implement generating a patch file, if you prefer.

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

Regarding the patch file, I thought that it would be the same amount of work to apply the patch file vs to just run clang-format manually, so I didn't think it was worth it. But I can implement generating a patch file, if you prefer.

we should at least have the commands needed to run clang-format and clang-tidy in the readme.
While clang-format seems trivial (it only needs the file name), the clang-tidy is not trivial

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

Hi @slava77, the error: invalid argument '-std=c++17' error is not preventing anything from running. It's an annoying thing that I explain more in a comment on your PR #360.

do you mean that we should accept that all linter actions will fail?

I noticed also in the logs of the linter the following:

  INFO:CPP Linter:Got 403 response from POSTing comment
  ERROR:CPP Linter:response returned 403 message: {"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/issues/comments#create-an-issue-comment"}

is it instead the reason for failing the linter action (instead of the stdc++17 thing)?

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

/run cmssw from-CMSSW_13_3_0_pre3_LST_X/modules-dev

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

/run standalone

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

I don't see any activity in the CI frame after my /run commands.
@ariostas are they blocked by the failed linter or does it take longer to react?

@ariostas
Copy link
Member

ariostas commented Jan 3, 2024

@slava77 oh sorry, that's a bug I have to fix. What happened was that I set it so that only one job runs at a time, to prevent us from accidentally running multiple iterations of the same thing. Try writing both commands on a single comment (in separate lines), and then let's check if they run.

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

The /run action actually came back to me in https://github.com/SegmentLinking/TrackLooper/actions/runs/7399973283/job/20132579034; although I still do not see it in this PR thread.

It came back with what's probably from

/run cmssw from-CMSSW_13_3_0_pre3_LST_X/modules-dev
Branch name is invalid. Ignoring...

not sure what I was supposed to pass instead to pick up SegmentLinking/cmssw#16

the errors from the CI are pretty hard to decode

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

The PR was built and ran successfully with CMSSW. Here are some plots.

@ariostas
did you fix something or did I misunderstand the previous CI reports that said the jobs were canceled or had problems?

Are the plots meant to have no reference? It would be much better to have a comparison

@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

/run standalone

@slava77 slava77 marked this pull request as ready for review January 3, 2024 23:09
@slava77
Copy link
Contributor Author

slava77 commented Jan 3, 2024

/run standalone
/run cmssw from-CMSSW_13_3_0_pre3_LST_X/modules-dev

@slava77
Copy link
Contributor Author

slava77 commented Jan 4, 2024

cmssw on 100 events look OK now
eff/fr
image

hits
image

I think that the tiny differences are from reproducibility on GPU.

@slava77
Copy link
Contributor Author

slava77 commented Jan 10, 2024

/run cmssw from-CMSSW_13_3_0_pre3_LST_X/modules-dev

@slava77
Copy link
Contributor Author

slava77 commented Jan 10, 2024

@ariostas
is there a way to rerun the linter? I tried to rerun from the last attempt interface, but it used the old commit/variant before your latest updates.

@ariostas
Copy link
Member

@slava77 the linter workflow is triggered differently and github uses the workflow file present in the current branch. So you would have to rebase, but it's probably not worth it.

@ariostas
Copy link
Member

ariostas commented Jan 10, 2024

I think the cmssw test will fail because I haven't updated the recipe to use lst_headers.xml. Now that I think about it I could have updated it without breaking the action for the current default cmssw branch. It's an easy fix so I'll do it tonight.

Copy link

There was a problem while building and running with CMSSW. The logs can be found here.

@SegmentLinking SegmentLinking deleted a comment from github-actions bot Jan 11, 2024
@SegmentLinking SegmentLinking deleted a comment from github-actions bot Jan 11, 2024
Copy link

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@@ -1,7 +1,8 @@
#include "Event.h"

SDL::modules* SDL::modulesInGPU = new SDL::modules();
SDL::modulesBuffer<Acc>* SDL::modulesBuffers = new SDL::modulesBuffer<Acc>(devAcc);
SDL::modulesBuffer<SDL::Dev>* SDL::modulesBuffers = new SDL::modulesBuffer<SDL::Dev>(devAcc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you changed Acc to SDL::Dev here, but you should do this for all of our objects to be consistent I think. I don't know if this matters since the performance plots look unchanged and this uses SDL::Dev while all the other Buffers still have Acc, but good to be consistent.

Same in all of the header files that define a buffer, to remove confusion we should move the "Acc" named templated parameter to Dev so it's clear what we pass in. See the random examples below.

miniDoubletsBuffers = new SDL::miniDoubletsBuffer<Acc>(nTotalMDs, nLowerModules, devAcc, queue);

template <typename TAcc>
struct miniDoubletsBuffer : miniDoublets {
Buf<TAcc, unsigned int> nMemoryLocations_buf;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this matters since the performance plots look unchanged and this uses SDL::Dev while all the other Buffers still have Acc, but good to be consistent.

right, Acc or Dev does not seem to matter for the buffers at the final implementation level (after all the template parsing is done); however code exposed to CMSSW needs to be templated from a Dev.

Dev seems to be more appropriate based on Alpaka docs for a buffer https://alpaka.readthedocs.io/en/latest/_images/structure_assoc.png

Copy link
Contributor

@VourMa VourMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments, mostly for my own understanding, rather than real actions. I understand that there are two open issues with this PR:

  1. It has a conflict.
  2. We should replace Acc -> SDL::Dev

I will approve and merge when 1) is resolved, while 2) will be followed up in a new PR (do we need an issue or is it going to be followed up soon?).

@@ -18,12 +18,12 @@ namespace SDL {

ALPAKA_FN_ACC ALPAKA_FN_INLINE void rmPixelTripletFromMemory(struct SDL::pixelTriplets& pixelTripletsInGPU,
unsigned int pixelTripletIndex) {
pixelTripletsInGPU.isDup[pixelTripletIndex] = 1;
pixelTripletsInGPU.isDup[pixelTripletIndex] = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why this change? Did it cause an issue? Is this defined as a bool variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was from a clang check; a part missed in #356

SDL/LST.cc Outdated
exit(2);
void loadMaps() {
// Module orientation information (DrDz or phi angles)
auto const& tldir = trackLooperDir();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of defining this, if trackLooperDir() is used almost everywhere in the following? Was it meant to be a short version but then it was forgotten?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed now

@slava77
Copy link
Contributor Author

slava77 commented Feb 1, 2024

/run standalone
/run cmssw from-CMSSW_13_3_0_pre3_LST_X/modules-dev

@slava77
Copy link
Contributor Author

slava77 commented Feb 1, 2024

We should replace Acc -> SDL::Dev

done now in eca2c59

Copy link

github-actions bot commented Feb 1, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Copy link

github-actions bot commented Feb 1, 2024

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

Copy link
Contributor

@VourMa VourMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of @slava77's way

@VourMa VourMa merged commit d79218c into SegmentLinking:master Feb 2, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants