Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic Allocation Tests #406

Closed
wants to merge 3 commits into from
Closed

Dynamic Allocation Tests #406

wants to merge 3 commits into from

Conversation

GNiendorf
Copy link
Member

@GNiendorf GNiendorf commented May 24, 2024

I see a 5x-10x reduction (from 500k to 50k-100k per event) in the nTotalQuintuplets variable from the changes below from @slava77's suggestion. Using this PR for tests. Obviously no timing difference with the cache enabled, but I do see improvement when it is turned off:

Timing with dynamic allocation (this PR)
Screenshot 2024-05-24 at 4 57 39 PM

Master Timing
Screenshot 2024-05-24 at 4 53 25 PM

@GNiendorf
Copy link
Member Author

/run standalone
/run CMSSW

SDL/Quintuplet.h Outdated
printf("Unhandled case in createEligibleModulesListForQuintupletsGPU! Module index = %i\n", i);
#endif
}
occupancy = 2 * tripletsInGPU.nTriplets[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will look bad; probably more so at high pt.
But we need breakdown plots; CI is not showing them

Copy link
Member Author

@GNiendorf GNiendorf May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the breakdown plots. I don't see any issues, maybe I'm missing something though. You can also see the breakdown plots from the CI by downloading the full zip file. (The mywork_4d67cfD-PU200/ one is the breakdown for this PR):

https://www.classe.cornell.edu/~gsn27/www/www/PR406/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehm, I was wrong at least at base/integrated level ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also see the breakdown plots from the CI by downloading the full zip file.

ah OK.

Here are the breakdown plots. I don't see any issues, maybe I'm missing something though.

yes, it looks OK; it takes an effort to see a difference.

The other side (opposite to high pt high occupancy) to worry about is the low end: a module with one triplet can have more than two matching. This may become more visible without PU.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the low occupancy end could be moderated by e.g. 2*n_T3 + 10 ("random" guess for values; I'm still implying some plots need to be made to show the boundary is OK).

Copy link

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     44.1    322.7    121.6     47.3     95.9    503.3    132.0    157.2    100.5      2.6    1527.3     979.8+/- 262.5     419.2   explicit_cache[s=4] (master)
   avg     44.4    324.4    122.4     49.7     95.0    508.2    131.7    157.6     99.4      4.2    1537.0     984.4+/- 260.8     420.4   explicit_cache[s=4] (this PR)

@slava77
Copy link
Contributor

slava77 commented May 24, 2024

@GNiendorf
did you run this locally?
I'm curious how often do we see a truncation warning with this 2*n_T3 vs the old case limit.
Just counting the warnings may be enough.

@GNiendorf
Copy link
Member Author

GNiendorf commented May 24, 2024

I'll check the warning printouts and post the results soon. I also want to try this method out on the other objects. The 2x number is arbitrary btw. You can set it to 1x but the efficiency goes down by a hair so I figured I would keep it at 2x for now.

Copy link

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

GNiendorf commented May 28, 2024

@slava77 Excess alert printouts go way up with these changes, although a lot of these alerts come from the same module numbers. Maybe these are mostly duplicates that get cleaned up later on?

t5 excess warnings (175 events)-
45,927 - with my changes
85 - Master

Here are the current excess alerts for the other objects as a reference.
388,149 - sement excess alert
104,239 - triplet excess alert
6,291 - Minidoublet excess alert

@slava77
Copy link
Contributor

slava77 commented May 28, 2024

Excess alert printouts go way up with these changes, although a lot of these alerts come from the same module numbers. Maybe these are mostly duplicates that get cleaned up later on?

Are the counts on low end of the distribution or somewhat spread out?
I'm curious if my #406 (comment) could help easily

do we have per-module truncated and non-truncated occupancies to be able to get some more detail where to truncate?

@slava77
Copy link
Contributor

slava77 commented May 28, 2024

Maybe these are mostly duplicates that get cleaned up later on?

it could be. The main problem I can expect now would be in variability of results.

Unless there are exact duplicates that contribute significantly; offhand I think we don't have any, but then I seem to recall that there was a recent thread suggestive that there may be.

@GNiendorf
Copy link
Member Author

Are the counts on low end of the distribution or somewhat spread out? I'm curious if my #406 (comment) could help easily

Oh yeah, it looks like the warnings are coming from modules with low occupancy. Seems like your suggestion might be an easy fix.

@GNiendorf
Copy link
Member Author

GNiendorf commented May 28, 2024

I'm not sure what an acceptable amount of cutoff would be, especially since the other objects seem to get cutoff at a much higher rate. Here I added an offset of +30 and the excess printouts go from 45,927 to 3,106. We still see a 5x improvement on most events over the current master for the nQuints allocation size.

@GNiendorf
Copy link
Member Author

/run standalone
/run CMSSW

@SegmentLinking SegmentLinking deleted a comment from github-actions bot May 28, 2024
Copy link

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     44.0    325.8    123.3     47.3     95.2    504.0    133.7    157.1     99.9      1.6    1531.7     983.8+/- 266.3     417.8   explicit_cache[s=4] (master)
   avg     43.7    324.7    123.6     48.8     96.0    506.8    132.4    157.0     98.7      1.8    1533.5     983.0+/- 261.5     420.0   explicit_cache[s=4] (this PR)

Copy link

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

Closing this PR since Tracklooper is discontinued.

@GNiendorf GNiendorf closed this Jun 26, 2024
@VourMa
Copy link
Contributor

VourMa commented Jun 26, 2024

Do you intend to port it to SegmentLinking/cmssw, or is this development discontinued as a whole?

@GNiendorf
Copy link
Member Author

Not in the short-term, since this isn't necessary right now for the low pT stuff. If memory becomes an issue this will have to be revisited.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants