
Simple LHA benchmark workflow #227

Merged
merged 16 commits into from
Mar 31, 2023
Conversation

giacomomagni
Collaborator

This PR introduces a simple LHA benchmark workflow, to check that the benchmarks are working before merging into master.

@giacomomagni giacomomagni added the benchmarks Benchmark (or infrastructure) related label Mar 22, 2023
@felixhekhorn
Contributor

I think something like 1dc2fde can work

@giacomomagni
Collaborator Author

Thanks @felixhekhorn for taking care of the pytest markers.
Shall we also add the label trick to run the workflow?

@felixhekhorn
Contributor

Shall we also add the label trick to run the workflow?

Shouldn't be complicated, right? so why not ... this way we can keep track in the PR

@alecandido
Member

alecandido commented Mar 22, 2023

Shouldn't be complicated, right? so why not ... this way we can keep track in the PR

It is rather simple, but which was the counter-argument for using workflow_dispatch instead?
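[Editor's note: for reference, the two trigger mechanisms being compared could be sketched roughly like this in a GitHub Actions workflow; the file name, label name, and job steps are hypothetical, not taken from this PR.]

```yaml
# hypothetical .github/workflows/lha-bench.yml
on:
  # option 1: manual button in the "Actions" tab
  workflow_dispatch:
  # option 2: the "label trick" - run when a label is added to the PR
  pull_request:
    types: [labeled]

jobs:
  lha-benchmarks:
    # with the label trick, guard on a specific (hypothetical) label name
    if: github.event_name == 'workflow_dispatch' || github.event.label.name == 'run-lha-bench'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run LHA benchmarks
        run: poe lha_bot
```

With the label trick, the run is attached to the PR's checks; with workflow_dispatch, it is started manually from the Actions tab.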

@felixhekhorn
Contributor

Should we drop the if __name__ in the lha file? The new syntax to run is NUMBA_DISABLE_JIT=0 poe lha_bot -m "ffns and nlo and not sv" - the only thing you cannot access this way is the now-called CommonRunner, to compare LHA settings against APFEL/Pegasus - however, I would consider this an extra thing; I know we needed it for the paper, but this is just on top

@alecandido
Member

the new syntax to run is NUMBA_DISABLE_JIT=0 poe lha_bot -m "ffns and nlo and not sv"

If we're using this by default, consider adding both the env var and the marker string to the poe script, such that just poe lha_bot will run exactly the same thing as the CI.

@felixhekhorn
Contributor

Shall we also add the label trick to run the workflow?

It is rather simple, but which was the counter-argument for using workflow_dispatch instead?

The thing would be more visible in the PR - else it would be hidden somewhere in the "crosses and checkmarks", no?

@alecandido
Member

The thing would be more visible in the PR - else it would be hidden somewhere in the "crosses and checkmarks", no?

Is the label thingy showing up anywhere other than in the "crosses and checkmarks"? I wouldn't see where...

@felixhekhorn
Contributor

the new syntax to run is NUMBA_DISABLE_JIT=0 poe lha_bot -m "ffns and nlo and not sv"

If we're using this by default, consider adding both the env var and the marker string to the poe script, such that just poe lha_bot will run exactly the same thing as the CI.

No no, the workflow will of course run everything (so no markers there - this was just an example for people ;-) ) - instead, about the Numba hack I'm not sure... should we keep it outside to let the user decide, or hard-code it inside?

@alecandido
Member

instead, about the Numba hack I'm not sure... should we keep it outside to let the user decide, or hard-code it inside?

Uhm... maybe we should set the env var by default to what we are using in the workflow.
If you explicitly specify an env var in the CLI, you should be able to overwrite the one set by poe, shouldn't you?

@felixhekhorn
Contributor

The thing would be more visible in the PR - else it would be hidden somewhere in the "crosses and checkmarks", no?

Is the label thingy showing up anywhere other than in the "crosses and checkmarks"? I wouldn't see where...

it is shown in the PR discussion, e.g. here: NNPDF/nnpdf#1604

@alecandido
Member

alecandido commented Mar 22, 2023

If you explicitly specify an env var in the CLI, you should be able to overwrite the one set by poe, shouldn't you?

Just tried, it's working.
So, I'd set

lha_bot.env.NUMBA_DISABLE_JIT.default = "0"

as we're doing for the other commands :)
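[Editor's note: in poethepoet's TOML configuration, this could be sketched roughly as follows; the task's actual command is an assumption, only the env-default line is quoted from the discussion.]

```toml
# hypothetical pyproject.toml fragment for poethepoet
[tool.poe.tasks.lha_bot]
cmd = "pytest benchmarks/lha_paper_bench.py"  # assumed command, not taken from this PR

[tool.poe.tasks.lha_bot.env]
# default only: an env var given explicitly on the CLI still overrides it,
# e.g. NUMBA_DISABLE_JIT=1 poe lha_bot
NUMBA_DISABLE_JIT.default = "0"
```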

@giacomomagni
Collaborator Author

I'm fine with either the label or workflow_dispatch, but we need at least one of the two.
In the end, even if the log is posted here, you will not see any problem unless you inspect it carefully.
What we gain here is just something that does the painful job of running the benchmarks for us.

@alecandido
Member

Do we actually really need to run the full LHA benchmark in the CI? It's actually taking forever...

I believe the most useful thing is to run something, instead of nothing. But I'm not sure we need to run everything.

@felixhekhorn
Contributor

Do we actually really need to run the full LHA benchmark in the CI? It's actually taking forever...

I believe the most useful thing is to run something, instead of nothing. But I'm not sure we need to run everything.

yes - I believe we want everything here! This is to check at the end of a PR that we are not breaking anything, so it is a one-time thing (as said, I expect >30 min). Developers might not want to do this, but they may want to run only a specific case they are worried about. Let me remind you that I did run something before #215 was detected, but simply not the case of VFNS+SV, so we had better check everything ...

Just as an intermediate result:

Matching: computing operators - 1/60 took: 913.479292 s
Matching: computing operators - 9/60 took: 915.988047 s

also I wonder whether we run into problems with the output length on GitHub - we may want to suppress the eko logger

@giacomomagni
Collaborator Author

yes - I believe we want everything here! This is to check at the end of a PR that we are not breaking, so it is a one-time

I agree, let's run everything.

also I wonder whether we run into problems with the output length on GitHub - we may want to suppress the eko logger

That might help make the log easier to read as well.

@felixhekhorn
Contributor

felixhekhorn commented Mar 22, 2023

FYI: in the end we took 1h 6m 46s (much more than I anticipated) - using 2 cores, it seems

Breakdown of timings:
2023-03-22T10:58:10.5017232Z Evolution: Total time 358.056409 s
2023-03-22T10:58:49.3314130Z Evolution: Total time 38.824099 s
2023-03-22T11:15:08.3305797Z Matching: Total time 975.218841 s
2023-03-22T11:16:14.4545783Z Evolution: Total time 65.882501 s
2023-03-22T11:17:20.9127924Z Matching: Total time 66.450009 s
2023-03-22T11:18:24.7667271Z Evolution: Total time 63.836674 s
2023-03-22T11:19:09.1847778Z Matching: Total time 43.576919 s
2023-03-22T11:20:39.7785316Z Evolution: Total time 90.581469 s
2023-03-22T11:21:28.7760442Z Matching: Total time 48.992759 s
2023-03-22T11:22:56.3235367Z Evolution: Total time 87.532883 s
2023-03-22T11:24:04.0358522Z Matching: Total time 66.861960 s
2023-03-22T11:25:14.9223873Z Evolution: Total time 70.870485 s
2023-03-22T11:26:22.7901472Z Matching: Total time 67.863977 s
2023-03-22T11:27:29.7086964Z Evolution: Total time 66.897086 s
2023-03-22T11:28:36.6210555Z Matching: Total time 66.107466 s
2023-03-22T11:29:44.4969861Z Evolution: Total time 67.855881 s
2023-03-22T11:30:52.2993668Z Matching: Total time 67.798800 s
2023-03-22T11:31:59.9063581Z Evolution: Total time 67.581929 s
2023-03-22T11:32:45.1620536Z Matching: Total time 44.433201 s
2023-03-22T11:34:19.3869061Z Evolution: Total time 94.207647 s
2023-03-22T11:35:08.4346439Z Matching: Total time 49.044151 s
2023-03-22T11:36:37.2946893Z Evolution: Total time 88.834599 s
2023-03-22T11:37:24.3219924Z Matching: Total time 46.231567 s
2023-03-22T11:38:56.2464293Z Evolution: Total time 91.905251 s
2023-03-22T11:39:46.2100440Z Matching: Total time 49.959722 s
2023-03-22T11:41:14.6790916Z Evolution: Total time 88.444506 s
2023-03-22T11:41:56.7201020Z Evolution: Total time 41.188942 s
2023-03-22T11:43:05.2198794Z Evolution: Total time 67.767023 s
2023-03-22T11:44:38.0395218Z Evolution: Total time 91.891723 s
2023-03-22T11:45:47.9263150Z Evolution: Total time 69.092551 s
2023-03-22T11:46:55.4763803Z Evolution: Total time 66.833618 s
2023-03-22T11:48:31.4566674Z Evolution: Total time 95.215428 s
2023-03-22T11:50:03.1130739Z Evolution: Total time 90.887846 s
2023-03-22T11:50:51.4144967Z Evolution: Total time 47.537175 s
2023-03-22T11:52:11.7863777Z Evolution: Total time 79.700433 s
2023-03-22T11:54:00.9937567Z Evolution: Total time 108.539054 s
2023-03-22T11:55:24.1766091Z Evolution: Total time 82.395384 s
2023-03-22T11:56:44.0157311Z Evolution: Total time 79.161288 s
2023-03-22T11:58:38.8388360Z Evolution: Total time 114.128660 s
2023-03-22T11:58:45.9244135Z INFO     eko.evolution_operator:__init__.py:971 Evolution: Total time 108.539054 s
2023-03-22T11:58:45.9309519Z INFO     eko.evolution_operator:__init__.py:971 Evolution: Total time 114.128660 s

@giacomomagni
Collaborator Author

giacomomagni commented Mar 22, 2023

FYI: in the end we took 1h 6m 46s (much more than I anticipated) - using 2 cores, it seems

What if we run everything, but only at NNLO (and NLO for polarised)? In the end, if something does not show up there, it would be strange for it to be broken only at lower orders...

@felixhekhorn
Contributor

FYI: in the end we took 1h 6m 46s (much more than I anticipated) - using 2 cores, it seems

What if we run everything, but only at NNLO (and NLO for polarised)? In the end, if something does not show up there, it would be strange for it to be broken only at lower orders...

I would still run everything ... as said, it should be a one-time price ...

instead, the thing that worries me is
[screenshot: list of triggered workflow runs]
where it seems I triggered a run - but how? I didn't press any button, I only committed and commented ... also you can see that sometimes we manage in 45 min

Moreover, I'm not sure we want to have pull_request_review in the workflow - I guess this is also triggered on just commenting (a PR-review-type comment)? Then we would need to be much more careful with providing feedback, because it becomes expensive - and that is a direction we surely do not want ... also "push to master" - this means we should avoid hot-fixing at all costs, which maybe is not ideal either ... maybe a simple button is just enough ...

@felixhekhorn
Contributor

so ...

  1. I triggered run 6 by adding this reply to a requested change by @alecandido - this is something we do not want! (I'm guessing I triggered run 8 by resolving a change request.) As said above, I suggest removing that trigger event
  2. I believe that with the new commits we can now pass the check - but I don't know how to trigger it manually ... looking at the "Actions" tab, there is no simple button I can press ...
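[Editor's note: a likely explanation for the missing button, stated here as an assumption rather than something verified in this PR: the workflow_dispatch button only appears in the Actions tab once the workflow file exists on the default branch. Until then, a run can be started from the GitHub CLI; the workflow file name below is hypothetical, the branch name is this PR's.]

```
# hypothetical invocation via the GitHub CLI
gh workflow run lha-bench.yml --ref lha_bench_workflow
```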

@alecandido
Member

alecandido commented Mar 22, 2023

I would still run everything ... as said, it should be a one-time price ...

It isn't. Cut down useless things.

You don't need all of them to find out whether it's working or not. And we'd like to know in less than 1h.

The problem is not triggering many times a workflow, but having a workflow that lasts 1h :)

@alecandido alecandido merged commit 0a82abb into master Mar 31, 2023
@felixhekhorn felixhekhorn mentioned this pull request Apr 3, 2023
@giacomomagni giacomomagni deleted the lha_bench_workflow branch April 7, 2023 09:53
@felixhekhorn felixhekhorn mentioned this pull request May 23, 2023