LHA benchmark bot #163

Closed

felixhekhorn opened this issue Nov 15, 2022 · 9 comments
Labels
benchmarks Benchmark (or infrastructure) related

Comments

@felixhekhorn
Contributor

I'd like to have an "LHA benchmark bot" that runs all LHA benchmarks upon adding a label (e.g. run-lha-benchmark), since we need to preserve them. There is no need to run them on every commit, but I'd like to run them at the end of each PR.
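A minimal sketch of such a label-triggered workflow, assuming a placeholder benchmark invocation (the actual ekomark command would go in its place):

```yaml
# Hedged sketch: react to the `run-lha-benchmark` label on a PR.
# The benchmark command below is a placeholder, not the real ekomark call.
name: LHA benchmark bot

on:
  pull_request:
    types: [labeled]

jobs:
  lha-benchmarks:
    # fire only for the magic label, not for every label event
    if: github.event.label.name == 'run-lha-benchmark'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Run LHA benchmarks (placeholder)
        run: |
          pip install .
          python benchmarks/run_lha.py  # hypothetical entry point
```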

The idea is, of course, stolen from nnpdf.

For the moment I have asked people explicitly, but this could be automated ...

felixhekhorn added the benchmarks label on Nov 15, 2022
@alecandido
Member

All the benchmark runs can be automated with very little effort. The only problem was how to evaluate the output (i.e. when it should raise an error/warning).

I don't know what you mean by "bot", but if you mean "like the fitbot", that is only a workflow like ours, simply triggered by a label event. We can do that as well, or even something better.
Still, the "end of a PR" is not an event, simply because it is ill-defined. Instead of the label, you can trigger via the button that comes with workflow_dispatch; it is just the same, and the straightforward way to get a button (no magic label to know, it is the officially documented button).
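For reference, the dispatch trigger is a one-liner; this is all it takes to get the "Run workflow" button in the Actions tab:

```yaml
name: LHA benchmark bot

on:
  workflow_dispatch:  # adds a manual "Run workflow" button on github.com
```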

@felixhekhorn
Contributor Author

  • Indeed, something like the "fitbot" ...
  • also, a button is just fine, and yes, exactly because it is ill-defined I'd like to keep it manual
  • the output can, for now, just be the usual print from ekomark, which has to be validated by a user

@felixhekhorn
Contributor Author

A workflow can expose some assets after it has run, right?

@alecandido
Member

If you accept that it cannot fail on its own, the alternative is to produce a report, like the vp ones: you need to compare against something, otherwise it is difficult to evaluate, but I believe you can compare against the PR base.

So, you need a workflow that (a sketch follows the list):

  • triggers on workflow_dispatch
  • launches two jobs:
    1. check out the PR branch and run the benchmark runners
    2. check out the PR base and run the benchmark runners (this one can be cached with actions/cache)
  • each job uploads its results as an artifact
  • a third job fires when the other two have completed, downloads the artifacts, produces a comparison, and uploads that as a further artifact
  • after that, the same job (or a fourth one) posts a message on the PR with a link to the comparison artifact
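A minimal sketch of that layout, assuming the runner writes to a results/ directory and using a plain diff as a stand-in for the real comparison (all commands, file names, and the base branch name are illustrative):

```yaml
name: LHA benchmark report

on:
  workflow_dispatch:

jobs:
  bench-pr:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4            # the branch the run was dispatched on
      - name: Run benchmarks (placeholder command)
        run: ./run-lha-benchmarks.sh --output results/
      - uses: actions/upload-artifact@v4
        with:
          name: bench-pr
          path: results/

  bench-base:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: master                        # assumed name of the PR base branch
      - uses: actions/cache@v4               # illustrative key; a real setup would
        with:                                # skip the run on a cache hit
          path: results/
          key: lha-base-${{ github.sha }}
      - name: Run benchmarks (placeholder command)
        run: ./run-lha-benchmarks.sh --output results/
      - uses: actions/upload-artifact@v4
        with:
          name: bench-base
          path: results/

  compare:
    needs: [bench-pr, bench-base]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: bench-pr
          path: pr/
      - uses: actions/download-artifact@v4
        with:
          name: bench-base
          path: base/
      - name: Produce comparison (diff as a stand-in)
        run: diff -ur base/ pr/ > report.txt || true
      - uses: actions/upload-artifact@v4
        with:
          name: comparison-report
          path: report.txt
```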

@alecandido
Member

Artifacts for the branch benchmark results can have a short retention period (a couple of days), while the other two you keep for longer (the base for caching purposes, the report for checking). But you don't need either of them after the PR is merged, and if a PR goes on for long, it is worth rerunning (at the very least you will have to rebase).
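actions/upload-artifact exposes this directly; e.g. for the short-lived branch results:

```yaml
- uses: actions/upload-artifact@v4
  with:
    name: bench-pr
    path: results/
    retention-days: 2  # branch results are only needed for a couple of days
```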

@alecandido
Member

The actions for artifacts are actions/upload-artifact and actions/download-artifact.

For posting on the PR there are dedicated actions as well.
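One option for the posting step (not necessarily the action linked in the original comment) is actions/github-script, which wraps the GitHub REST API:

```yaml
- uses: actions/github-script@v7
  with:
    script: |
      // link to the run page, where the comparison artifact can be downloaded;
      // context.issue.number assumes a PR context is available (e.g. the PR
      // number is passed in as a workflow_dispatch input)
      await github.rest.issues.createComment({
        owner: context.repo.owner,
        repo: context.repo.repo,
        issue_number: context.issue.number,
        body: `LHA benchmark comparison: ${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`,
      });
```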

@felixhekhorn
Contributor Author

all this is v3 of what I'm thinking about 🙃 ... in the first stage I don't need a comparison, since I know we have to match 0.0xy (I'm only thinking about LHA atm) ... so a simple print is sufficient, and that could be done even without any artifact

@alecandido
Member


At least upload the single output as an artifact; scrolling the logs is a pain
(and you can also include the db, which can be used with the navigator)

@felixhekhorn
Contributor Author

Closed via #227
