Update README.md
galtimur authored Jun 6, 2024
1 parent d5f1e42 commit da9d3e5
Showing 1 changed file with 6 additions and 3 deletions: ci-builds-repair/ci-builds-repair-benchmark/README.md
To initialize the benchmark, you need to pass a path to a config file with the following fields (a sketch follows the list):
`out_folder`: the path to where the result files will be stored;
`data_cache_dir`: the path to where the cached dataset will be stored;
`username_gh`: your GitHub username;
`test_username`: _Optional_. The username that will be displayed in the benchmark; if omitted, `username_gh` will be used;
`language`: dataset language (for now, only Python is available).
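
For illustration, here is a hedged sketch of such a configuration expressed as a Python dictionary. The field names come from the list above, while the values and the actual on-disk file format are assumptions; see [`run_benchmark.py`](run_benchmark.py) for what is expected.

```python
# Hedged sketch of the configuration fields listed above. The values are
# placeholders, and the on-disk format of the config file is an assumption;
# see run_benchmark.py for the format it actually expects.
config = {
    "out_folder": "/path/to/results",         # where the result files will be stored
    "data_cache_dir": "/path/to/data_cache",  # where the cached dataset will be stored
    "username_gh": "your-github-username",    # your GitHub username
    # "test_username": "displayed-name",      # optional; defaults to username_gh
    "language": "Python",                     # only Python is available for now
}
```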

## 🏟️ Benchmark usage

**Important**: Before usage, please request to be added to the benchmark [organization](https://github.com/orgs/LCA-CI-fix-benchmark) on GitHub so that you are able to push the repos for the test.

For an example of the benchmark usage code, see the [`run_benchmark.py`](run_benchmark.py) script.
To use the benchmark, you need to pass a function `fix_repo_function` that fixes the build based on
the repository state on a local machine, the logs, and the metadata of the failed workflows (a sketch of such a function follows the list below).
For now, only two functions have been implemented:
`fix_none` — does nothing;
`fix_apply_diff` — applies the diff that fixed the issue in the original repository.
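
Below is a minimal sketch of what a custom `fix_repo_function` could look like. The parameter names and the contents of `datapoint` are assumptions made for illustration; consult [`run_benchmark.py`](run_benchmark.py) and the provided baselines for the actual interface.

```python
def fix_example(datapoint, repo_path, repo, out_folder):
    """Hypothetical custom fixer; the signature is an assumption.

    It receives the failing datapoint (logs and workflow metadata), the path
    to the local checkout of the repository, the repository object, and the
    output folder, and should edit files under `repo_path` to repair the build.
    """
    # This placeholder inspects nothing and changes nothing, so it behaves
    # like the provided fix_none baseline.
    _ = (datapoint, repo_path, repo, out_folder)
```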

You can download the dataset using the `CIFixBenchmark.get_dataset()` method.
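
For example (a hedged sketch: the import path and constructor arguments are assumptions; see [`run_benchmark.py`](run_benchmark.py) for the actual setup):

```python
from benchmark import CIFixBenchmark  # import path is an assumption

bench = CIFixBenchmark("my-model", "path/to/config.yaml")  # constructor arguments are assumptions
dataset = bench.get_dataset()  # downloads the dataset (cached under data_cache_dir)
```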

## 🚀 Evaluate the baseline

The method `CIFixBenchmark.eval_dataset(fix_repo_function)` evaluates the baseline function. Specifically, it:
For debugging, please limit yourself to a small number of datapoints (argument `…`).
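
A hedged usage sketch (the import path, constructor arguments, and placeholder fixer are assumptions; see [`run_benchmark.py`](run_benchmark.py) for the actual call):

```python
from benchmark import CIFixBenchmark  # import path is an assumption

def fix_placeholder(datapoint, repo_path, repo, out_folder):
    # Placeholder fixer that changes nothing; the signature is an assumption.
    pass

bench = CIFixBenchmark("my-model", "path/to/config.yaml")  # constructor arguments are assumptions
bench.eval_dataset(fix_placeholder)  # evaluates the fixer and writes the result files listed below
```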

The evaluation method outputs the following results:

1. `jobs_ids.jsonl` — identifiers of the jobs that were sent to GitHub. They are used for further evaluation.
2. `jobs_results.jsonl` — results of each job.
3. `jobs_awaiting.jsonl` — list of awaiting jobs (normally should be empty).
4. `jobs_invalid.jsonl` — list of invalid jobs (normally should be empty).

Examples of these files can be found in the [`/examples`](examples/) directory.

You can also evaluate your results using the method `CIFixBenchmark.eval_jobs(result_filename=result_filename)`,
passing the `jobs_ids.jsonl` file.
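
For example (same assumptions about the import path and constructor as in the sketches above):

```python
from benchmark import CIFixBenchmark  # import path is an assumption

bench = CIFixBenchmark("my-model", "path/to/config.yaml")  # constructor arguments are assumptions
bench.eval_jobs(result_filename="out/jobs_ids.jsonl")  # path to the jobs_ids.jsonl produced by eval_dataset
```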
