Define two asv configurations for pip and latest-github dependencies #110

Merged · 20 commits · May 31, 2024

Changes from all commits
47 changes: 33 additions & 14 deletions .github/workflows/test_and_deploy.yml
@@ -77,27 +77,46 @@ jobs:
   benchmarks:
     name: Check benchmarks
     runs-on: ubuntu-latest
+
+    # Set shell in login mode as global setting for the job
+    defaults:
+      run:
+        shell: bash -l {0}
+
+    strategy:
+      matrix:
+        python-version: ["3.10"]
+
     steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
+      - name: Checkout brainglobe-workflows repository
+        uses: actions/checkout@v4
+
+      - name: Create and activate conda environment # we need conda for asv management of environments
+        uses: conda-incubator/[email protected] # see https://github.com/conda-incubator/setup-miniconda/issues/261
         with:
-          python-version: "3.10"
+          miniconda-version: py310_24.1.2-0 # we need conda<24.3, see https://github.com/airspeed-velocity/asv/pull/1397
+          python-version: ${{ matrix.python-version }}
+          activate-environment: asv-only
+
       - name: Install asv
-        shell: bash
         run: |
-          python -mpip install --upgrade pip
-          # We install the project to benchmark because we run `asv check` with the `existing` flag.
-          python -mpip install .
-          python -mpip install asv
-      - name: Run asv check
-        shell: bash
-        run: |
-          cd benchmarks
-          # With `existing`, the benchmarked project must be already installed, including all dependencies.
-          # see https://asv.readthedocs.io/en/v0.6.3/commands.html#asv-check
-          asv check -v -E existing
+          pip install --upgrade pip
+          pip install asv
+
+      - name: Run asv check with pip dependencies
+        working-directory: ${{ github.workspace }}/benchmarks
+        run: |
+          # check benchmarks with pip dependencies
+          asv check -v --config $GITHUB_WORKSPACE/benchmarks/asv.pip.conf.json
+
+      - name: Run asv check with latest-github dependencies
+        working-directory: ${{ github.workspace }}/benchmarks
+        run: |
+          # check benchmarks with latest-github dependencies
+          asv check -v --config $GITHUB_WORKSPACE/benchmarks/asv.latest-github.conf.json


   build_sdist_wheels:
     name: Build source distribution
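For reference, the two CI check steps above can be reproduced locally. A minimal sketch, assuming `asv` is installed in the active environment and the repository root is the current directory:

```
# Reproduce the CI `asv check` steps locally.
# `asv check` verifies that the benchmark suite is well-formed
# (imports, signatures, parameters) without running the benchmarks.
cd benchmarks
asv check -v --config asv.pip.conf.json            # dependencies from PyPI
asv check -v --config asv.latest-github.conf.json  # dependencies from GitHub `main`
```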
29 changes: 18 additions & 11 deletions benchmarks/README.md
@@ -1,31 +1,38 @@
 # README
 
 ## Overview
-We use [`asv`](https://asv.readthedocs.io) to benchmark some representative brainglobe workflows. The `asv` workflow is roughly as follows:
+We use [`asv`](https://asv.readthedocs.io) to benchmark some representative BrainGlobe workflows. The `asv` workflow is roughly as follows:
 1. `asv` creates a virtual environment to run the benchmarks on, as defined in the `asv.conf.json` file.
 1. It installs the version of the `brainglobe-workflows` package corresponding to the tip of the locally checked-out branch.
 1. It runs the benchmarks as defined (locally) under `benchmarks/benchmarks` and saves the results to `benchmarks/results` as json files.
 1. With `asv publish`, the output json files are 'published' into an html directory (`benchmarks/html`).
 1. With `asv preview` the html directory can be visualised using a local web server.
 
 We include code to benchmark the workflows defined under `brainglobe_workflows`. There are three main ways in which these benchmarks can be useful to developers:
-1. Developers can run the available benchmarks locally [on a small test dataset](#running-benchmarks-locally-on-default-small-dataset).
-1. Developers can run these benchmarks locally on [data they have stored locally](#running-benchmarks-locally-on-custom-data).
-1. We also plan to run the benchmarks internally on a large dataset, and make the results publicly available.
+1. Developers can run the available benchmarks on their machine on either
+   - [a small test dataset](#running-benchmarks-on-a-small-default-dataset), or
+   - [custom data](#running-benchmarks-on-custom-data).
+1. We also run the benchmarks internally on a large dataset, and make the results publicly available.
+
+Additionally, we ship two `asv` configuration files, which define two different environments for `asv` to create and run the benchmarks in. `brainglobe-workflows` depends on a number of BrainGlobe packages, and the only difference between the two `asv`-defined environments is the version of those BrainGlobe packages: in `asv.pip.conf.json` we install them from PyPI, while in `asv.latest-github.conf.json` we install them from their `main` branch on GitHub. Note that, because of this, all `asv` commands need to specify the relevant configuration file with the `--config` flag.
 
-See the `asv` [reference docs](https://asv.readthedocs.io/en/stable/reference.html) for further details on the tool, and on [how to run benchmarks](https://asv.readthedocs.io/en/stable/using.html#running-benchmarks). The first time running benchmarks on a new machine, you will need to run `asv machine --yes` to set up the machine for benchmarking.
+See the `asv` [reference docs](https://asv.readthedocs.io/en/v0.6.3/reference.html) for further details on the tool, and on [how to run benchmarks](https://asv.readthedocs.io/en/stable/using.html#running-benchmarks).
 
 ## Installation
 
 To run the benchmarks, [install asv](https://asv.readthedocs.io/en/stable/installing.html) in your current environment:
 ```
 pip install asv
 ```
 Note that to run the benchmarks, you do not need to install a development version of `brainglobe-workflows` in your current environment (`asv` takes care of this).
 
-## Running benchmarks on a default small dataset
+## Running benchmarks on a small default dataset
 
-To run the benchmarks on a default small dataset:
+To run the benchmarks on the default dataset:
 
 1. Git clone the `brainglobe-workflows` repository:
 ```
@@ -34,11 +41,11 @@ To run the benchmarks on a default small dataset:
 1. Run `asv` from the `benchmarks` directory:
 ```
 cd brainglobe-workflows/benchmarks
-asv run
+asv run --config <path-to-asv-config> # dependencies from PyPI or GitHub, depending on the asv config file used
 ```
 This will benchmark the workflows defined in `brainglobe_workflows/` using a default set of parameters and a default small dataset. The default parameters are defined as config files under `brainglobe_workflows/configs`. The default dataset is downloaded from [GIN](https://gin.g-node.org/G-Node/info/wiki).
 
-## Running benchmarks on custom data available locally
+## Running benchmarks on custom data
 To run the benchmarks on a custom local dataset:
 
 1. Git clone the `brainglobe-workflows` repository
@@ -53,7 +60,7 @@ To run the benchmarks on a custom local dataset:
 1. Benchmark the workflow, passing the path to your custom config file as an environment variable.
    - For example, to benchmark the `cellfinder` workflow, you will need to prepend the environment variable definition to the `asv run` command (valid for Unix systems):
 ```
-CELLFINDER_CONFIG_PATH=/path/to/your/config/file asv run
+CELLFINDER_CONFIG_PATH=/path/to/your/config/file asv run --config <path-to-asv-config>
 ```
 
 ## Running benchmarks in development
@@ -67,5 +74,5 @@ The following flags to `asv run` are often useful in development:
 
 Example:
 ```
-asv run --bench TimeFullWorkflow --dry-run --show-stderr --quick
+asv run --config <path-to-asv-config> --bench TimeFullWorkflow --dry-run --show-stderr --quick
 ```
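Putting the README instructions together, a typical local session might look like the sketch below (PyPI-based config shown; swap in `asv.latest-github.conf.json` to benchmark against the GitHub `main` branches):

```
cd brainglobe-workflows/benchmarks
asv machine --yes                        # one-time machine setup for benchmarking
asv run --config asv.pip.conf.json       # run the benchmarks
asv publish --config asv.pip.conf.json   # 'publish' the json results to html
asv preview --config asv.pip.conf.json   # view the html report via a local web server
```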
185 changes: 185 additions & 0 deletions benchmarks/asv.latest-github.conf.json
@@ -0,0 +1,185 @@
{
    // The version of the config file format. Do not change, unless
    // you know what you are doing.
    "version": 1,
    // The name of the project being benchmarked
    "project": "brainglobe-workflows",
    // The project's homepage
    "project_url": "https://github.com/brainglobe/brainglobe-workflows",
    // The URL or local path of the source code repository for the
    // project being benchmarked
    "repo": "..",
    // "repo": "https://github.com/brainglobe/brainglobe-workflows.git",
    // The Python project's subdirectory in your repo. If missing or
    // the empty string, the project is assumed to be located at the root
    // of the repository.
    // "repo_subdir": "",
    // Customizable commands for building the project.
    // See asv.conf.json documentation.
    // To build the package using pyproject.toml (PEP518), uncomment the following lines
    "build_command": [
        "python -m pip install build",
        "python -m build",
        "PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps --no-index -w {build_cache_dir} {build_dir}"
    ],
    // To build the package using setuptools and a setup.py file, uncomment the following lines
    // "build_command": [
    //     "python setup.py build",
    //     "PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps --no-index -w {build_cache_dir} {build_dir}"
    // ],
    // Customizable commands for installing and uninstalling the project.
    // See asv.conf.json documentation.
    // overwrite dependencies from PyPI with latest `main` version from GitHub
    "install_command": [
        "in-dir={env_dir} python -mpip install --force-reinstall '{wheel_file}'",
        "in-dir={env_dir} python -mpip install -r {conf_dir}/latest-github-requirements.txt",
        "in-dir={env_dir} python -mpip list" // print dependencies' versions if asv ran with -v
    ],
    "uninstall_command": [
        "return-code=any python -mpip uninstall -y {project}"
    ],
    // List of branches to benchmark. If not provided, defaults to "master"
    // (for git) or "default" (for mercurial).
    "branches": [
        "HEAD"
    ], // for git
    // "branches": ["default"], // for mercurial
    // The DVCS being used. If not set, it will be automatically
    // determined from "repo" by looking at the protocol in the URL
    // (if remote), or by looking for special directories, such as
    // ".git" (if local).
    // "dvcs": "git",
    // The tool to use to create environments. May be "conda",
    // "virtualenv", "mamba" (above 3.8)
    // or other value depending on the plugins in use.
    // If missing or the empty string, the tool will be automatically
    // determined by looking for tools on the PATH environment
    // variable.
    "environment_type": "conda",
    // timeout in seconds for installing any dependencies in environment
    // defaults to 10 min
    //"install_timeout": 600,
    // the base URL to show a commit for the project.
    "show_commit_url": "https://github.com/brainglobe/brainglobe-workflows/commit/",
    // The Pythons you'd like to test against. If not provided, defaults
    // to the current version of Python used to run `asv`.
    "pythons": [
        "3.10"
    ],
    // The list of conda channel names to be searched for benchmark
    // dependency packages in the specified order
    "conda_channels": [
        "conda-forge",
        "defaults"
    ],
    // A conda environment file that is used for environment creation.
    // "conda_environment_file": "environment.yml",
    // The matrix of dependencies to test. Each key of the "req"
    // requirements dictionary is the name of a package (in PyPI) and
    // the values are version numbers. An empty list or empty string
    // indicates to just test against the default (latest)
    // version. null indicates that the package is to not be
    // installed. If the package to be tested is only available from
    // PyPI, and the 'environment_type' is conda, then you can preface
    // the package name by 'pip+', and the package will be installed
    // via pip (with all the conda available packages installed first,
    // followed by the pip installed packages).
    //
    // The ``@env`` and ``@env_nobuild`` keys contain the matrix of
    // environment variables to pass to build and benchmark commands.
    // An environment will be created for every combination of the
    // cartesian product of the "@env" variables in this matrix.
    // Variables in "@env_nobuild" will be passed to every environment
    // during the benchmark phase, but will not trigger creation of
    // new environments. A value of ``null`` means that the variable
    // will not be set for the current combination.
    //
    // "matrix": {
    //     "req": {
    //         "numpy": ["1.6", "1.7"],
    //         "six": ["", null], // test with and without six installed
    //         "pip+emcee": [""] // emcee is only available for install with pip.
    //     },
    //     "env": {"ENV_VAR_1": ["val1", "val2"]},
    //     "env_nobuild": {"ENV_VAR_2": ["val3", null]},
    // },
    // Combinations of libraries/python versions can be excluded/included
    // from the set to test. Each entry is a dictionary containing additional
    // key-value pairs to include/exclude.
    //
    // An exclude entry excludes entries where all values match. The
    // values are regexps that should match the whole string.
    //
    // An include entry adds an environment. Only the packages listed
    // are installed. The 'python' key is required. The exclude rules
    // do not apply to includes.
    //
    // In addition to package names, the following keys are available:
    //
    // - python
    //     Python version, as in the *pythons* variable above.
    // - environment_type
    //     Environment type, as above.
    // - sys_platform
    //     Platform, as in sys.platform. Possible values for the common
    //     cases: 'linux2', 'win32', 'cygwin', 'darwin'.
    // - req
    //     Required packages
    // - env
    //     Environment variables
    // - env_nobuild
    //     Non-build environment variables
    //
    // "exclude": [
    //     {"python": "3.2", "sys_platform": "win32"}, // skip py3.2 on windows
    //     {"environment_type": "conda", "req": {"six": null}}, // don't run without six on conda
    //     {"env": {"ENV_VAR_1": "val2"}}, // skip val2 for ENV_VAR_1
    // ],
    //
    // "include": [
    //     // additional env for python2.7
    //     {"python": "2.7", "req": {"numpy": "1.8"}, "env_nobuild": {"FOO": "123"}},
    //     // additional env if run on windows+conda
    //     {"platform": "win32", "environment_type": "conda", "python": "2.7", "req": {"libpython": ""}},
    // ],
    // The directory (relative to the current directory) that benchmarks are
    // stored in. If not provided, defaults to "benchmarks"
    "benchmark_dir": "benchmarks",
    // The directory (relative to the current directory) to cache the Python
    // environments in. If not provided, defaults to "env"
    "env_dir": ".asv/env",
    // The directory (relative to the current directory) that raw benchmark
    // results are stored in. If not provided, defaults to "results".
    "results_dir": "results",
    // The directory (relative to the current directory) that the html tree
    // should be written to. If not provided, defaults to "html".
    "html_dir": "html",
    // The number of characters to retain in the commit hashes.
    // "hash_length": 8,
    // `asv` will cache results of the recent builds in each
    // environment, making them faster to install next time. This is
    // the number of builds to keep, per environment.
    "build_cache_size": 2,
    // The commits after which the regression search in `asv publish`
    // should start looking for regressions. Dictionary whose keys are
    // regexps matching to benchmark names, and values corresponding to
    // the commit (exclusive) after which to start looking for
    // regressions. The default is to start from the first commit
    // with results. If the commit is `null`, regression detection is
    // skipped for the matching benchmark.
    //
    // "regressions_first_commits": {
    //     "some_benchmark": "352cdf", // Consider regressions only after this commit
    //     "another_benchmark": null, // Skip regression detection altogether
    // },
    // The thresholds for relative change in results, after which `asv
    // publish` starts reporting regressions. Dictionary of the same
    // form as in ``regressions_first_commits``, with values
    // indicating the thresholds. If multiple entries match, the
    // maximum is taken. If no entry matches, the default is 5%.
    //
    // "regressions_thresholds": {
    //     "some_benchmark": 0.01, // Threshold of 1%
    //     "another_benchmark": 0.5, // Threshold of 50%
    // },
}
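Note that the `install_command` above references `{conf_dir}/latest-github-requirements.txt`, which is not part of this diff view. For illustration only, such a file would contain pip requirement lines pinned to the `main` branch on GitHub; the package names below are hypothetical examples, not the file's actual contents:

```
# Hypothetical latest-github-requirements.txt entries: force-install
# BrainGlobe dependencies from GitHub `main` instead of PyPI.
git+https://github.com/brainglobe/brainglobe-utils.git@main
git+https://github.com/brainglobe/cellfinder.git@main
```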