Specifications for bioimage.io

This repository contains the specifications of the standard format defined by the bioimage.io community for the content (i.e., models, datasets and applications) on the bioimage.io website. Each item is described by a YAML 1.2 file named rdf.yaml or bioimageio.yaml. This file, along with the files referenced in it, can be downloaded from or uploaded to the bioimage.io website and may be produced or consumed by bioimage.io-compatible software (e.g., image analysis software like ilastik).
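
As an illustration of the file naming described above, here is a minimal sketch of locating the description file in an unpacked resource folder. It assumes that rdf.yaml is preferred over bioimageio.yaml (as stated in the 0.5.3.3 changelog entry further down). The helper name `find_description_file` and the constant `CANDIDATE_NAMES` are hypothetical and not part of bioimageio.spec.

```python
from pathlib import Path
from typing import Optional

# Accepted file names in order of preference (assumption: rdf.yaml wins).
CANDIDATE_NAMES = ("rdf.yaml", "bioimageio.yaml")


def find_description_file(folder: Path) -> Optional[Path]:
    """Return the first existing description file in `folder`, or None.

    Hypothetical helper for illustration; not the library's own lookup.
    """
    for name in CANDIDATE_NAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None


# Usage: look in the current working directory.
print(find_description_file(Path(".")))
```

For real use, prefer the loading functions of bioimageio.spec, which also handle URLs and packaged resources.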

These are the rules and format that bioimage.io-compatible resources must fulfill.

Note that the Python package PyYAML does not support YAML 1.2. We therefore use and recommend ruyaml. For differences see https://ruamelyaml.readthedocs.io/en/latest/pyyaml.

Please also note that the best way to check whether your rdf.yaml file is bioimage.io-compliant is to call bioimageio.core.validate from the bioimageio.core Python package. The bioimageio.core Python package also provides the bioimageio command line interface (CLI) with the validate command:

bioimageio validate path/to/your/rdf.yaml

Format version overview

All bioimage.io description formats are defined as Pydantic models.

Type	Format Version	Documentation¹	Developer Documentation²
model	0.5 0.4	model 0.5 model 0.4	ModelDescr_v0_5 ModelDescr_v0_4
dataset	0.3 0.2	dataset 0.3 dataset 0.2	DatasetDescr_v0_3 DatasetDescr_v0_2
notebook	0.3 0.2	notebook 0.3 notebook 0.2	NotebookDescr_v0_3 NotebookDescr_v0_2
application	0.3 0.2	application 0.3 application 0.2	ApplicationDescr_v0_3 ApplicationDescr_v0_2
generic	0.3 0.2	-	GenericDescr_v0_3 GenericDescr_v0_2

JSON Schema

Simplified descriptions are available as JSON Schema (generated with Pydantic):

bioimageio.spec version	JSON Schema	documentation¹
latest	bioimageio_schema_latest.json	latest documentation
0.5	bioimageio_schema_v0-5.json	0.5 documentation

Note: bioimageio_schema_v0-5.json and bioimageio_schema_latest.json are identical, but bioimageio_schema_latest.json will eventually refer to the future bioimageio_schema_v0-6.json.

Flattened, interactive docs

A flattened view of the types used by the spec that also shows value constraints.

rendered

You can also generate these docs locally by running PYTHONPATH=./scripts python -m interactive_docs

Examples

We provide some examples of rdf.yaml files that describe models, applications, notebooks and datasets, and an example notebook that shows how to access model, application, notebook and dataset descriptions programmatically.

💁 Recommendations

Storage services such as Zenodo do not support subfolders. We therefore recommend placing all other files in the same directory as the rdf.yaml file and avoiding subdirectories.
Use the bioimageio.core Python package to validate your rdf.yaml file.
bioimageio.spec keeps evolving. Try to use and upgrade to the most current format version!

⌨ bioimageio command-line interface (CLI)

The bioimageio CLI has moved entirely to bioimageio.core.

🖥 Installation

bioimageio.spec can be installed with either conda or pip. We recommend installing bioimageio.core instead to get access to the Python programmatic features available in the BioImage.IO community:

conda install -c conda-forge bioimageio.core

or

pip install -U bioimageio.core

Still, for a lighter package or just for testing, you can install only the bioimageio.spec package:

conda install -c conda-forge bioimageio.spec

or

pip install -U bioimageio.spec

🏞 Environment variables

TODO: link to settings in dev docs

🤝 How to contribute

♥ Contributors

Made with contrib.rocks.

🛈 Versioning scheme

To keep the bioimageio.spec Python package version in sync with the (model) description format version, bioimageio.spec is versioned as MAJOR.MINOR.PATCH.LIB, where MAJOR.MINOR.PATCH corresponds to the latest model description format version implemented and LIB may be bumped for library changes that do not affect the format version. This scheme was introduced with bioimageio.spec 0.5.3.1.
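
The scheme above can be sketched as a small parser that separates the format version from the library counter. This is only an illustration of the MAJOR.MINOR.PATCH.LIB layout; `SpecVersion` and `split_spec_version` are hypothetical names, not part of bioimageio.spec, and older "post" releases (e.g. 0.5.3post4) are deliberately rejected.

```python
from typing import NamedTuple


class SpecVersion(NamedTuple):
    format_version: str  # MAJOR.MINOR.PATCH of the (model) description format
    lib_version: int  # LIB: counter for library-only changes


def split_spec_version(version: str) -> SpecVersion:
    """Split a MAJOR.MINOR.PATCH.LIB string into its two parts.

    Hypothetical helper for illustration only.
    """
    parts = version.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH.LIB, got {version!r}")
    return SpecVersion(".".join(parts[:3]), int(parts[3]))


# Usage: 0.5.3.6 implements format version 0.5.3 with library counter 6.
print(split_spec_version("0.5.3.6"))
```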

Δ Changelog

bioimageio.spec Python package

bioimageio.spec 0.5.3.7 (to be released)

update conda environments (remove cpuonly from pytorch envs)

bioimageio.spec 0.5.3.6

fix URL validation (checking with actual http requests was erroneously skipped)

bioimageio.spec 0.5.3.5

fix loading tifffile in python 3.8 (pin tifffile)
use default tensorflow environments for Keras H5 weights

bioimageio.spec 0.5.3.4

support loading and saving from/to zipfile.ZipFile objects
fix bug when packaging with weights priority order (#638)
add conda_env module providing helper to create recommended conda environments for model descriptions
fix summary formatting
improve logged origin for logged messages
make the model.v0_5.ModelDescr.training_data field a left_to_right Union to avoid warnings
the deprecated version_number is no longer appended to the id, but instead set as version if no version is specified.

bioimageio.spec 0.5.3.3

expose progressbar to customize display of download progress
expose get_resource_package_content
prefer rdf.yaml over bioimageio.yaml (name bioimageio.yaml file rdf.yaml file when packaging, look for rdf.yaml first, etc.)
enforce: (generic 0.3/model 0.5 spec) documentation source file encoding has to be UTF-8.
bugfix: allow optional pre- and postprocessing to be missing in an RDF (before it required an empty dict).

bioimageio.spec 0.5.3.2

bugfix "reset known files if root changes" (#619)

bioimageio.spec 0.5.3.1

note: the versioning scheme was changed because our previous post releases included changes beyond what a post release should entail (only changing docstrings, etc.). Those post releases were motivated by the desire to keep the library version in sync with the (model) format version to avoid confusion. To keep this relation but avoid overloaded post releases, a library version number is now added as the fourth part: MAJOR.MINOR.PATCH.LIB_VERSION.

add load_model_description and load_dataset_description
add ensure_description_is_model and ensure_description_is_dataset
expose perform_io_checks and known_files from ValidationContext to load_description and load_description_and_validate_format_only

bioimageio.spec 0.5.3post4

fix pinning of pydantic

bioimageio.spec 0.5.3post3

update resolving of bioimage.io resource IDs

bioimageio.spec 0.5.3post2

fix SHA-256 value when resolving a RDF version from the bioimage.io collection that is not the latest

bioimageio.spec 0.5.3post1

bump patch version during loading for model 0.5.x
improve validation error formatting
validate URLs first with a HEAD request; if that is forbidden, follow up with a streamed GET request, and if that is also forbidden, with a regular GET request.
RelativePath.absolute() is now a method (not a property), analogous to pathlib.Path
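
The URL validation order described in this release can be sketched as follows. This is not the library's implementation: the HTTP call is injected as a plain function so the example runs without network access, "forbidden" is assumed to mean HTTP status 403, and `Requester`, `url_is_reachable` and `fake_request` are hypothetical names.

```python
from typing import Callable

# A requester takes (method, url, stream) and returns an HTTP status code.
# Hypothetical abstraction; a real one would wrap an HTTP client.
Requester = Callable[[str, str, bool], int]

# Attempts in order: HEAD, streamed GET, regular GET.
ATTEMPTS = (("HEAD", False), ("GET", True), ("GET", False))


def url_is_reachable(url: str, request: Requester) -> bool:
    """Try each attempt in turn; move on only if the server answers 403."""
    for method, stream in ATTEMPTS:
        status = request(method, url, stream)
        if status != 403:  # assumption: "forbidden" means status 403
            return 200 <= status < 400
    return False


def fake_request(method: str, url: str, stream: bool) -> int:
    """Stand-in server that forbids HEAD but allows GET."""
    return 403 if method == "HEAD" else 200


# Usage with the stand-in server (example.com is a placeholder URL).
print(url_is_reachable("https://example.com/weights.pt", fake_request))
```

Trying the cheapest request first avoids downloading large weight files just to confirm that a URL exists.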

bioimageio.spec 0.5.3

remove collection description
update SPDX license list
update generic description to 0.3.1
update model description to 0.5.3
add timeout argument to all requests.get calls

bioimageio.spec 0.5.2post5

added more information to validation summary
deprioritize Path objects in the FileSource union

bioimageio.spec 0.5.2post4

resolve backup DOIs
fix resolving relative file paths given as strings
allow to bypass download and hashing of known files

bioimageio.spec 0.5.2post3

avoid full download when validating urls

bioimageio.spec 0.5.2post2

resolve version (un)specific collection IDs, e.g. load_description('affable-shark'), load_description('affable-shark/1')

bioimageio.spec 0.5.2post1

fix model packaging with weights format priority

bioimageio.spec 0.5.2

new patch version model 0.5.2

bioimageio.spec 0.5.1

new patch version model 0.5.1

bioimageio.spec 0.5.0post2

don't fail if CI env var is a string

bioimageio.spec 0.5.0post1

fix _internal.io_utils.identify_bioimageio_yaml_file()

bioimageio.spec 0.5.0

new description formats: generic 0.3, application 0.3, collection 0.3, dataset 0.3, notebook 0.3 and model 0.5.
various API changes, most important functions:
- bioimageio.spec.load_description (replaces load_raw_resource_description, interface changed)
- bioimageio.spec.validate_format (new)
- bioimageio.spec.dump_description (replaces serialize_raw_resource_description_to_dict, interface changed)
- bioimageio.spec.update_format (interface changed)
switch from Marshmallow to Pydantic
- extended validation
- one joint, more precise JSON Schema

bioimageio.spec 0.4.9

small bugfixes
better type hints
improved tests

bioimageio.spec 0.4.8post1

add axes and eps to scale_mean_var

bioimageio.spec 0.4.7post1

add simple forward compatibility by treating future format versions as latest known (for the respective resource type)

bioimageio.spec 0.4.6post3

Make CLI output more readable
find redirected URLs when checking for URL availability

bioimageio.spec 0.4.6post2

Improve error message for non-existing RDF file path given as string
Improve documentation for model description's documentation field

bioimageio.spec 0.4.6post1

fix enrich_partial_rdf_with_imjoy_plugin (see #452)

bioimageio.spec 0.4.5post16

fix rdf_update of entries in resolve_collection_entries()

bioimageio.spec 0.4.5post15

pass root to enrich_partial_rdf arg of resolve_collection_entries()

bioimageio.spec 0.4.5post14

keep ResourceDescrption.root_path as URI for remote resources. This fixes the collection description as the collection entries are resolved after the collection description has been loaded.

bioimageio.spec 0.4.5post13

new bioimageio.spec.partner module adding validate-partner-collection command if optional 'lxml' dependency is available

bioimageio.spec 0.4.5post12

new env var BIOIMAGEIO_CACHE_WARNINGS_LIMIT (default: 3) to avoid spam from cache hit warnings
more robust conversion of ImportableSourceFile for absolute paths to relative paths (don't fail on non-path source file)

bioimageio.spec 0.4.5post11

resolve symlinks when transforming absolute to relative paths during serialization; see #438

bioimageio.spec 0.4.5post10

fix loading of collection description with id (id used to be ignored)

bioimageio.spec 0.4.5post9

support loading bioimageio resources by their animal nickname (currently only models have nicknames).

bioimageio.spec 0.4.5post8

any field previously expecting a local relative path is now also accepting an absolute path
load_raw_resource_description returns a raw resource description which has no relative paths (any relative paths are converted to absolute paths).

bioimageio.spec 0.4.4post7

add command commands.update_rdf()/update-rdf(cli)

bioimageio.spec 0.4.4post2

fix unresolved ImportableSourceFile

bioimageio.spec 0.4.4post1

fix collection description conversion for type field

bioimageio.spec 0.4.3post1

fix to shape validation for model description 0.4: output shape now needs to be bigger than halo
moved objects from bioimageio.spec.shared.utils to bioimageio.spec.shared[.node_transformer]
additional keys to validation summary: bioimageio_spec_version, status

bioimageio.spec 0.4.2post4

fixes to generic description:
- ignore value of field root_path if present in yaml. This field is used internally and always present in RDF nodes.

bioimageio.spec 0.4.1.post5

fixes to collection description:
- RDFs specified directly in collection description are validated correctly even if their source field does not point to an RDF.
- nesting of collection description allowed

bioimageio.spec 0.4.1.post4

fixed missing field icon in generic description's raw node
fixes to collection description:
- RDFs specified directly in collection description are validated correctly
- no nesting of collection description allowed for now
- links is no longer an explicit collection entry field ("moved" to unknown)

bioimageio.spec 0.4.1.post0

new model spec 0.3.5 and 0.4.1

bioimageio.spec 0.4.0.post3

load_raw_resource_description no longer accepts update_to_current_format kwarg (use update_to_format instead)

bioimageio.spec 0.4.0.post2

load_raw_resource_description accepts update_to_format kwarg

Resource Description Format Versions

general 0.3.1 and model 0.5.3

Non-breaking changes
- remove version_number in favor of using version

model 0.5.2

Non-breaking changes
- added concatenable flag to index, time and space input axes

model 0.5.1

Non-breaking changes
- added DataDependentSize for outputs.i.size to specify an output shape that is not known before inference is run.
- added optional inputs.i.optional field to indicate that a tensor may be None
- made data type assumptions in preprocessing and postprocessing explicit by adding 'ensure_dtype' operations per default.
- allow to specify multiple thresholds (along an axis) in a 'binarize' processing step

generic 0.3.0 / application 0.3.0 / collection 0.3.0 / dataset 0.3.0 / notebook 0.3.0

Breaking changes that are fully auto-convertible
- dropped download_url
- dropped non-file attachments
- attachments.files moved to attachments.i.source
Non-breaking changes
- added optional parent field
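
To make the auto-convertible changes listed above concrete, here is a rough sketch of how a 0.2-style description dictionary could be rewritten. It covers only the three breaking changes named above and does not touch format_version or any other field; `convert_attachments_and_download_url` is a hypothetical helper, not the converter shipped with bioimageio.spec.

```python
import copy
from typing import Any, Dict


def convert_attachments_and_download_url(rdf: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the three changes listed above to a copy of `rdf`.

    Hypothetical helper for illustration only.
    """
    out = copy.deepcopy(rdf)
    # dropped download_url
    out.pop("download_url", None)
    attachments = out.get("attachments")
    if isinstance(attachments, dict):
        # dropped non-file attachments; attachments.files -> attachments.i.source
        files = attachments.get("files", [])
        out["attachments"] = [{"source": f} for f in files]
    return out


# Usage with made-up file names and URL.
old = {
    "name": "example",
    "download_url": "https://example.com/package.zip",
    "attachments": {"files": ["a.txt", "b.csv"], "notes": ["not a file"]},
}
print(convert_attachments_and_download_url(old))
```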

model 0.5.0

all generic 0.3.0 changes (except models already have the parent field) plus:

Breaking changes that are partially auto-convertible
- inputs.i.axes are now defined in more detail (same for outputs.i.axes)
- inputs.i.shape moved per axes to inputs.i.axes.size (same for outputs.i.shape)
- new pre-/postprocessing 'fixed_zero_mean_unit_variance' separated from 'zero_mean_unit_variance', where mode=fixed is no longer valid. (for scalar values this is auto-convertible.)
Breaking changes that are fully auto-convertible
- changes in weights.pytorch_state_dict.architecture
  - renamed weights.pytorch_state_dict.architecture.source_file to ...architecture.source
- changes in weights.pytorch_state_dict.dependencies
  - only conda environment allowed and specified by weights.pytorch_state_dict.dependencies.source
  - new optional field weights.pytorch_state_dict.dependencies.sha256
- changes in weights.tensorflow_model_bundle.dependencies
  - same as changes in weights.pytorch_state_dict.dependencies
- moved test_inputs to inputs.i.test_tensor
- moved test_outputs to outputs.i.test_tensor
- moved sample_inputs to inputs.i.sample_tensor
- moved sample_outputs to outputs.i.sample_tensor
- renamed inputs.i.name to inputs.i.id
- renamed outputs.i.name to outputs.i.id
- renamed inputs.i.preprocessing.name to inputs.i.preprocessing.id
- renamed outputs.i.postprocessing.name to outputs.i.postprocessing.id
Non-breaking changes:
- new pre-/postprocessing: id='ensure_dtype' with kwarg dtype

generic 0.2.4 and model 0.4.10

Breaking changes that are fully auto-convertible
- id overwritten with value from config.bioimageio.nickname if available
Non-breaking changes
- version_number is a new, optional field indicating that an RDF is the nth published version with a given id
- id_emoji is a new, optional field (set from config.bioimageio.nickname_icon if available)
- uploader is a new, optional field with email and an optional name subfields

model 0.4.9

Non-breaking changes
- make pre-/postprocessing kwargs mode and axes always optional for model description 0.3 and 0.4

model 0.4.8

Non-breaking changes
- cite field is now optional

generic 0.2.2 and model 0.4.7

Breaking changes that are fully auto-convertible
- name field may not include '/' or '' (conversion removes these)

model 0.4.6

Non-breaking changes
- Implicit output shape can be expanded by inserting null into shape:scale and indicating length of new dimension D in the offset field. Keep in mind that D=2*'offset'.
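
A small worked sketch of this rule, assuming the usual implicit shape formula of model 0.4 (output size = reference size * scale + 2 * offset per dimension) and that a null scale entry introduces a new dimension of length 2 * offset. The function name `implicit_output_shape` is hypothetical and not part of bioimageio.spec.

```python
from typing import List, Optional, Sequence


def implicit_output_shape(
    reference_shape: Sequence[int],
    scale: Sequence[Optional[float]],
    offset: Sequence[float],
) -> List[int]:
    """Compute an output shape from a reference shape, scale and offset.

    Hypothetical helper; a None in `scale` stands for YAML null and inserts
    a new dimension of length D = 2 * offset.
    """
    if len(scale) != len(offset):
        raise ValueError("scale and offset must have the same length")
    if sum(s is not None for s in scale) != len(reference_shape):
        raise ValueError("need one non-null scale entry per reference dimension")
    ref_sizes = iter(reference_shape)
    shape = []
    for s, off in zip(scale, offset):
        # Only non-null scale entries consume a reference dimension.
        size = 2 * off if s is None else next(ref_sizes) * s + 2 * off
        if size != int(size):
            raise ValueError(f"non-integer output size {size}")
        shape.append(int(size))
    return shape


# Usage: insert a new dimension of length 3 (offset 1.5) after the first axis.
print(implicit_output_shape((1, 256, 256), [1, None, 1, 1], [0, 1.5, 0, 0]))
```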

model 0.4.5

Breaking changes that are fully auto-convertible
- parent field changed to hold a string that is a bioimage.io ID, a URL or a local relative path (and not subfields uri and sha256)

model 0.4.4

Non-breaking changes
- new optional field training_data

dataset 0.2.2

Non-breaking changes
- explicitly define and document dataset description (for now, clone of generic description with type="dataset")

model 0.4.3

Non-breaking changes
- add optional field download_url
- add optional field dependencies to all weight formats (not only pytorch_state_dict)
- add optional pytorch_version to the pytorch_state_dict and torchscript weight formats

model 0.4.2

Bug fixes:
- in a pytorch_state_dict weight entry architecture is no longer optional.

collection 0.2.2

Non-breaking changes
- make authors, cite, documentation and tags optional
Breaking changes that are fully auto-convertible
- Simplifies collection description 0.2.1 by merging resource type fields together to a collection field, holding a list of all resources in the specified collection.

generic 0.2.2 / model 0.3.6 / model 0.4.2

Non-breaking changes
- rdf_source new optional field
- id new optional field

collection 0.2.1

First official release, extends generic description with fields application, model, dataset, notebook and (nested) collection, which hold lists linking to respective resources.

generic 0.2.1

Non-breaking changes
- add optional email and github_user fields to entries in authors
- add optional maintainers field (entries like in authors but github_user is required (and name is not))

model 0.4.1

Breaking changes that are fully auto-convertible
- moved field dependencies to weights:pytorch_state_dict:dependencies
Non-breaking changes
- documentation field accepts URLs as well

model 0.3.5

Non-breaking changes
- documentation field accepts URLs as well

model 0.4.0

Breaking changes
- model inputs and outputs may not use duplicated names.
- model field sha256 is required if pytorch_state_dict weights are defined, and is now moved to the pytorch_state_dict entry as architecture_sha256.
Breaking changes that are fully auto-convertible
- model fields language and framework are removed.
- model field source is renamed architecture and is moved together with kwargs to the pytorch_state_dict weights entry (if it exists, otherwise they are removed).
- the weight format pytorch_script was renamed to torchscript.
Other changes
- model inputs (like outputs) may be defined by scaling and offsetting a reference_tensor
- a maintainers field was added to the model description.
- the entries in the authors field may now additionally contain email or github_user.
- the summary returned by the validate command now also contains a list of warnings.
- an update_format command was added to aid with updating older RDFs by applying auto-conversion.

model 0.3.4

Non-breaking changes
- Add optional parameter eps to scale_range postprocessing.

model 0.3.3

Breaking changes that are fully auto-convertible
- reference_input for implicit output tensor shape was renamed to reference_tensor

model 0.3.2

Breaking changes
- The RDF file name in a package should be rdf.yaml for all the RDF (not model.yaml);
- Change authors and packaged_by fields from List[str] to List[Author] with Author consisting of a dictionary {name: '<Full name>', affiliation: '<Affiliation>', orcid: 'optional orcid id'};
- Add a mandatory type field to comply with the generic description. Only valid value is 'model' for model description;
- Only allow license identifier from the SPDX license list;
Non-breaking changes
- Add optional version field (default 0.1.0) to keep track of model changes;
- Allow the values in the attachments list to be any values besides URI;

¹ JSON Schema based documentation generated with json-schema-for-humans.
² Part of the bioimageio.spec package documentation generated with pdoc.
