v0.6.17

@cmungall cmungall released this 01 Oct 15:57
adb26c6

Highlights

Fine-grained cache management for sqlite downloads

OAK uses the pystow framework to cache downloads; one of the main uses is caching the sqlite builds of each ontology. Previously, it was left to the user to go into their cache directory and remove stale files.
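To illustrate the kind of manual cleanup this refers to, here is a minimal sketch that lists cached files older than a given age. It uses only the standard library; the helper name `find_stale_files` and the cache location `~/.data/oaklib` are assumptions made for illustration, and the real directory depends on how pystow is configured.

```python
import time
from pathlib import Path


def find_stale_files(cache_dir: Path, max_age_days: float) -> list[Path]:
    """Return files under cache_dir last modified more than max_age_days ago."""
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in cache_dir.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever your cache actually lives.
    cache_dir = Path.home() / ".data" / "oaklib"
    if cache_dir.is_dir():
        for path in find_stale_files(cache_dir, max_age_days=30):
            print(path)
```

The sketch only reports candidates; deleting them is left to the user.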

This release provides finer-grained control, e.g. via the global --caching option on the command line.

See

For full details

Credit: @gouttegd

Additional OWL graph projections

We now include support for the SubClassOf-hasValue pattern, exemplified by OBI's relationship between sequencers and manufacturers.

obi relationships MiSeq

yields:

subject      predicate        object       subject_label  predicate_label     object_label
OBI:0002003  rdfs:subClassOf  OBI:0400103  MiSeq          None                DNA sequencer
OBI:0002003  OBI:0000304      OBI:0000759  MiSeq          is_manufactured_by  Illumina
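The idea behind this projection can be sketched with a toy model. The classes and the `project` function below are hypothetical and are not OAK's implementation; they only show how a SubClassOf axiom whose superclass is a hasValue restriction can be flattened into a single edge, alongside the ordinary named-class case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubClassOfNamed:
    """SubClassOf(sub, sup), where both are named classes."""
    sub: str
    sup: str


@dataclass(frozen=True)
class SubClassOfHasValue:
    """SubClassOf(sub, ObjectHasValue(prop, individual))."""
    sub: str
    prop: str
    individual: str


def project(axioms):
    """Project each supported axiom onto a (subject, predicate, object) edge."""
    edges = []
    for ax in axioms:
        if isinstance(ax, SubClassOfNamed):
            edges.append((ax.sub, "rdfs:subClassOf", ax.sup))
        elif isinstance(ax, SubClassOfHasValue):
            # The restriction's property becomes the edge predicate.
            edges.append((ax.sub, ax.prop, ax.individual))
    return edges


axioms = [
    SubClassOfNamed("OBI:0002003", "OBI:0400103"),
    SubClassOfHasValue("OBI:0002003", "OBI:0000304", "OBI:0000759"),
]
for edge in project(axioms):
    print(edge)
```

The two axioms mirror the two rows in the output above, using the same identifiers.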

We also now include a test ontology for graph projections that could form part of a general test suite outside of OAK.

The OAK guide now includes a section on graph projections:

https://incatools.github.io/ontology-access-kit/guide/relationships-and-graphs.html#further-notes-on-owl-and-graph-projection

New validate-subset command

The default metrics used for evaluation involve calculating the degree of overlap between members of the
subset. Subsets in general should partition the ontology into sets that overlap as little as possible.

Different overlap metrics can be plugged in; see the information-content methods for more details.
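As a rough illustration of what an overlap metric might look like, the sketch below computes the Jaccard index between the closures (for example, descendant sets) of each pair of subset members. Jaccard is chosen here purely as an example and is not necessarily the default OAK uses; the function names and the `X:` identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_overlap(closures: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Score every unordered pair of subset members by the overlap of their closures."""
    return {
        (s, t): jaccard(closures[s], closures[t])
        for s, t in combinations(sorted(closures), 2)
    }


# Toy closures: each subset member maps to the set of terms it covers.
closures = {
    "X:1": {"X:1", "X:3", "X:4"},
    "X:2": {"X:2", "X:4", "X:5"},
}
for pair, score in pairwise_overlap(closures).items():
    print(pair, score)
```

A subset that partitions the ontology cleanly would yield scores close to zero for every pair.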

The simplest way to run this is to pass in a list of terms via a subset query:

runoak -i po.db validate-subset -p i,p .in Tomato

You can also calculate IC scores for each term and pass them in via a file:

runoak -i amigo:NCBITaxon:9606 information-content -o human-ic.tsv

Then

runoak -i go.db validate-subset -p i,p .in goslim_generic --information-content-file human-ic.tsv
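For readers unfamiliar with IC scores, the following sketch shows the standard definition, IC(t) = -log2(p(t)), where p(t) is the fraction of annotated entities associated with term t. The function name and the toy counts are assumptions for illustration; this is not the code OAK runs, and it says nothing about the layout of the TSV file.

```python
import math


def information_content(counts: dict[str, int], total: int) -> dict[str, float]:
    """Compute IC = -log2(count / total) for every term with a non-zero count."""
    return {
        term: -math.log2(count / total)
        for term, count in counts.items()
        if count > 0
    }


# Toy data: number of annotated entities per term, out of 8 entities overall.
counts = {"T:a": 2, "T:b": 1}
for term, ic in information_content(counts, total=8).items():
    print(term, ic)
```

Rarer terms receive higher scores: here a term covering 2 of 8 entities has an IC of 2.0, and one covering 1 of 8 has an IC of 3.0.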

This command also understands the GO subset metadata format. You can use this as configuration for
validating multiple subsets:

runoak -i go.db validate-subset --config-yaml go_subsets_metadata.yaml -X "i^BFO:" -O yaml

The taxon field is used to validate each subset according to its appropriate context.

What's Changed

Full Changelog: v0.6.16...v0.6.17