Scivision architecture and scope? #168

ghost · 2022-02-16T22:14:54Z

Thanks for involving me in the co-working session today. Here is a bit of a ramble in case it helps with design deliberations:

The scope of Scivision still seems a bit vague to me. Is it a library? Is it an app? Is it a catalogue? Is it a gallery? Is it a model manager? etc.

Sometimes being clear about what's out of scope helps to focus on what's in scope, i.e. which of the following plugin types will Scivision support in the future, and which will it leave out?

Connectors to image data sources (files, urls, h5 stores)
Connectors to model zoos
Transformers that take an image and convert it into another image (e.g. RGB to grayscale, resize, edge detector)
Pixel classifiers (e.g. Vedge detector)
Image classifiers (e.g. plankton)
Object detectors (e.g. YOLO3)
Segmenters
Visualisers (e.g. histogram)
Evaluators (e.g. a confusion matrix)

We can't expect users to refactor their models, so an adapter design pattern with a very specific, well-defined interface seems essential for plugins.

Languages such as Java and C# make interfaces explicit and may provide some design inspiration.

The design of the interface is difficult because interfaces are task-specific and tasks are subtly different:

E.g. the plankton model is an image classifier that takes an image and gives a predicted class label. However, the Vedge detector is a pixel classifier that takes an image and returns another image with pixels marked according to probability.

Other models may perform object localisation (e.g. YOLO3), where the image might include three cats and a dog. In that case the input is an image and the output is four bounding boxes.

So all of them work on images, but no two share a return type.
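One way to picture task-specific interfaces is a minimal Python sketch using typing.Protocol, with one protocol per task and an adapter that wraps existing model code without changing it. Every name here (Image, BoundingBox, ImageClassifier, PlanktonAdapter, legacy_plankton_model) is hypothetical and not taken from the Scivision codebase.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple


# All names in this sketch are hypothetical illustrations.
@dataclass(frozen=True)
class Image:
    width: int
    height: int
    bands: int
    pixels: Tuple[float, ...]  # flat tuple, length width * height * bands


@dataclass(frozen=True)
class BoundingBox:
    label: str
    x: int
    y: int
    width: int
    height: int


class ImageClassifier(Protocol):
    """Image in, one class label out (e.g. a plankton classifier)."""
    def predict(self, image: Image) -> str: ...


class PixelClassifier(Protocol):
    """Image in, image of per-pixel scores out (e.g. an edge detector)."""
    def predict(self, image: Image) -> Image: ...


class ObjectDetector(Protocol):
    """Image in, a sequence of bounding boxes out."""
    def predict(self, image: Image) -> Sequence[BoundingBox]: ...


def legacy_plankton_model(raw_pixels):
    """Stand-in for a contributor's existing code, left unmodified."""
    return "copepod" if sum(raw_pixels) > 0 else "detritus"


class PlanktonAdapter:
    """Adapts the legacy function to the ImageClassifier interface."""
    def predict(self, image: Image) -> str:
        return legacy_plankton_model(list(image.pixels))


def classify_all(model: ImageClassifier, images: Sequence[Image]) -> List[str]:
    """Core code depends only on the interface, never on the wrapped model."""
    return [model.predict(image) for image in images]
```

Because each task has its own protocol, a static type checker can reject an object detector passed where an image classifier is expected, while the contributor's model code stays untouched behind the adapter.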

To add further complexity, images may have different resolutions and may be monochrome or multiband. How will you enforce image type? E.g. Vedge may only work with 4-band images.
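One possible answer to the image-type question is for each plugin to declare what it accepts and for the framework to check that declaration before calling the model. The sketch below assumes hypothetical ImageSpec and ImageInfo types; the 4-band requirement for Vedge is only the example raised above, not a confirmed property of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSpec:
    """What a plugin declares it can accept (hypothetical)."""
    bands: int
    min_width: int = 1
    min_height: int = 1


@dataclass(frozen=True)
class ImageInfo:
    """Metadata of the image a user wants to process (hypothetical)."""
    width: int
    height: int
    bands: int


def check_compatible(info: ImageInfo, spec: ImageSpec) -> None:
    """Raise ValueError with a specific message if the image does not fit."""
    if info.bands != spec.bands:
        raise ValueError(f"expected {spec.bands} bands, got {info.bands}")
    if info.width < spec.min_width or info.height < spec.min_height:
        raise ValueError(
            f"image {info.width}x{info.height} is smaller than the "
            f"minimum {spec.min_width}x{spec.min_height}"
        )


# Assumption for illustration only: a detector that needs 4-band input.
VEDGE_SPEC = ImageSpec(bands=4)
```

Checking at the adapter boundary means a mismatched image fails early with a readable message rather than deep inside the model.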

It's usually a mistake to have liberal interfaces such as def predict(*args), because the signature then tells callers nothing about what the function accepts.

We talked about object-oriented versus functional programming style.

Having written and maintained some large object-oriented systems (albeit not in Python), I am wary that heavily object-oriented code often leads to accidental complexity, and I am reminded of the following:

"I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle. If you have referentially transparent code, if you have pure functions - all the data comes in its input arguments and everything goes out and leave no state behind — it's incredibly reusable." - Joe Armstrong, creator of Erlang writing on software reusability in Coders at Work by Peter Seibel.

However, I have seen "object-based" code (no inheritance, immutable classes) work well in Python.
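As an illustration of what "object-based" could look like in Python, here is a small sketch using frozen dataclasses and composition instead of inheritance. Scale, Threshold and Pipeline are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple

Pixels = Tuple[float, ...]


@dataclass(frozen=True)
class Scale:
    """Multiply every pixel by a fixed factor."""
    factor: float

    def __call__(self, pixels: Pixels) -> Pixels:
        return tuple(p * self.factor for p in pixels)


@dataclass(frozen=True)
class Threshold:
    """Map each pixel to 1.0 if it reaches the level, otherwise 0.0."""
    level: float

    def __call__(self, pixels: Pixels) -> Pixels:
        return tuple(1.0 if p >= self.level else 0.0 for p in pixels)


@dataclass(frozen=True)
class Pipeline:
    """Composes steps; it holds them rather than inheriting from them."""
    steps: Tuple[Callable[[Pixels], Pixels], ...]

    def __call__(self, pixels: Pixels) -> Pixels:
        for step in self.steps:
            pixels = step(pixels)
        return pixels


pipeline = Pipeline(steps=(Scale(2.0), Threshold(1.0)))

# "Changing" a setting builds a new object; the original is untouched.
stricter = replace(Threshold(1.0), level=1.5)
```

Each object is just immutable configuration plus a pure function of its input, so steps can be tested in isolation and recombined freely.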

Somebody mentioned the singleton design pattern, but in my experience it can make code hard to test and doesn't allow object composition. I don't think it's helpful here unless there are specific resource management constraints, like protecting access to a GPU.

I really like the catalogue idea, but it ought to be versioned separately from the code.

Some similar patterns that may provide inspiration are:

QGIS and plugins
Julia and Julia Packages
NixOS and the Nix store

These all keep the core code separate from the "catalogue", allowing wider and more liberal contributions to the catalogue while the code itself has a much smaller set of maintainers. They all have to deal with versioning constraints.

It follows that it should be up to contributors to host their adapters in their own repos and the catalogue entry is just a pointer.
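A catalogue entry that is "just a pointer" might look something like the sketch below. The JSON layout, field names, version strings and the example.org URL are all assumptions made up for illustration; the catalogue carries its own version, independent of the core code.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CatalogueEntry:
    """A pointer to an adapter hosted in the contributor's own repo."""
    name: str
    task: str
    repo_url: str
    adapter_version: str


@dataclass(frozen=True)
class Catalogue:
    version: str  # version of the catalogue itself, not of the core code
    entries: Dict[str, CatalogueEntry]


def load_catalogue(text: str) -> Catalogue:
    """Parse catalogue JSON into entries keyed by name."""
    raw = json.loads(text)
    entries = {item["name"]: CatalogueEntry(**item) for item in raw["entries"]}
    return Catalogue(version=raw["catalogue_version"], entries=entries)


# Hypothetical catalogue content; the URL is a placeholder.
EXAMPLE = """
{
  "catalogue_version": "0.1.0",
  "entries": [
    {
      "name": "plankton-classifier",
      "task": "image-classification",
      "repo_url": "https://example.org/contributor/plankton-adapter",
      "adapter_version": "1.2.0"
    }
  ]
}
"""

catalogue = load_catalogue(EXAMPLE)
```

Since the catalogue holds no model code, it can accept contributions liberally while the core package keeps a small set of maintainers.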

Consider "convention over configuration" to make models quicker to plug in.

The idea is that if the user doesn't specify any (or much) configuration and the system just works with sane defaults, then you can be up and running quickly.

E.g. if a user doesn't specify an entry point in JSON, perhaps default to "predict"?

Could you introspect plugins at runtime to work out entry points etc?
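A rough sketch of how the default entry point and runtime introspection could fit together, using only the standard library's inspect module. The resolve_entry_point function, the "entry_point" config key and the rule of "exactly one required positional argument" are assumptions for illustration, not Scivision behaviour.

```python
import inspect
from types import SimpleNamespace

DEFAULT_ENTRY_POINT = "predict"  # the convention used when nothing is configured


def resolve_entry_point(plugin, config=None):
    """Return the plugin's entry point, falling back to the default name."""
    name = (config or {}).get("entry_point", DEFAULT_ENTRY_POINT)
    func = getattr(plugin, name, None)
    if not callable(func):
        raise LookupError(f"plugin has no callable named {name!r}")

    # Introspect the signature: reject *args/**kwargs and require exactly
    # one positional parameter without a default (the image).
    params = list(inspect.signature(func).parameters.values())
    variadic = [
        p for p in params
        if p.kind in (inspect.Parameter.VAR_POSITIONAL,
                      inspect.Parameter.VAR_KEYWORD)
    ]
    required = [
        p for p in params
        if p.default is inspect.Parameter.empty
        and p.kind in (inspect.Parameter.POSITIONAL_ONLY,
                       inspect.Parameter.POSITIONAL_OR_KEYWORD)
    ]
    if variadic or len(required) != 1:
        raise TypeError(f"{name!r} must take exactly one required argument")
    return func


# A stand-in plugin object; a real one might be an imported module.
plugin = SimpleNamespace(
    predict=lambda image: "cat",
    run=lambda image, threshold=0.5: "dog",
    loose=lambda *args: None,
)

default_func = resolve_entry_point(plugin)                       # uses "predict"
custom_func = resolve_entry_point(plugin, {"entry_point": "run"})
```

With this approach a plugin that follows the convention needs no configuration at all, and one that deviates only has to name its entry point.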

I think it would be instructive to consider more use cases before finalising the adapter design.

It's hard to design reusable code patterns until you've seen a fairly large number of repetitive cases and can understand the essence of the commonality.

I'd like a clearer idea of the value proposition for Scivision.

For Martin the attraction is discoverability - getting his model to a wider audience.

For Cefas the attraction is ease of use of models, e.g. can we deploy the plankton classifier so that a technician could use it to sort some images without writing any (or much) code?

What is Scivision's relationship to PIL, OpenCV, SciML etc?

If it proves too difficult to design a generic framework, perhaps Scivision could become a searchable gallery of example notebooks instead?

This could satisfy the requirement of discoverability and maintain maximum flexibility. It would avoid all the gnarly versioning issues and it seems to be your current focus?

acocac (Maintainer) · 2022-02-17T11:20:42Z

Thanks Robert for your inputs. They're all very valuable to the early stage design and scope of scivision.
