The case for moving client side groupings (search results, shopping carts, ad hoc groupings) to server side collections by using abstractions of file systems #42

rudolphpienaar · 2023-06-02T18:04:21Z

rudolphpienaar
Jun 2, 2023
Maintainer

Definitions

The "ChRIS" ecosystem consists of many components. For the purposes of this discussion, ChRIS refers to the whole system, the ChRIS_ui only to the client side user interface, and CUBE to the core backend.

Proposition

CUBE works best with objects grouped into collections akin to UNIX filesystems -- i.e. files and directories. What if everything ChRIS does were fit into this CUBE-side metaphor? In other words, what if operations that are often / traditionally considered client side were shifted server side to leverage this world view?

What if everything in ChRIS (specifically things that are currently artifacts of the ChRIS UI) were supported by CUBE as server-side abstractions, i.e. as files grouped into directories?

Qualification

I don't want to get caught up in the practicalities of this, such as whether these live in swift object storage or in the base filesystem that CUBE uses, how long registration takes, etc. I think @jennydaman made the observation that this can all be done within the CUBE database and be functionally/experientially equivalent to filesystem files/directories anyway.

Summary

CUBE (aka the ChRIS backend) is natively built to work with "file" objects. It in fact explicitly names objects using UNIX filesystem conventions, i.e. with full directory paths. "File" objects are thus naturally grouped into collections (like directories) just by the purposeful accident of their naming. CUBE provides several APIs to explore/interact with file objects based on a UNIX filesystem metaphor. Thus, "files" and operations on "files" lie at the heart of CUBE, and are what I'll call a "first class" abstraction.
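To make the "grouping by naming" point concrete, here is a minimal Python sketch. It is not CUBE code: the object names and the `group_by_directory` helper are invented for illustration, and the only assumption is that objects carry full UNIX-style paths as described above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(paths):
    """Group full object paths by their parent 'directory' prefix."""
    directories = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # The directory is never stored anywhere; it falls out of the name.
        directories[str(path.parent)].append(path.name)
    return dict(directories)


# Hypothetical object names, invented for this example.
paths = [
    "home/alice/uploads/scan-001.dcm",
    "home/alice/uploads/scan-002.dcm",
    "home/alice/feeds/feed_1/report.txt",
]
listing = group_by_directory(paths)
```

Nothing here creates a directory explicitly; the collections exist purely because of how the objects are named.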

Internally, CUBE "stores" file objects in swift storage and then "registers" them to an internal database. Files only exist to CUBE if they are "registered" in its database.

CUBE does not consider ad-hoc groupings of files as first class. Ad hoc groupings of files are the results of operations like searching, selecting, tagging, etc. These have traditionally been seen as client side and so CUBE somewhat artificially draws a distinction in how it returns/handles these. Instead of grouping these as collections of "files", CUBE returns the results of ad-hoc operations as groupings of "descriptions" or "file names". A search does not actually gather a bunch of files into a "directory", but returns a list of names of objects. There is nothing wrong with this and it is in fact quite logical.

Background

"Collections of things" in the ChRIS ecosystem can be created in one of two ways:

causal/formal -- the results of deterministic operations on fixed sets of inputs that result in predictable resultant collections -- traditionally handled server side. These always create new sets/data and have traditionally been the purview of ChRIS plugins.
ad-hoc/informal -- the results of arbitrary operations on fixed sets of inputs that result in (possibly) unpredictable resultant collections -- traditionally handled client side. These always use existing sets/data and simply re-collect them in a new grouping. The one exception to this is the client side file upload that results in directories/files within CUBE.

At the heart of ChRIS, CUBE is a system primarily built for managing causal collections, where the deterministic operators are plugins that create new data. These are mathematically analogous to functions that consume input sets and produce output sets. CUBE is good at "doing things" with groupings of sets -- whatever functionality a plugin can be coded to provide on an input set, CUBE will manage.
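The "plugins as functions on sets" analogy can be sketched in a few lines of Python. This is purely illustrative: the `anonymize` and `segment` plugins and the `run_pipeline` helper are hypothetical names, and real plugins operate on file contents rather than on names.

```python
from typing import Callable, FrozenSet, List

FileSet = FrozenSet[str]
Plugin = Callable[[FileSet], FileSet]


def anonymize(inputs: FileSet) -> FileSet:
    """Hypothetical plugin: one new output per input, under 'anon/'."""
    return frozenset(f"anon/{name}" for name in inputs)


def segment(inputs: FileSet) -> FileSet:
    """Hypothetical plugin: one new '.seg' output per input."""
    return frozenset(f"{name}.seg" for name in inputs)


def run_pipeline(inputs: FileSet, plugins: List[Plugin]) -> List[FileSet]:
    """Apply plugins in order, keeping every intermediate collection."""
    collections = [inputs]
    for plugin in plugins:
        collections.append(plugin(collections[-1]))
    return collections


source = frozenset({"a.dcm", "b.dcm"})
first = run_pipeline(source, [anonymize, segment])
second = run_pipeline(source, [anonymize, segment])
```

The same input set always yields the same chain of output sets, and each step creates a new collection rather than regrouping an existing one. That is the "causal" property.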

The ad-hoc groupings were historically considered somewhat arbitrary and secondary. These are handled client-side by elements such as the ChRIS_ui and include user-driven groupings -- say "searching", "gathering", etc. In fact, most client side groupings are really only one class of operation: organizing or filtering descriptors of existing data into new collections.

Such organizing of ad-hoc "collections of things" within the context of the ChRIS UI seems intuitively to be a client-side task. Consider something like a "file selection" picker (abstracted currently as a "shopping cart" in the UI). Similarly, consider the results of a search operation across the object space. A search returns "hits". Since these collections are never formally defined by CUBE in a "directory", CUBE has to return an ephemeral "descriptor" of things, i.e. a list of things that constitute the group, and the client needs to maintain/track this state.

Unintended consequences

The first unintended consequence is obvious from the above. The state of ad-hoc groupings must be maintained by the client. Do a page refresh? Your state is gone. Log out/log in? Gone is your state. Think about how to track this client side? Cookies? Client-side database? More complexity. More maintenance. More failures.

The other set of unintended consequences follows from the observation that the space/context of ad-hoc groupings is potentially unbounded and "problem-at-hand" specific. Let's consider three obvious cases (there could be more, but I think these make the point):

Searching

A "search" can return lots of "hits" that the client must make sense of in some fashion. Once you have these hits "grouped" client side, what then? All operations on this group have to be coded specifically for this context. Can you download all these hits? No, you have to code client side handling. Can you start a new analysis on these hits? No, you have to code client side behaviour.

Sub-grouping, aka shopping cart

Now, consider gathering a sub-set selection from these search hits, resulting in another, possibly different, set-specific grouping. Any client side behaviour that was coded for the set of search hits is not necessarily easily transferable to this new contextual sub grouping. This new sub-grouping of search hits is now called a "shopping cart", and every operation that can be done in that shopping cart has to be coded specifically for that cart. This means cart-specific "download". Cart-specific "start a new feed". Cart-specific "show images". Since the cart is a different "thing" than the search hits, more abstraction/duplication of client side handling code has to be constructed. Can the cart be leveraged in other places of the UI? Not necessarily easily.

New feed creation

Similarly, consider creating a new Feed. Again this is a client-side "gathering" operation with no leverage/assistance from the backend. The client constructs a new context, shows information, and a user creates a sub grouping that exists only in the client. This new/unique grouping is again "different" to a shopping cart, "different" to a search, and so again complicates/duplicates behaviour, since code written to handle search-groupings, selection-groupings, etc is not applicable here.

What if these cases were shifted server side?

Let's now do a thought experiment and imagine if the above cases were handled server side. I'm going to use words like "files" and "directories" but these are not necessarily meant to imply actual objects in swift storage. They could simply be new entries in a DB table as if they were file objects in swift. The point is to "imagine" what it would mean to collect into a directory without getting too hung up right now on how that would happen.

Let's also imagine that we have well defined behaviors on CUBE "directories":

upload new files into a CUBE directory
delete files from a CUBE directory
download a CUBE directory
create a new feed from a CUBE directory (and possibly intermingle with an FS plugin)
view a CUBE directory with various client-side viewers that can intelligently render files (like DICOM, jpeg, etc)

(of course, some directories are "locked" -- typically those that result from DS plugins)
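As a rough sketch of what such directory behaviours might look like, here is a toy in-memory model in Python. Everything in it is an assumption made for illustration: the `Directory` class, its method names, and `LockedDirectoryError` are hypothetical and are not part of any existing CUBE API.

```python
from dataclasses import dataclass, field
from typing import Dict


class LockedDirectoryError(Exception):
    """Raised when a write is attempted on a locked directory."""


@dataclass
class Directory:
    # All names here are hypothetical; this is not CUBE's API.
    path: str
    files: Dict[str, bytes] = field(default_factory=dict)
    locked: bool = False  # e.g. a directory produced by a DS plugin

    def _require_writable(self) -> None:
        if self.locked:
            raise LockedDirectoryError(self.path)

    def upload(self, name: str, data: bytes) -> None:
        self._require_writable()
        self.files[name] = data

    def delete(self, name: str) -> None:
        self._require_writable()
        del self.files[name]

    def download(self) -> Dict[str, bytes]:
        # Reading is allowed even when locked; return a copy.
        return dict(self.files)

    def create_feed(self, title: str) -> Dict[str, object]:
        # A feed is modeled as a title plus a snapshot of file names.
        return {"title": title, "files": sorted(self.files)}


cart = Directory("home/alice/cart")
cart.upload("scan-001.dcm", b"\x00\x01")
cart.upload("notes.txt", b"hello")
cart.delete("notes.txt")
feed = cart.create_feed("My analysis")
```

The point of the sketch is that the behaviours are written once, against "a directory", regardless of whether that directory started life as an upload, a search result, or a cart.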

Searching

Performing a "search" is already supported by CUBE across several internal APIs. The results of these searches are returned to the client/caller in the form of some "meta" description -- a list of JSON objects, or a comma-separated string of hits (the details are unimportant). Imagine if, instead, CUBE created a new "directory", put all the "hits" (i.e. files) in that directory, and then simply returned that "directory" name to the client. The client can now treat this as a directory, and all client code that already exists to work with "files" in "directories" is immediately available.

As a rather useful bonus, that search result, i.e. that directory, now persists and is stateful independent of the client. Since it is a CUBE "directory", client side operations can add more files, delete files, etc etc. Don't want the search results anymore? Delete the directory.
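A minimal sketch of this idea, using a toy in-memory registry in place of CUBE. The `Store` class, its methods, and the `home/<owner>/searches/...` naming scheme are all assumptions invented for this example; the hits are recorded as references to existing paths, in line with the earlier point that these need not be real objects in swift.

```python
import itertools


class Store:
    """Toy stand-in for CUBE's file registry (hypothetical, in-memory)."""

    def __init__(self, paths):
        self.paths = set(paths)
        self.directories = {}  # search directory -> set of hit paths
        self._ids = itertools.count(1)

    def search(self, term, owner):
        """Collect hits into a new directory and return only its name."""
        hits = {p for p in self.paths if term in p}
        directory = f"home/{owner}/searches/search_{next(self._ids)}"
        self.directories[directory] = hits
        return directory

    def ls(self, directory):
        return sorted(self.directories.get(directory, ()))

    def rm(self, directory):
        """Don't want the search results anymore? Delete the directory."""
        self.directories.pop(directory, None)


store = Store([
    "home/alice/uploads/brain-001.dcm",
    "home/alice/uploads/brain-002.dcm",
    "home/alice/uploads/knee-001.dcm",
])
result_dir = store.search("brain", owner="alice")
hits = store.ls(result_dir)
```

Because the result lives in the store rather than in the client, a page refresh or a new login only needs the directory name to get back to the same hits.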

Sub-grouping

This is again really just a specialized search on the space of search hits. So by adding functionality to CUBE where selections in a directory can easily be "copied" on the backend to a new directory, the problem is solved. This new directory is a directory like any other, so it persists, can be revisited, can have new contents added, existing contents deleted, etc.
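The backend "copy a selection to a new directory" operation could be as small as the following sketch. The `copy_selection` function and the directory names are hypothetical, and directories are modeled simply as sets of file paths.

```python
from typing import Dict, Iterable, Set


def copy_selection(
    dirs: Dict[str, Set[str]], src: str, selection: Iterable[str], dst: str
) -> str:
    """Copy selected entries of directory `src` into directory `dst`."""
    chosen = set(selection)
    missing = chosen - dirs[src]
    if missing:
        raise ValueError(f"not in {src}: {sorted(missing)}")
    # Create the destination on first use; later selections add to it.
    dirs.setdefault(dst, set()).update(chosen)
    return dst


# Hypothetical state: one search-result directory with three hits.
dirs = {
    "home/alice/searches/search_1": {"a.dcm", "b.dcm", "c.dcm"},
}
cart = copy_selection(
    dirs, "home/alice/searches/search_1", ["a.dcm", "c.dcm"], "home/alice/cart"
)
```

The "shopping cart" is then nothing more than a directory with a conventional name, and the source directory is left untouched.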

New feed creation

This could in fact become a built-in behavior on directories supported by the backend. New feed creation is currently the grouping of three progenitors:

Uploads from a user filesystem
Selection of files from existing ChRIS storage
Running an arbitrary FS plugin

(Importantly, the union of all the above has not yet been implemented in the current ChRIS_ui).

So let's imagine creating a new Feed from a server-side directory viewpoint:

First, a user can start by "selecting" existing files to collect into the server side directory. This can be zero or many.
Secondly, using the directory from the first step, the user can easily upload new files into this directory.
(As an aside, since these are all server side directories, the client side code for deleting files just naturally comes into play.)
Finally, the user can "create" a new feed from this directory.

As part of "creating" a new feed, it could be possible/simple to offer the selection of an FS plugin whose output is collected together with the directory the user just created. We have the machinery to do this already.
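The steps above can be sketched as follows. This is a toy model under stated assumptions: the staging directory is a plain set of paths, `create_feed` is a hypothetical function, and the FS plugin is stood in for by any callable that returns file paths.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def create_feed(
    title: str,
    staged: Set[str],
    fs_plugin: Optional[Callable[[], Iterable[str]]] = None,
) -> Dict[str, object]:
    """Build a feed from a staged directory plus optional FS plugin output."""
    contents = set(staged)
    if fs_plugin is not None:
        # The plugin's output is collected together with the directory.
        contents |= set(fs_plugin())
    return {"title": title, "files": sorted(contents)}


# Step 1: select existing files (zero or many) into a staging directory.
staged = {"home/alice/uploads/scan-001.dcm", "home/alice/uploads/old.dcm"}
# Step 2: upload new files into the same directory.
staged.add("home/alice/staging/new-scan.dcm")
# Aside: deleting is just the ordinary directory operation.
staged.discard("home/alice/uploads/old.dcm")
# Step 3: create the feed, here with a hypothetical FS plugin.
feed = create_feed("demo", staged, fs_plugin=lambda: ["pacs/study-7/img.dcm"])
```

All three progenitors end up in one grouping without any feed-specific client state.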

Other directory built-ins:

The other server side directory built-in is "downloading". The old UI was completely hamstrung by multiple different client side contexts each trying to download from CUBE. If CUBE has "built in" behaviours on directories, then download could be simply "create a new feed off this directory and attach the zip plugin".
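As an illustration of "download is just a plugin run on a directory", here is a sketch using Python's standard `zipfile` module. The `zip_plugin` and `download_directory` functions are hypothetical stand-ins; the actual zip plugin and feed machinery are not shown.

```python
import io
import zipfile
from typing import Dict


def zip_plugin(files: Dict[str, bytes]) -> bytes:
    """Hypothetical stand-in for a zip plugin: archive a directory's files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()


def download_directory(directory: Dict[str, bytes]) -> bytes:
    """Download = run the zip plugin on a snapshot of the directory."""
    return zip_plugin(dict(directory))


blob = download_directory({"scan-001.dcm": b"\x00\x01", "report.txt": b"ok"})
names = zipfile.ZipFile(io.BytesIO(blob)).namelist()
```

Any directory, whatever its origin, gets "download" for free through the same single code path.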

Conclusion

I anticipate some resistance to this idea. I do hope/invite you to carefully examine your resistance (if you have some!) and ask yourself if that resistance is a "gut reaction" to a "new way of thinking about things", or if you are not seeing the forest for the trees (like getting stuck on swift or filesystems etc).

I look forward to debate/critiques of this idea first. Technicalities can come downstream!

Thanks for reading

-30-
