Revamp color metadata #62

mtbc · 2020-08-17T08:14:30Z

Split masks

For split masks the microservice specifies "dtype":"|b1" and the Zarr spec defines a Boolean data type as,

"b": Boolean (integer type where all values are only True or False)

and describes how metadata is encoded using JSON. However, in JSON, Boolean types are all-lowercase, no initial capital letter as above.

The microservice takes the JSON Boolean idea in offering, e.g., "fill_value":false and "color":{"true":-2139062144}. However, it's possible that it's meant to do a more-Pythonic initial-capital thing, or just use integers? (With 0 as false and not-0 as true? #60 (comment) suggests 1.)

Such changes are trivial, it's just very unclear. If anybody arrives at certainty, happy to open a fixing PR.

The text was updated successfully, but these errors were encountered:

mtbc · 2020-08-17T08:20:09Z

(Basically, from the point of view of developing in Java, it's generally not fun to try to interpret the Zarr v2 spec. 😕 Compressors are another thorny bit.)

manics · 2020-08-17T15:33:17Z

I'd have thought we'd want to stick to using JSON types where possible, so native booleans rather than strings representing booleans.
What does the "true" in "color":{"true":-2139062144} mean?

mtbc · 2020-08-17T15:40:42Z

I was surprised by that too. I'm adapting it from https://github.com/ome/omero-ms-zarr/blob/6add6804/spec.md#user-content-color, don't know why it needs stringifying though.

manics · 2020-08-17T15:44:39Z

Is it an opaque string, and the fact it happens to be a number is irrelevant? i.e. each object has a string label that is associated with a color?

mtbc · 2020-08-17T15:46:07Z

I think the image's pixel values have to fit the dtype as at https://zarr.readthedocs.io/en/stable/spec/v2.html#data-type-encoding.

manics · 2020-08-17T15:49:00Z

Must be because JSON only allows string keys

manics · 2020-08-17T15:50:49Z

Though I'd argue true/false should be represented as 1/0.

Perhaps this should be an array where array-index = object id, and if an object is missing it's set to null ?

mtbc · 2020-08-17T16:01:01Z

Aha, that'll be it, thank you. Could be we decide to opt for 1/0 for split masks, it's an easy change to make to the microservice. I wonder if that violates what Zarr readers expect for |b1 re. fill value, etc.? And all-bits-set is nice because it's more obvious in napari, I haven't figured the buttons to make it have 1/0 perceptible, sigh. Could make the not-0 value configurable.

joshmoore · 2020-08-19T06:52:20Z

cf:

@manics: can you outline the trade-offs between the two representations. The first proposal (#65) is to have the colors in an array the size of the labels, right?

  "color": [
    null,
    8388736,
    ...
  ]

with the downside that choosing large label values leads to large arrays. (One option would be to put this array in zarr rather than json?)

The other proposal (#68) is basically to make the color metadata more explicit, right?

  "color": [
    {
      "label": 1,
      "rgba": 8388736
    },
    {
      "label": 2,
      "rgba": 10101010
    },

and then allowing to potentially update the key in the object to be something other than "rgba", e.g. "hex" or whatever?

manics · 2020-08-19T08:14:23Z

Yes, pretty much what you said. The other benefit of the second option is it's extensible to include other metadata about a label, for instance you could add a name/description to each label. In this case it might be better to rename color to meta or metadata.

I think we should also change the current default colour model from "an int in whatever format OMERO uses" to a more understandable 4-tuple: [0-255, 0-255, 0-255, 0-255], or a CSS style hex value aabbccdd if you're concerned about the size of the field.

manics · 2020-08-19T09:58:02Z

Decisions from todays Zoom call:

Agreed:

Default to storing colour as an RGBA unsigned-int 4-tuple: [uint8, uint8, uint8, uint8], this does not preclude supporting other formats in future

Rejected:

The non-sparse array format

To be decided:

Keep the existing layout:

"color": {
  "1": [255, 255, 255, 0],
  "4": [0, 255, 255, 128],
  ...
}

Other metadata will be stored in a different top-level key, but with the same structure.

Sparse list of dicts:

"label-metadata": [
  {
    "label-value": 1,
    "rgba": [255, 255, 255, 0]
  },
  {
    "label-value": 4,
    "rgba": [0, 255, 255, 128]
  },
  ...
]

Sparse dict of dicts where key is the label-value converted to a string:

"label-metadata": {
  "1": {
    "rgba": [255, 255, 255, 0]
  },
  "4": {
    "rgba": [0, 255, 255, 128]
  },
  ...
}

will-moore · 2020-08-19T10:29:06Z

I prefer the first of the 3 options (most concise), but I guess the complexity of the labels parsing we're doing now for ome-zarr-py won't be very much different for any of these alternatives:

https://github.com/ome/ome-zarr-py/blob/208952c770c5d3683a8a8339173c0bc6ec158e54/ome_zarr.py#L276-L284

manics · 2020-08-19T10:35:32Z

Just noticed a bug in https://github.com/ome/ome-zarr-py/blob/208952c770c5d3683a8a8339173c0bc6ec158e54/ome_zarr.py#L280-L281 bool(non-empty-string) is always True, so bool("false") == True

joshmoore · 2020-08-20T08:09:37Z

Fixed in ome/ome-zarr-py@db43af5

k-dominik · 2020-08-20T09:11:50Z

If I may add my two cents.

sparse list of dicts: it is very explicit, support for a different color model is trivial, label names can be different types than strings. But the question is whether this freedom is desired? If not, than the current layout is probably fine. What I like about the other two approaches is that they are more human readable. I understand them without reading the specs. Wouldn't be so sure about the current layout.

joshmoore · 2020-09-01T07:34:55Z

Currently working on a set of PRs to test out the following

 24 ### "image-label" metadata
 23
 22 Groups containing the `image-label` dictionary represent an image segmentation
 21 in which each unique pixel value represents a separate segmented object.
 20 `image-label` groups MUST also contain `multiscales` metadata and the two
 19 "datasets" series MUST have the same number of entries.
 18
 17 The `colors` key defines a list of JSON objects describing the unique label
 16 values. Each entry in the list MUST contain the key "label-value" with the
 15 pixel value for that label. Additionally, the "rgba" key MAY be present, the
 14 value for which is an RGBA unsigned-int 4-tuple: `[uint8, uint8, uint8, uint8]`
 13 In the case that the same `label-value` is present multiple times in the list,
 12 the last value wins.
 11
 10 Some implementations may represent overlapping labels by using a specially assigned
  9 value, for example the highest integer available in the pixel range.
  8
  7 The `source` key is an optional dictionary which contains information on the
  6 image the label is associated with. If included it MAY include a key `image`
  5 whose value is the relative path to a Zarr image group. The default value is
  4 "../../" since most labels are stored under a subgroup named "labels/" (see
  3 above).
  2
  1
161 ```
  1 "image-label":
  2   {
  3     "version": "0.1",
  4     "colors": [
  5       {
  6         "label-value": 1,
  7         "rgba": [255, 255, 255, 0]
  8       },
  9       {
 10         "label-value": 4,
 11         "rgba": [0, 255, 255, 128]
 12       },
 13       ...
 14       ]
 15     },
 16     "source": {
 17       "image": "../../"
 18     }
 19 ]
 20 ```

joshmoore · 2020-09-01T11:45:20Z

Reading through https://qupath.readthedocs.io/en/latest/docs/advanced/exporting_annotations.html#binary-labeled-images, one wonders if the label-values need to be per channel.

joshmoore · 2020-09-07T06:27:09Z

image-label proposal is now opened in:

Separately, also considering an array (rather than JSON metadata) similar to this example:

import numpy as np
import zarr

t = [
    ("label-value", int),
    ("r", int),
    ("g", int),
    ("b", int),
    ("a", int),
    ("object-type", "U20"),
    ("object-id", int),
    ("description", "U200")]

data = list()
for x in range(100):
    data.append((1, 100, 100, 100, 100, "Mask", 123456, "some text here"))
    data.append((2, 200, 200, 200, 200, "Mask", 567896, "some more text"))
a = np.array(data, dtype=t)

z = zarr.open("s.zarr")
z.array(name="a", data=a, chunks=(10,))

imagesc-bot · 2020-10-01T12:01:16Z

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/multi-scale-image-labels-v0-1/43483/1

This was referenced Aug 18, 2020

Labels: change colors to array ome/omero-cli-zarr#23

Closed

Mask color array #65

Closed

Labels: change colors to array of dict ome/omero-cli-zarr#24

Closed

joshmoore changed the title ~~Figure out color metadata for split masks~~ Revamp color metadata Aug 19, 2020

joshmoore mentioned this issue Aug 19, 2020

Labeled multiscales #70

Merged

sbesson mentioned this issue Sep 8, 2020

Write up new image-label proposal #71

Merged

joshmoore mentioned this issue Nov 25, 2020

Multi-scaled image labels ome/ngff#14

Closed

sbesson mentioned this issue Mar 1, 2022

More clarifications of the "image-label" spec ome/ngff#105

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revamp color metadata #62

Revamp color metadata #62

mtbc commented Aug 17, 2020 •

edited by joshmoore

Loading

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

manics commented Aug 17, 2020 •

edited

Loading

mtbc commented Aug 17, 2020

joshmoore commented Aug 19, 2020

manics commented Aug 19, 2020 •

edited

Loading

manics commented Aug 19, 2020 •

edited

Loading

will-moore commented Aug 19, 2020

manics commented Aug 19, 2020

joshmoore commented Aug 20, 2020

k-dominik commented Aug 20, 2020

joshmoore commented Sep 1, 2020

joshmoore commented Sep 1, 2020

joshmoore commented Sep 7, 2020

imagesc-bot commented Oct 1, 2020

Revamp color metadata #62

Revamp color metadata #62

Comments

mtbc commented Aug 17, 2020 • edited by joshmoore Loading

Split masks

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

mtbc commented Aug 17, 2020

manics commented Aug 17, 2020

manics commented Aug 17, 2020 • edited Loading

mtbc commented Aug 17, 2020

joshmoore commented Aug 19, 2020

manics commented Aug 19, 2020 • edited Loading

manics commented Aug 19, 2020 • edited Loading

Decisions from todays Zoom call:

Agreed:

Rejected:

To be decided:

Keep the existing layout:

Sparse list of dicts:

Sparse dict of dicts where key is the label-value converted to a string:

will-moore commented Aug 19, 2020

manics commented Aug 19, 2020

joshmoore commented Aug 20, 2020

k-dominik commented Aug 20, 2020

joshmoore commented Sep 1, 2020

joshmoore commented Sep 1, 2020

joshmoore commented Sep 7, 2020

imagesc-bot commented Oct 1, 2020

mtbc commented Aug 17, 2020 •

edited by joshmoore

Loading

manics commented Aug 17, 2020 •

edited

Loading

manics commented Aug 19, 2020 •

edited

Loading

manics commented Aug 19, 2020 •

edited

Loading