Updating README #71

Merged
merged 25 commits into from
Jun 21, 2022
Changes from 17 commits
110 changes: 96 additions & 14 deletions README.rst
@@ -36,7 +36,7 @@ conflicts when importing the Python module.
Usage
=====

The plugin is called from the command-line using the `omero` command::
The plugin is called from the command-line using the ``omero metadata`` command::

$ omero metadata <subcommand>

@@ -64,45 +64,91 @@ populate
--------

This command creates an ``OMERO.table`` (bulk annotation) from a ``CSV`` file and links
the table as a ``File Annotation`` to a parent container such as Screen, Plate, Project
the table as a ``File Annotation`` to a parent container such as Screen, Plate, Project,
Dataset or Image. It also attempts to convert Image, Well or ROI names from the ``CSV`` into
object IDs in the ``OMERO.table``.

The ``CSV`` file must be provided as a local file with ``--file path/to/file.csv``.
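
As a minimal sketch, the invocation can also be assembled from a script before it is handed to ``subprocess.run``. The helper name ``build_populate_command`` is hypothetical and not part of the plugin; only the ``omero metadata populate <Type>:<ID> --file <path>`` form and the list of parent containers come from this README.

```python
# Parent containers named in this README as valid targets for the table.
SUPPORTED_TARGETS = ("Screen", "Plate", "Project", "Dataset", "Image")


def build_populate_command(target_type, target_id, csv_path):
    """Return the argv list for ``omero metadata populate`` (hypothetical helper)."""
    if target_type not in SUPPORTED_TARGETS:
        raise ValueError(f"Unsupported target type: {target_type!r}")
    if int(target_id) <= 0:
        raise ValueError("target_id must be a positive integer")
    return [
        "omero", "metadata", "populate",
        f"{target_type}:{int(target_id)}",
        "--file", str(csv_path),
    ]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to execute it.
    print(" ".join(build_populate_command("Project", 1, "path/to/project.csv")))
```

Building the command as a list rather than a single string avoids shell quoting problems when the ``CSV`` path contains spaces.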

If you wish to ensure that ``number`` columns are created for numerical data, this will
allow you to make numerical queries on the table.
Column Types are:
OMERO.tables have defined column types that specify each column's data type, such as ``double`` or ``long``, as well as special object types for storing OMERO object IDs, such as ``ImageColumn`` or ``WellColumn``.

The default behaviour of the script is to automatically detect the column types from an input ``CSV``. This behaviour works as follows:

* Columns named after a supported object type (e.g. ``plate``, ``well``, ``image``, ``dataset``, or ``roi``), either on its own or as ``<object> id`` or ``<object> name``, will generate the corresponding column type in the OMERO.table. See the table below for the full list of supported column names.

============ ================= ==================== ==================================
Column Name Column type Detected Header Type Notes
============ ================= ==================== ==================================
Image ``ImageColumn`` ``image`` Appends 'Image Name' column
Image Name ``StringColumn`` ``s`` Appends 'Image' column
Image ID ``ImageColumn`` ``image`` Appends 'Image Name' column
Dataset ``DatasetColumn`` ``dataset`` \-
Dataset Name ``StringColumn`` ``s`` \-
Dataset ID ``DatasetColumn`` ``dataset`` \-
Plate ``PlateColumn`` ``plate`` Adds 'Plate' column
Review comment (Member):

I think this should be "Adds a 'Plate Name' column" (from the example below)?

Reply (Member Author):

The plate column assumes the data is plate names, not plate IDs. Since plate names are provided, omero metadata adds a new plate column for plate IDs. The confusion comes from the way it adds the column: the original column is replaced, and whatever was replaced is then appended to the end of the table. Thus, technically, only the plate ID column is added, since the original was the plate name. I can attempt to reword the table to say what type of column is added (i.e. names or IDs).

Reply (sbesson, Member, May 19, 2022):

I am a bit torn between 1- it's important for ourselves to capture knowledge about the current behavior (and eventually to fix it) and 2- users probably do not care :)

If rewording ends up being too tricky, I could imagine using the current wording and adding a footnote.

Reply (Member Author):

@sbesson @will-moore Coming back to this, I would add a footnote but there are already multiple notes under the table.
After much brainstorming, I gave it a good attempt at adding more information to the table (in 04eda4e) while capturing knowledge of the current behavior, staying minimalistic, and hopefully improving clarity for the reader/user. Feedback is welcome on this change (as I've been staring at it for too long), and I am happy to revert if you think this is worse or more confusing.

Plate Name ``PlateColumn`` ``plate`` Adds 'Plate' column
Plate ID ``LongColumn`` ``l`` \-
Well ``WellColumn`` ``well`` Adds 'Well' column
Well Name ``WellColumn`` ``well`` Adds 'Well' column
Well ID ``LongColumn`` ``l`` \-
ROI ``RoiColumn`` ``roi`` Appends 'ROI Name' column
ROI Name ``StringColumn`` ``s`` \-
ROI ID ``RoiColumn`` ``roi`` Appends 'ROI Name' column
============ ================= ==================== ==================================

Note: Column names are case-insensitive. The object and the ``name``/``id`` suffix may be separated by a space, an underscore, or nothing at all (i.e. ``<object> name``/``<object> id``, ``<object>name``/``<object>id``, and ``<object>_name``/``<object>_id`` are all accepted).

* All other column types will be detected based on the column's data using the pandas library. See table below.

=============== ================= ====================
Column Name Column type Detected Header Type
=============== ================= ====================
Example String ``StringColumn`` ``s``
Example Long ``LongColumn`` ``l``
Example Float ``DoubleColumn`` ``d``
Example boolean ``BoolColumn`` ``b``
=============== ================= ====================
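
The two detection rules above can be sketched as follows. This is an illustration only, not the plugin's implementation: the plugin relies on pandas to infer types from the data, whereas this sketch substitutes plain Python casts so that it has no dependencies, and the function names (``normalize``, ``detect_by_name``, ``detect_by_values``, ``detect_header``) are hypothetical. Edge cases such as missing values may be handled differently by the real detection.

```python
import re

# Object types that can be recognised from the column name alone.
OBJECT_TYPES = ("image", "dataset", "plate", "well", "roi")


def normalize(name):
    """Lower-case a column name and drop space/underscore separators."""
    return re.sub(r"[ _]+", "", name.strip().lower())


def detect_by_name(name):
    """Return the header type implied by the column name, or None."""
    key = normalize(name)
    for obj in OBJECT_TYPES:
        if key in (obj, obj + "id"):
            # Per the table above, 'Plate ID' and 'Well ID' are plain longs.
            return "l" if key in ("plateid", "wellid") else obj
        if key == obj + "name":
            # Plate/Well names map to object columns; other names stay strings.
            return obj if obj in ("plate", "well") else "s"
    return None


def detect_by_values(values):
    """Guess 'b', 'l', 'd' or 's' from the string cells of one column."""
    cells = [v.strip() for v in values if v.strip() != ""]
    if not cells:
        return "s"
    if all(cell.lower() in ("true", "false") for cell in cells):
        return "b"
    for header_type, cast in (("l", int), ("d", float)):
        try:
            for cell in cells:
                cast(cell)
        except ValueError:
            continue  # at least one cell does not parse; try the next type
        return header_type
    return "s"


def detect_header(columns):
    """Map {column name: list of cell strings} to a '# header' row."""
    types = [detect_by_name(n) or detect_by_values(v) for n, v in columns.items()]
    return "# header " + ",".join(types)


if __name__ == "__main__":
    screen = {
        "Well": ["A1", "A2"],
        "Plate": ["plate01", "plate01"],
        "Drug": ["DMSO", "DMSO"],
        "Concentration": ["10.1", "0.1"],
        "Cell_Count": ["10", "1000"],
        "Percent_Mitotic": ["25.4", "4"],
    }
    print(detect_header(screen))  # prints: # header well,plate,s,d,l,d
```

Name-based detection is tried first, so a column such as ``Image ID`` becomes an object column even though its values would otherwise be detected as integers.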


However, it is possible to manually define the header types, ignoring the automatic header detection, if a ``CSV`` with a ``# header`` row is passed. The ``# header`` row should be the first row of the CSV and defines columns according to the following list (see examples below):

- ``d``: ``DoubleColumn``, for floating point numbers
- ``l``: ``LongColumn``, for integer numbers
- ``s``: ``StringColumn``, for text
- ``b``: ``BoolColumn``, for true/false
- ``plate``, ``well``, ``image``, ``dataset``, ``roi`` to specify objects

These can be specified in the first row of a ``CSV`` with a ``# header`` tag (see examples below).
The ``# header`` row is optional. Default column type is ``String``.
Automatic header detection can also be ignored if using the ``--manual_headers`` flag. If the ``# header`` is not present and this flag is used, column types will default to ``String`` (unless the column names correspond to OMERO objects such as ``image`` or ``plate``).
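
To make the ``# header`` convention concrete, here is a minimal sketch that reads the first two rows of a ``CSV`` and pairs each column name with its declared column type. The function ``read_manual_header`` is hypothetical and is not the plugin's parser; the type codes and column type names are the ones listed in this README.

```python
import csv
import io

# Header codes and the column types they stand for, as listed above.
HEADER_TYPES = {
    "d": "DoubleColumn",
    "l": "LongColumn",
    "s": "StringColumn",
    "b": "BoolColumn",
    "plate": "PlateColumn",
    "well": "WellColumn",
    "image": "ImageColumn",
    "dataset": "DatasetColumn",
    "roi": "RoiColumn",
}


def read_manual_header(csv_text):
    """Return [(column name, column type)] from a CSV starting with '# header'."""
    rows = csv.reader(io.StringIO(csv_text))
    codes = next(rows)
    if not codes or not codes[0].startswith("# header"):
        raise ValueError("first row is not a '# header' row")
    # The tag and the first code share one cell: '# header s' -> 's'.
    codes[0] = codes[0][len("# header"):]
    codes = [code.strip() for code in codes]
    names = next(rows)
    if len(codes) != len(names):
        raise ValueError("'# header' row and column names differ in length")
    unknown = [code for code in codes if code not in HEADER_TYPES]
    if unknown:
        raise ValueError(f"unknown header types: {unknown}")
    return [(name, HEADER_TYPES[code]) for name, code in zip(names, codes)]


if __name__ == "__main__":
    text = (
        "# header s,s,d,l,s\n"
        "Image Name,Dataset Name,ROI_Area,Channel_Index,Channel_Name\n"
        "img-01.png,dataset01,0.0469,1,DAPI\n"
    )
    for name, column_type in read_manual_header(text):
        print(f"{name}: {column_type}")
```

Checking that the number of codes matches the number of column names catches the most common mistake when writing a ``# header`` row by hand.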

NB: Column names should not contain spaces if you want to be able to query
by these columns.


Examples
^^^^^^^^^

The examples below will use the default automatic column types detection behaviour. It is possible to achieve the same results (or a different desired result) by manually adding a custom ``# header`` row at the top of the CSV.

**Project / Dataset**
^^^^^^^^^^^^^^^^^^^^^^

To add a table to a Project, the ``CSV`` file needs to specify ``Dataset Name``
To add a table to a Project, the ``CSV`` file needs to specify ``Dataset Name`` or ``Dataset ID``
and ``Image Name`` or ``Image ID``::

$ omero metadata populate Project:1 --file path/to/project.csv

Using ``Image Name`` and ``Dataset Name``:

project.csv::

# header s,s,d,l,s
Image Name,Dataset Name,ROI_Area,Channel_Index,Channel_Name
img-01.png,dataset01,0.0469,1,DAPI
img-02.png,dataset01,0.142,2,GFP
img-03.png,dataset01,0.093,3,TRITC
img-04.png,dataset01,0.429,4,Cy5


This will create an OMERO.table linked to the Project like this with
The previous example will create an OMERO.table linked to the Project as follows with
a new ``Image`` column with IDs:

========== ============ ======== ============= ============ =====
@@ -114,23 +160,52 @@ img-03.png dataset01 0.093 3 TRITC 36640
img-04.png dataset01 0.429 4 Cy5 36641
========== ============ ======== ============= ============ =====

Note: this is equivalent to adding a ``# header s,s,d,l,s`` row to the top of ``project.csv`` to define the column types manually.
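
The name-to-ID conversion in this example can be pictured with the following sketch. It is an assumption-laden illustration: the dictionary ``IMAGE_IDS`` stands in for a lookup against the OMERO server, the function ``append_image_column`` is hypothetical, and the IDs are simply those shown in the tables of this README.

```python
import csv
import io

# Hypothetical stand-in for a server query: (dataset name, image name) -> image ID.
IMAGE_IDS = {
    ("dataset01", "img-01.png"): 36638,
    ("dataset01", "img-02.png"): 36639,
    ("dataset01", "img-03.png"): 36640,
    ("dataset01", "img-04.png"): 36641,
}


def append_image_column(csv_text, image_ids):
    """Return (fieldnames, rows) with an 'Image' ID column appended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["Image"]
    rows = []
    for row in reader:
        key = (row["Dataset Name"], row["Image Name"])
        if key not in image_ids:
            raise KeyError(f"No image named {key[1]!r} in dataset {key[0]!r}")
        row["Image"] = image_ids[key]
        rows.append(row)
    return fieldnames, rows


if __name__ == "__main__":
    text = (
        "Image Name,Dataset Name,ROI_Area,Channel_Index,Channel_Name\n"
        "img-01.png,dataset01,0.0469,1,DAPI\n"
        "img-02.png,dataset01,0.142,2,GFP\n"
    )
    fieldnames, rows = append_image_column(text, IMAGE_IDS)
    print(fieldnames)
    for row in rows:
        print([row[name] for name in fieldnames])
```

Keying the lookup on both the dataset name and the image name reflects why a Project target needs a ``Dataset Name`` column: image names alone may not be unique across the Project.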

Using ``Image ID`` and ``Dataset ID``:

project.csv::

image id,Dataset ID,ROI_Area,Channel_Index,Channel_Name
36638,101,0.0469,1,DAPI
36639,101,0.142,2,GFP
36640,101,0.093,3,TRITC
36641,101,0.429,4,Cy5


The previous example will create an OMERO.table linked to the Project as follows with
a new ``Image`` column with Names:

===== ======= ======== ============= ============ ==========
Image Dataset ROI_Area Channel_Index Channel_Name Image Name
===== ======= ======== ============= ============ ==========
36638 101 0.0469 1 DAPI img-01.png
36639 101 0.142 2 GFP img-02.png
36640 101 0.093 3 TRITC img-03.png
36641 101 0.429 4 Cy5 img-04.png
===== ======= ======== ============= ============ ==========

If the target is a Dataset instead of a Project, the ``Dataset Name`` column is not needed.

Note: this is equivalent to adding a ``# header image,dataset,d,l,s`` row to the top of ``project.csv`` to define the column types manually.

**Screen / Plate**
^^^^^^^^^^^^^^^^^^^

To add a table to a Screen, the ``CSV`` file needs to specify ``Plate`` name and ``Well``.
If a ``# header`` is specified, column types must be ``well`` and ``plate``.
If a ``# header`` is specified, column types must be ``well`` and ``plate``::

$ omero metadata populate Screen:1 --file path/to/screen.csv

screen.csv::

# header well,plate,s,d,l,d
Well,Plate,Drug,Concentration,Cell_Count,Percent_Mitotic
A1,plate01,DMSO,10.1,10,25.4
A2,plate01,DMSO,0.1,1000,2.54
A3,plate01,DMSO,5.5,550,4
B1,plate01,DrugX,12.3,50,44.43


This will create an OMERO.table linked to the Screen, with the
``Well Name`` and ``Plate Name`` columns added and the ``Well`` and
``Plate`` columns used for IDs:
@@ -146,7 +221,10 @@ Well Plate Drug Concentration Cell_Count Percent_Mitotic Well Name Plat

If the target is a Plate instead of a Screen, the ``Plate`` column is not needed.

Note: this is equivalent to adding a ``# header well,plate,s,d,l,d`` row to the top of ``screen.csv`` to define the column types manually.
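
If a manual definition is preferred, the ``# header`` row can be prepended with a few lines of Python. The helper ``add_header_row`` below is a hypothetical sketch, not part of the plugin; it only checks that the number of type codes matches the number of columns.

```python
import csv


def add_header_row(csv_text, types):
    """Prepend a '# header' row after checking it matches the column count."""
    lines = csv_text.splitlines()
    if not lines:
        raise ValueError("CSV text is empty")
    if lines[0].startswith("# header"):
        raise ValueError("CSV already has a '# header' row")
    # Parse the column-name row with the csv module so quoted commas are handled.
    n_columns = len(next(csv.reader([lines[0]])))
    if len(types) != n_columns:
        raise ValueError(f"expected {n_columns} types, got {len(types)}")
    return "\n".join(["# header " + ",".join(types)] + lines) + "\n"


if __name__ == "__main__":
    screen_csv = (
        "Well,Plate,Drug,Concentration,Cell_Count,Percent_Mitotic\n"
        "A1,plate01,DMSO,10.1,10,25.4\n"
    )
    print(add_header_row(screen_csv, ["well", "plate", "s", "d", "l", "d"]), end="")
```

The result can be written to a new file and passed to ``--file`` as usual.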

**ROIs**
^^^^^^^^^

If the target is an Image or a Dataset, a ``CSV`` with ROI-level or Shape-level data can be used to create an
``OMERO.table`` (bulk annotation) as a ``File Annotation`` linked to the target object.
@@ -158,17 +236,19 @@ NB: Columns of type ``shape`` aren't yet supported on the OMERO.server.

Alternatively, if the target is an Image, the ROI input column can be
``Roi Name`` (with type ``s``), and an ``roi`` type column will be appended containing ROI IDs.
In this case, it is required that ROIs on the Image in OMERO have the ``Name`` attribute set.
In this case, it is required that ROIs on the Image in OMERO have the ``Name`` attribute set::

$ omero metadata populate Image:1 --file path/to/image.csv

image.csv::

# header roi,l,l,d,l
Roi,shape,object,probability,area
501,1066,1,0.8,250
502,1067,2,0.9,500
503,1068,3,0.2,25
503,1069,4,0.8,400
503,1070,5,0.5,200


This will create an OMERO.table linked to the Image like this:

@@ -182,6 +262,8 @@ Roi shape object probability area Roi Name
503 1070 5 0.5 200 Sample3
=== ===== ====== =========== ==== ========

Note: this is equivalent to adding a ``# header roi,l,l,d,l`` row to the top of ``image.csv`` to define the column types manually.

Note that the ROI-level data from an ``OMERO.table`` is not visible
in the OMERO.web UI right-hand panel under the ``Tables`` tab,
but the table can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.