Updating README #71

Merged
merged 25 commits into from
Jun 21, 2022
123 changes: 102 additions & 21 deletions README.rst
@@ -36,7 +36,7 @@ conflicts when importing the Python module.
Usage
=====

The plugin is called from the command line using the ``omero metadata`` command::

    $ omero metadata <subcommand>

@@ -64,45 +64,90 @@ populate
--------

This command creates an ``OMERO.table`` (bulk annotation) from a ``CSV`` file and links
the table as a ``File Annotation`` to a parent container such as Screen, Plate, Project,
Dataset or Image. It also attempts to convert Image, Well or ROI names from the ``CSV`` into
object IDs in the ``OMERO.table``.

The ``CSV`` file must be provided as a local file with ``--file path/to/file.csv``.

OMERO.tables have defined column types that specify the data type of each column, such as ``double`` or ``long``, as well as special object types for storing OMERO object IDs, such as ``ImageColumn`` or ``WellColumn``.

The default behaviour of the script is to automatically detect the column types from an input ``CSV``. This behaviour works as follows:

* Columns named after a supported object type (e.g. ``plate``, ``well``, ``image``, ``dataset`` or ``roi``), optionally followed by ``id`` or ``name`` (``<object> id`` or ``<object> name``), will generate the corresponding column type in the OMERO.table. See the table below for the full list of supported column names.

============ ================= ==================== ====================================================================
Column Name  Column type       Detected Header Type Notes
============ ================= ==================== ====================================================================
Image        ``ImageColumn``   ``image``            Accepts image IDs. Appends new 'Image Name' column with image names.
Image Name   ``StringColumn``  ``s``                Accepts image names. Appends new 'Image' column with image IDs.
Image ID     ``ImageColumn``   ``image``            Accepts image IDs. Appends new 'Image Name' column with image names.
Dataset      ``DatasetColumn`` ``dataset``          Accepts dataset IDs.
Dataset Name ``StringColumn``  ``s``                Accepts dataset names.
Dataset ID   ``DatasetColumn`` ``dataset``          Accepts dataset IDs.
Plate        ``PlateColumn``   ``plate``            Accepts plate names. Adds new 'Plate' column with plate IDs.
Plate Name   ``PlateColumn``   ``plate``            Accepts plate names. Adds new 'Plate' column with plate IDs.
Plate ID     ``LongColumn``    ``l``                Accepts plate IDs.
Well         ``WellColumn``    ``well``             Accepts well names. Adds new 'Well' column with well IDs.
Well Name    ``WellColumn``    ``well``             Accepts well names. Adds new 'Well' column with well IDs.
Well ID      ``LongColumn``    ``l``                Accepts well IDs.
ROI          ``RoiColumn``     ``roi``              Accepts ROI IDs. Appends new 'ROI Name' column with ROI names.
ROI Name     ``StringColumn``  ``s``                Accepts ROI names. Appends new 'ROI' column with ROI IDs.
ROI ID       ``RoiColumn``     ``roi``              Accepts ROI IDs. Appends new 'ROI Name' column with ROI names.
============ ================= ==================== ====================================================================

Note: Column names are case-insensitive. A space, no separator, and an underscore are all accepted between the words of a column name (i.e. ``<object> name``/``<object> id``, ``<object>name``/``<object>id`` and ``<object>_name``/``<object>_id`` are all accepted).

NB: Column names should not contain spaces if you want to be able to query by these columns.
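The name-matching rules above can be illustrated with a short sketch. This is not the plugin's own code: the dictionary ``OBJECT_COLUMN_TYPES`` and both helper functions are hypothetical names, and the mapping is simply transcribed from the table above.

```python
import re

# Detected header types keyed by normalized column name, transcribed from the
# table above. These names are illustrative and not part of the plugin's API.
OBJECT_COLUMN_TYPES = {
    "image": "image", "imagename": "s", "imageid": "image",
    "dataset": "dataset", "datasetname": "s", "datasetid": "dataset",
    "plate": "plate", "platename": "plate", "plateid": "l",
    "well": "well", "wellname": "well", "wellid": "l",
    "roi": "roi", "roiname": "s", "roiid": "roi",
}


def normalize_column_name(name):
    """Lower-case the name and drop spaces and underscores."""
    return re.sub(r"[\s_]+", "", name.strip().lower())


def object_header_type(name):
    """Return the header type for an object column, or None for other columns."""
    return OBJECT_COLUMN_TYPES.get(normalize_column_name(name))
```

Under these assumptions, ``Image Name``, ``image_name`` and ``ImageName`` all resolve to the same entry, while a data column such as ``ROI_Area`` returns ``None`` and would be left to data-based detection.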

* All other column types will be detected based on the column's data using the pandas library. See table below.

=============== ================= ====================
Column Name     Column type       Detected Header Type
=============== ================= ====================
Example String  ``StringColumn``  ``s``
Example Long    ``LongColumn``    ``l``
Example Float   ``DoubleColumn``  ``d``
Example boolean ``BoolColumn``    ``b``
=============== ================= ====================
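To give an intuition for data-based detection, here is a minimal standard-library sketch. It is only an approximation: the plugin relies on pandas, whose inference rules are richer, and the function name ``infer_header_type`` is hypothetical.

```python
def infer_header_type(values):
    """Guess a header type code ('l', 'd', 'b' or 's') from string cells.

    Simplified stand-in for pandas-based detection, not the plugin's logic.
    """
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return "s"  # nothing to inspect: fall back to the string type
    if all(c.lower() in ("true", "false") for c in cells):
        return "b"
    try:
        for c in cells:
            int(c)
        return "l"  # every cell parsed as an integer
    except ValueError:
        pass
    try:
        for c in cells:
            float(c)
        return "d"  # every cell parsed as a floating point number
    except ValueError:
        return "s"  # anything else is treated as text
```

A column mixing integers and decimals falls through to ``d``, and a single non-numeric cell is enough to make the whole column ``s``.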


However, it is possible to manually define the header types, ignoring the automatic header detection, if a ``CSV`` with a ``# header`` row is passed. The ``# header`` row should be the first row of the CSV and defines columns according to the following list (see examples below):

- ``d``: ``DoubleColumn``, for floating point numbers
- ``l``: ``LongColumn``, for integer numbers
- ``s``: ``StringColumn``, for text
- ``b``: ``BoolColumn``, for true/false
- ``plate``, ``well``, ``image``, ``dataset``, ``roi`` to specify objects

Automatic header detection can also be disabled with the ``--manual_headers`` flag. If this flag is used and no ``# header`` row is present, column types will default to ``String`` (unless the column names correspond to OMERO objects such as ``image`` or ``plate``).
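For manual definition, the ``# header`` row can be added by hand or with a small script. The sketch below is one possible helper, not something shipped with the plugin: ``add_header_row`` and ``VALID_TYPES`` are hypothetical names, and the accepted codes are the ones listed above.

```python
import csv
import io

# Type codes accepted in a "# header" row, as listed above.
VALID_TYPES = {"d", "l", "s", "b", "plate", "well", "image", "dataset", "roi"}


def add_header_row(csv_text, types):
    """Return csv_text with a '# header' row prepended as its first line."""
    unknown = [t for t in types if t not in VALID_TYPES]
    if unknown:
        raise ValueError(f"unsupported header types: {unknown}")
    # The first CSV row holds the column names; require one type per column.
    column_names = next(csv.reader(io.StringIO(csv_text)))
    if len(column_names) != len(types):
        raise ValueError("expected one type per column")
    return "# header " + ",".join(types) + "\n" + csv_text
```

Checking the number of types against the number of columns catches the most likely mistake before the file is passed to ``omero metadata populate``.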


Examples
^^^^^^^^^

The examples below use the default automatic column type detection. The same results (or a different desired result) can be achieved by manually adding a custom ``# header`` row at the top of the ``CSV``.

**Project / Dataset**
^^^^^^^^^^^^^^^^^^^^^^

To add a table to a Project, the ``CSV`` file needs to specify ``Dataset Name`` or ``Dataset ID``
and ``Image Name`` or ``Image ID``::

    $ omero metadata populate Project:1 --file path/to/project.csv

Using ``Image Name`` and ``Dataset Name``:

project.csv::

    Image Name,Dataset Name,ROI_Area,Channel_Index,Channel_Name
    img-01.png,dataset01,0.0469,1,DAPI
    img-02.png,dataset01,0.142,2,GFP
    img-03.png,dataset01,0.093,3,TRITC
    img-04.png,dataset01,0.429,4,Cy5


The previous example will create an OMERO.table linked to the Project as follows, with
a new ``Image`` column containing the image IDs:

========== ============ ======== ============= ============ =====
@@ -114,23 +159,52 @@ img-03.png dataset01 0.093 3 TRITC 36640
img-04.png dataset01 0.429 4 Cy5 36641
========== ============ ======== ============= ============ =====

Note: this is equivalent to adding a ``# header s,s,d,l,s`` row to the top of ``project.csv`` for manual definition.

Using ``Image ID`` and ``Dataset ID``:

project.csv::

    image id,Dataset ID,ROI_Area,Channel_Index,Channel_Name
    36638,101,0.0469,1,DAPI
    36639,101,0.142,2,GFP
    36640,101,0.093,3,TRITC
    36641,101,0.429,4,Cy5


The previous example will create an OMERO.table linked to the Project as follows, with
a new ``Image Name`` column containing the image names:

===== ======= ======== ============= ============ ==========
Image Dataset ROI_Area Channel_Index Channel_Name Image Name
===== ======= ======== ============= ============ ==========
36638 101     0.0469   1             DAPI         img-01.png
36639 101     0.142    2             GFP          img-02.png
36640 101     0.093    3             TRITC        img-03.png
36641 101     0.429    4             Cy5          img-04.png
===== ======= ======== ============= ============ ==========

Note: this is equivalent to adding a ``# header image,dataset,d,l,s`` row to the top of ``project.csv`` for manual definition.

Alternatively, for both examples above, if the target is a Dataset instead of a Project, the ``Dataset`` or ``Dataset Name`` column is not needed.

**Screen / Plate**
^^^^^^^^^^^^^^^^^^^

To add a table to a Screen, the ``CSV`` file needs to specify the ``Plate`` name and the ``Well``.
If a ``# header`` is specified, the column types must be ``well`` and ``plate``::

    $ omero metadata populate Screen:1 --file path/to/screen.csv

screen.csv::

    Well,Plate,Drug,Concentration,Cell_Count,Percent_Mitotic
    A1,plate01,DMSO,10.1,10,25.4
    A2,plate01,DMSO,0.1,1000,2.54
    A3,plate01,DMSO,5.5,550,4
    B1,plate01,DrugX,12.3,50,44.43


This will create an OMERO.table linked to the Screen, with the
``Well Name`` and ``Plate Name`` columns added and the ``Well`` and
``Plate`` columns used for IDs:
@@ -146,29 +220,30 @@ Well Plate Drug Concentration Cell_Count Percent_Mitotic Well Name Plat

If the target is a Plate instead of a Screen, the ``Plate`` column is not needed.

Note: this is equivalent to adding a ``# header well,plate,s,d,l,d`` row to the top of ``screen.csv`` for manual definition.

**ROIs**
^^^^^^^^^

If the target is an Image or a Dataset, a ``CSV`` with ROI-level or Shape-level data can be used to create an
``OMERO.table`` (bulk annotation) as a ``File Annotation`` linked to the target object.
If there is an ``roi`` column (header type ``roi``) containing ROI IDs, an ``Roi Name``
column will be appended automatically (see example below). If a column of Shape IDs named ``shape``
of type ``l`` is included, the Shape IDs will be validated (and set to -1 if invalid).
Also, if an ``image`` column of Image IDs is included, an ``Image Name`` column will be added.
NB: Columns of type ``shape`` aren't yet supported on the OMERO.server::

    $ omero metadata populate Image:1 --file path/to/image.csv

image.csv::

    Roi,shape,object,probability,area
    501,1066,1,0.8,250
    502,1067,2,0.9,500
    503,1068,3,0.2,25
    503,1069,4,0.8,400
    503,1070,5,0.5,200


This will create an OMERO.table linked to the Image like this:

@@ -182,6 +257,12 @@ Roi shape object probability area Roi Name
503 1070 5 0.5 200 Sample3
=== ===== ====== =========== ==== ========

Note: this is equivalent to adding a ``# header roi,l,l,d,l`` row to the top of ``image.csv`` for manual definition.

Alternatively, if the target is an Image, the ROI input column can be
``Roi Name`` (with type ``s``), and an ``roi`` type column will be appended containing ROI IDs.
In this case, it is required that ROIs on the Image in OMERO have the ``Name`` attribute set.

Note that the ROI-level data from an ``OMERO.table`` is not visible
in the OMERO.web UI right-hand panel under the ``Tables`` tab,
but the table can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.
@@ -204,4 +285,4 @@ licensed under the terms of the GNU General Public License (GPL) v2 or later.
Copyright
---------

2018-2022, The Open Microscopy Environment and Glencoe Software, Inc