Merge pull request #4 from Evizero/utils
Utility operations for reshape, permutedims, and channel separation
Evizero authored Jun 15, 2017
2 parents 4514d61 + 4037873 commit 1ae9e33
Showing 12 changed files with 750 additions and 12 deletions.
12 changes: 10 additions & 2 deletions NEWS.md
@@ -2,6 +2,14 @@

New operations:

- `CropRatio`: crop to the specified aspect ratio around the center.
- `CropRatio`: Crop to the specified aspect ratio around the center.

- `RCropRatio`: crop random window with the specified aspect ratio.
- `RCropRatio`: Crop random window with the specified aspect ratio.

- `SplitChannels`: Separate the color channels into a dedicated array dimension.

- `CombineChannels`: Collapse the first dimension into a specific colorant.

- `PermuteDims`: Reorganize the array dimensions into a specific order.

- `Reshape`: Change or reinterpret the shape of the array.
6 changes: 5 additions & 1 deletion README.md
@@ -202,7 +202,7 @@ look at the corresponding section of the
[documentation](http://augmentorjl.readthedocs.io/en/latest/usersguide/operations.html).

| Category | Operation | Description
|--------------:|:--------------------|:-----------------------------------------------------
|--------------:|:--------------------|:-----------------------------------------------------------------
| *Mirroring:* | `FlipX` | Reverse the order of each pixel row.
| | `FlipY` | Reverse the order of each pixel column.
| *Rotating:* | `Rotate90` | Rotate upwards 90 degrees.
@@ -220,6 +220,10 @@ look at the corresponding section of the
| | `CropSize` | Crop area around the center with specified size.
| | `CropRatio` | Crop to specified aspect ratio.
| | `RCropRatio` | Crop random window of specified aspect ratio.
| *Layout:* | `SplitChannels` | Separate the color channels into a dedicated array dimension.
| | `CombineChannels` | Collapse the first dimension into a specific colorant.
| | `PermuteDims` | Reorganize the array dimensions into a specific order.
| | `Reshape` | Change or reinterpret the shape of the array.
| *Utilities:* | `NoOp` | Identity function. Pass image along unchanged.
| | `CacheImage` | Buffer the current image into (preallocated) memory.
| | `Either` | Apply one of the given operations at random.
129 changes: 129 additions & 0 deletions docs/usersguide/operations.rst
@@ -21,6 +21,9 @@ functionality.
| Cropping | :class:`Crop` :class:`CropNative` :class:`CropSize` :class:`CropRatio` |
| | :class:`RCropRatio` |
+-----------------------+----------------------------------------------------------------------------+
| Information Layout | :class:`SplitChannels` :class:`CombineChannels` :class:`PermuteDims` |
| | :class:`Reshape` |
+-----------------------+----------------------------------------------------------------------------+
| Utility Operations | :class:`NoOp` :class:`CacheImage` :class:`Either` |
+-----------------------+----------------------------------------------------------------------------+

@@ -551,6 +554,132 @@ Resizing
| .. image:: https://raw.githubusercontent.com/JuliaML/FileStorage/master/Augmentor/testpattern_small.png | .. image:: https://raw.githubusercontent.com/JuliaML/FileStorage/master/Augmentor/operations/Resize.png |
+---------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------+

Information Layout
--------------------

Machine learning frameworks often require the data in a
specific form and layout. For example, many deep learning
frameworks expect the color channel of the images to be
encoded in the third dimension of a 4-dimensional array.

Augmentor can convert from (and to) these different layouts
using special operations that are mainly useful at the
beginning or end of an augmentation pipeline.

Color Channels
********************

.. class:: SplitChannels

Separate the color channels of the given image into a
dedicated array dimension. This effectively creates a new
array dimension for the colors as the first dimension. In the
case of greyscale images, a singleton dimension is created.

This operation is mainly useful at the end of a pipeline in
combination with :class:`PermuteDims` in order to prepare the
image for the training algorithm, which often requires the
color channels to be separate.

.. code-block:: jlcon
julia> op = SplitChannels()
Split colorant into its color channels
julia> img = testpattern()
300×400 Array{RGBA{N0f8},2}:
[...]
julia> augment(img, op)
4×300×400 Array{N0f8,3}:
[...]
.. class:: CombineChannels

Combines the first dimension of a given array into a colorant
of the specified type ``colortype``. A separate color channel
is also expected for Gray images.

The shape of the input image has to be appropriate for the
given ``colortype``, which also means that the separated color
channel has to be the first dimension of the array. Use
:class:`PermuteDims` and/or :class:`Reshape` if that is not
the case.

This operation is mainly useful at the beginning of the
pipeline, if the color channels of the input images are
separated.

.. code-block:: jlcon
julia> op = CombineChannels(RGB)
Combine color channels into colorant RGB{Any}
julia> A = rand(3, 10, 10) # random 10x10 RGB image
3×10×10 Array{Float64,3}:
[...]
julia> augment(A, op)
10×10 Array{RGB{Float64},2}:
[...]
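
When the color channels are not stored in the first dimension, they have to be moved there before they can be combined. A minimal sketch of that idea with Base's ``permutedims`` and ImageCore's ``colorview`` might look as follows. The channels-last array ``A`` is a made-up example, and the code is not taken from Augmentor's implementation.

```julia
using ImageCore  # provides colorview and the RGB type

# Hypothetical input: a 10×10 image with the color channels stored last.
A = rand(10, 10, 3)

# Move the channel dimension to the front, then reinterpret it as RGB.
channels_first = permutedims(A, (3, 1, 2))
img = colorview(RGB, channels_first)

size(img)                  # (10, 10)
img[2, 3].r == A[2, 3, 1]  # true: red channel of pixel (2, 3)
```
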
Array Shape
********************

.. class:: PermuteDims

Permute the dimensions of the given array with the predefined
permutation ``perm``. This operation is particularly useful if
the order of the dimensions needs to be different than the
default "julian" layout.

More concretely, Augmentor expects the given images to be in
vertical-major layout, in which the colors are encoded in the
element type itself. Many deep learning frameworks, however,
require their input in a different order. For example, it is
not unusual that the color channels are expected to be
encoded in the third dimension.

.. code-block:: jlcon
julia> op = PermuteDims(3,2,1)
Permute dimension order to (3,2,1)
julia> img = testpattern()
300×400 Array{RGBA{N0f8},2}:
[...]
julia> augment(img, PermuteDims(2,1))
400×300 Array{RGBA{N0f8},2}:
[...]
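
To make the effect of a permutation such as ``(3,2,1)`` concrete, here is a small sketch on a plain numeric array using only Base functions. ``permutedims`` creates a new array, while ``PermutedDimsArray`` (imported in ``src/Augmentor.jl``) is a lazy view. Whether ``PermuteDims`` uses one or the other in a given situation is not stated here, so treat this purely as an illustration of the index mapping.

```julia
# Small 2×3×4 array whose entries encode their linear position.
A = reshape(collect(1:24), 2, 3, 4)

# Eager permutation: allocates a new array with reversed dimension order.
B = permutedims(A, (3, 2, 1))

# Lazy permutation: a view that shares memory with A.
C = PermutedDimsArray(A, (3, 2, 1))

size(B)                   # (4, 3, 2)
B[4, 3, 2] == A[2, 3, 4]  # true
B == C                    # true
```
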
.. class:: Reshape

Reinterpret the shape of the given array of numbers or
colorants. This is useful for example to create singleton
dimensions that deep learning frameworks may need for
colorless images, or for converting an image to a feature
vector and vice versa.

Note that this operation has nothing to do with image
resizing, but instead is strictly concerned with changing
the shape of the array.

.. code-block:: jlcon
julia> Reshape(10,15)
Reshape array to 10×15
julia> op = Reshape(25)
Reshape array to 25-element vector
julia> A = rand(5,5)
5×5 Array{Float64,2}:
[...]
julia> augment(A, op)
25-element Array{Float64,1}:
[...]
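
The two use cases mentioned above, flattening to a feature vector and adding a singleton dimension, can be sketched with Base's ``reshape`` alone. This is an illustration of the underlying array semantics under the assumption that the input is an ordinary ``Array``; it does not show Augmentor's own code.

```julia
A = rand(5, 5)

# Flatten the 5×5 matrix into a 25-element feature vector.
v = reshape(A, 25)

# Add a trailing singleton dimension, e.g. for a colorless image.
S = reshape(A, 5, 5, 1)

size(v)      # (25,)
size(S)      # (5, 5, 1)
v == vec(A)  # true

# reshape shares memory with its argument: writing through v changes A.
v[1] = 0.0
A[1, 1] == 0.0  # true
```

Because no data is copied, the number of elements must stay the same; ``reshape(A, 30)`` would throw an error for a 5×5 input.
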
Utility Operations
--------------------
9 changes: 9 additions & 0 deletions src/Augmentor.jl
@@ -1,6 +1,8 @@
__precompile__()
module Augmentor

using ColorTypes
using ColorTypes: AbstractGray
using ImageCore
using ImageTransformations
using ImageFiltering
@@ -17,6 +19,11 @@ using Base.PermutedDimsArrays: PermutedDimsArray

export

SplitChannels,
CombineChannels,
PermuteDims,
Reshape,

Rotate90,
Rotate180,
Rotate270,
@@ -53,6 +60,8 @@ include("utils.jl")
include("types.jl")
include("operation.jl")

include("operations/channels.jl")

include("operations/noop.jl")
include("operations/cache.jl")
include("operations/rotation.jl")
2 changes: 1 addition & 1 deletion src/operations/cache.jl
@@ -55,7 +55,7 @@ immutable CacheImage <: ImageOperation end

applyeager(op::CacheImage, img::Array) = img
applyeager(op::CacheImage, img::OffsetArray) = img
applyeager(op::CacheImage, img) = copy(img)
applyeager(op::CacheImage, img) = copy(img) # FIXME: collect

function showconstruction(io::IO, op::CacheImage)
print(io, typeof(op).name.name, "()")