Real-time Visual Saliency by Division of Gaussians - Reference Implementation


Tested using Python 3.8.2 and OpenCV 4.8.0

[Example image: exemplar real-time salient object detection using DoG saliency]

Abstract:

"This paper introduces a novel method for deriving visual saliency maps in real-time without compromising the quality of the output. This is achieved by replacing the computationally expensive centre-surround filters with a simpler mathematical model named Division of Gaussians (DIVoG). The results are compared to five other approaches, demonstrating at least six times faster execution than the current state-of-the-art whilst maintaining high detection accuracy. Given the multitude of computer vision applications that make use of visual saliency algorithms such a reduction in computational complexity is essential for improving their real-time performance."

[Katramados, Breckon, In Proc. International Conference on Image Processing, IEEE, 2011]


Reference implementation:

This saliency map generator uses the Division of Gaussians (DIVoG / DoG) approach to produce real-time saliency maps. Put simply, the algorithm performs the following three steps (as set out in the original DIVoG research paper):

  • Bottom-up construction of a Gaussian pyramid
  • Top-down construction of a Gaussian pyramid based on the output of Step 1
  • Element-by-element division of the input image by the output of Step 2

This repository contains saliencyDoG.py, which implements the Division of Gaussians algorithm as defined in [Katramados / Breckon, 2011]. demo.py is an example of using the SaliencyDoG library (supported by camera_stream.py, which provides an unbuffered video feed from a live camera input); it demonstrates SaliencyDoG on either a live camera or an input video file, displaying the result live. Each frame is processed sequentially, producing a real-time saliency map. test.py should be run to verify that the correct versions of the required libraries are installed before using the library.

saliencyDoG.py contains the class SaliencyDoG. A saliency-mapper object can be created (with specific options) and used on multiple images, e.g.

from saliencyDoG import SaliencyDoG
import cv2

img = cv2.imread('dog.png')
saliency_mapper = SaliencyDoG(pyramid_height=5, shift=5, ch_3=False,
                              low_pass_filter=False, multi_layer_map=False)
img_saliency_map = saliency_mapper.generate_saliency(img)

where parameters:

  • pyramid_height - n as defined in [Katramados / Breckon, 2011] - default = 5
  • shift - k as defined in [Katramados / Breckon, 2011] - default = 5
  • ch_3 - process a colour image on every channel (approximately 3x slower) - default = False
  • low_pass_filter - toggle low pass filter - default = False
  • multi_layer_map - the second version of the algorithm as defined in [Katramados / Breckon, 2011] (significantly slower, with similar results) - default = False

The SaliencyDoG class makes use of the OpenCV Transparent API (T-API) to take advantage of any available hardware acceleration.


Instructions to use:

To download and test the supplied code do:

$ git clone https://github.com/tobybreckon/DoG-saliency.git
$ cd DoG-saliency
$ python3.x -m pip install -r requirements.txt
$ pytest test.py

Ensure that all tests pass before proceeding. If any tests fail, check that you have installed the modules from requirements.txt and are using at least Python 3.7.5 and OpenCV 4.2.0.

Subsequently run the following command to obtain real-time saliency output from a connected camera or video file specified on the command line:

$ python3.x demo.py [-h] [-c CAMERA_TO_USE] [-r RESCALE] [-fs] [-g] [-l] [-m] [video_file]

positional arguments:

  • video_file  specify optional video file

optional arguments:

  • -h  show help message and exit
  • -c CAMERA_TO_USE  specify camera to use (int) - default = 0
  • -r RESCALE  rescale image by this factor (float) - default = 1.0
  • -fs   optionally run in full screen mode
  • -g   optionally process frames as grayscale
  • -l   optionally apply a low_pass_filter to saliency map
  • -m   optionally use every pyramid layer in the production of the saliency map

During run-time the following keyboard commands are available: x quits the program, f toggles full-screen mode, s toggles between the saliency map and the original input frames, and t toggles the speed/fps information display.


Example video:

[Example video - click the thumbnail image in the original README to play.]


References:

If you are making use of this work in any way please reference the following articles in any report, publication, presentation, software release or any other associated materials:

Real-time Visual Saliency by Division of Gaussians (Katramados, Breckon), In Proc. International Conference on Image Processing, IEEE, 2011.

@InProceedings{katramados11salient,
  author    = {Katramados, I. and Breckon, T.P.},
  title     = {Real-time Visual Saliency by Division of Gaussians},
  booktitle = {Proc. Int. Conf. on Image Processing},
  pages     = {1741-1744},
  year      = {2011},
  month     = {September},
  publisher = {IEEE},
  url       = {https://breckon.org/toby/publications/papers/katramados11salient.pdf},
  doi       = {10.1109/ICIP.2011.6115785},
}

Free for commercial and non-commercial use (i.e. academic, not-for-profit and research) under the (very permissive) terms of the MIT free software LICENSE that must be adhered to.

The Division of Gaussians (DIVoG / DoG) saliency detection algorithm was filed as a patent (WIPO reference: WO2013034878A2, Cranfield University, 2013/14) but this patent application has now lapsed.

Acknowledgements:

Ryan Lail produced this reference implementation of [Katramados / Breckon, 2011], Durham University, July 2020.