The goal of this toolbox is to simplify the extraction of commonly used computer vision features, such as HOG, SIFT, GIST and Color, for tasks related to image classification. Details of the included features are available in FEATURES.md.
In addition to providing several popular features, the toolbox is designed with the ever-increasing size of modern datasets in mind: processing is done in batches and is fully parallelized on a single machine (using parfor), and it can easily be distributed across multiple machines with a common file system (the standard cluster setup in many universities).
The features extracted in a bag-of-words manner ('color', 'hog2x2', 'hog3x3', 'sift', 'ssim') are encoded using Locality-Constrained Linear Coding to allow the use of a linear classifier for fast training + testing.
In my experiments, I have found 'hog2x2' or 'hog3x3' to be the most effective global image feature, and it tends to perform even better when combined with 'color' features, which contain complementary information.
The toolbox works in both Matlab and Octave, though Octave may still have some compatibility issues and does not support parallel processing.
Before you can use the code, you need to download this repository and compile the mex code:
$ git clone http://github.com/adikhosla/feature-extraction
$ cd feature-extraction
$ matlab
>> compile
To the best of my knowledge, there should be no issues compiling on Linux, Mac or Windows. Octave is currently unable to compile the mex code, but most features should still work.
The basic usage is relatively simple:
>> addpath(genpath('.'));
>> datasets = {'pascal', 'sun'}; % specify name of datasets
>> train_lists = {{'pascal1.jpg'}, {'sun1.jpg', 'sun2.jpg'}}; % specify lists of train images
>> test_lists = {{'pascal2.jpg', 'pascal3.jpg'}, {'sun3.jpg'}}; % specify lists of test images
>> feature = 'hog2x2'; % specify feature to use
>> c = conf(); % load the config structure
>> datasets_feature(datasets, train_lists, test_lists, feature, c); % perform feature extraction
>> train_features = load_feature(datasets{1}, feature, 'train', c); % load train features of pascal
>> test_features = load_feature(datasets{2}, feature, 'test', c); % load test features of sun
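As noted earlier, 'hog2x2' tends to perform better when combined with 'color'. A minimal sketch of one way to do this is shown below, under the assumption (not stated in the code itself) that load_feature returns a matrix with one row of features per image:
>> datasets_feature(datasets, train_lists, test_lists, 'color', c); % extract a second feature
>> hog_train = load_feature(datasets{1}, 'hog2x2', 'train', c); % assumed [num_images x hog_dim]
>> color_train = load_feature(datasets{1}, 'color', 'train', c); % assumed [num_images x color_dim]
>> combined_train = [hog_train, color_train]; % concatenate the two feature matrices column-wise
The concatenated matrix can then be fed to any classifier exactly as a single feature would be.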
The list of available features is:
'color', 'gist', 'hog2x2', 'hog3x3', 'lbp', 'sift', 'ssim'
Details are given in FEATURES.md. The datasets_feature function can be run on multiple machines in parallel to speed up feature extraction. This function handles the complete pipeline of building a dictionary (for bag-of-words features), coding features to the dictionary, and pooling them together in a spatial pyramid.
You can use a single dataset or multiple datasets as shown above. A separate folder will be created for each dataset and a different dictionary will be learned, unless specified otherwise in the configuration structure.
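To extract several features over the same datasets, you can simply loop over the feature names using the same call as in the basic example above (a sketch; the feature names are taken from the list above):
>> features = {'color', 'gist', 'hog2x2', 'sift'}; % any subset of the available features
>> for i = 1:length(features), datasets_feature(datasets, train_lists, test_lists, features{i}, c); end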
A demo script is included that extracts color features from a provided set of train and test images, and then uses the features in a nearest-neighbor classifier to predict the class of each test image.
>> demo
The demo above will show the train and test images, and the nearest neighbors of the test images from the training set.
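If you would rather write the classification step yourself, a minimal 1-nearest-neighbor sketch in plain Matlab is shown below. It assumes train_features and test_features were loaded from the same dataset with one row per image, and that you supply your own train_labels vector; pdist2 requires the Statistics Toolbox, and this is not the demo's exact implementation.
>> D = pdist2(test_features, train_features); % pairwise Euclidean distances (test x train)
>> [~, nn_idx] = min(D, [], 2); % index of the nearest training image for each test image
>> predicted_labels = train_labels(nn_idx); % assign each test image its neighbor's label (train_labels is your own label vector)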
There are various options available through the config structure created using the conf() function:
- cache: main folder where all the files will be stored
- feature_config.(feature_name): contains the configuration of feature_name such as dictionary size
- batch_size: batch size for feature processing (reduce for less RAM usage)
- cores: specify number of cores to use for parfor (0 = use all)
- verbosity: used to change how much is output to screen during feature computation (0 = low, 1 = high)
- common_dictionary: used to share a common dictionary across datasets. Dictionary is learned using equal number of samples from each dataset (useful for ECCV 2012 paper).
Additional options are described in conf.m.
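The defaults can be overridden by modifying the returned structure before extraction; for example (the field names below are the ones listed above, but the values are only illustrative):
>> c = conf(); % load the default configuration
>> c.batch_size = 100; % process fewer images per batch to reduce RAM usage
>> c.cores = 4; % limit parfor to four workers (0 = use all)
>> c.verbosity = 1; % print more progress information
>> datasets_feature(datasets, train_lists, test_lists, feature, c);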
This toolbox includes functions and code snippets, used with or without modification, from the following packages:
- Feature coding: Locality-constrained linear coding
- Pixelwise HOG/Gist: LabelMe Toolbox
- For color features: Color Naming
- SIFT features: Spatial pyramid matching code
- LBP features: LBP Matlab code
- SSIM: VGG SSIM package
- Fast k-means
Most of the features included in this toolbox were not designed by me; I have either fine-tuned them or used them as is from existing work. This toolbox simply unifies existing code bases (credited in the bundled code section above) in an easy-to-use architecture. I have done my best to highlight where different snippets of code originate from, but please do not hesitate to contact me if you find that I have missed anything. Most importantly: please cite the original inventors of the different features (in addition to a subset of the references below) when you use this toolbox in your work.
The provided code was used for feature extraction in the following papers:
- Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A. Efros, Antonio Torralba. Undoing the Damage of Dataset Bias, ECCV 2012
- Aditya Khosla, Jianxiong Xiao, Antonio Torralba, Aude Oliva. Memorability of Image Regions, NIPS 2012
- Aditya Khosla, Wilma A. Bainbridge, Antonio Torralba, Aude Oliva. Modifying the Memorability of Face Photographs, ICCV 2013
Please cite a subset of the above papers if you use this code.
I am extremely grateful to Oscar Beijbom, Hamed Pirsiavash and Tinghui Zhou for being the initial users of this toolbox, and providing useful comments and various bug-fixes.
If you have any feedback, please email Aditya Khosla at [email protected].