Code repository for methods proposed in 'Vision-Based Object Recognition in Indoor Environments using Topologically Persistent Features'. [Pre-print]
The UW Indoor Scenes (UW-IS) dataset used in the above paper can be found here. The dataset consists of scenes from two different environments, namely, a living room and a mock warehouse. The scenes are captured using varying camera poses under different illumination conditions and include up to five different objects from a given set of fourteen objects.
- Tensorflow Models DeepLabv3+
- Keras
- giotto-tda=0.2.2
- persim=0.1.2
- Python 3.6
Pipeline of segmentation map generation module
- Install the DeepLabv3+ implementation available through Tensorflow Models, following the installation instructions here.
- Under the tensorflow/models/research/deeplab directory, create the following recommended directory structure for training a DeepLabv3+ model in the living room environment of the UW-IS dataset. (The files train_uwis.sh, export_uwis.sh, and convert_uwis.sh can be found under the segMapUtils folder in this repository.)
+ deeplab
  - train_uwis.sh
  - export_uwis.sh
  - loadmodel_inference.py
  + datasets (Note: merge this folder with the pre-existing datasets folder)
    - convert_uwis.sh
    + uwis
      + init_models
      + data
        + JPEGImages_livingroom
        + foreground_livingroom
        + ImageSets_livingroom
- Place the living room scene images to be used for training the DeepLabv3+ model under JPEGImages_livingroom and the corresponding ground truth segmentation maps under foreground_livingroom.
- Modify the existing files datasets/build_voc2012_data.py and datasets/data_generator.py appropriately for the UW-IS dataset.
- Use segMapUtils/convertToRaw.py to convert the binary segmentation maps to raw annotations (pixel values indicate class labels); a minimal sketch of this conversion step is shown after this list. Then use segMapUtils/createTrainValSets.py to generate the training and validation sets for training the DeepLabv3+ model. Run convert_uwis.sh from within the deeplab/datasets directory to convert the annotations into TensorFlow records for the model.
- Place the appropriate initial checkpoint, available from here, in the init_models folder.
- Use train_uwis.sh followed by export_uwis.sh from within the deeplab directory to train a DeepLabv3+ model and to export the trained model, respectively.
- Run loadmodel_inference.py from within the deeplab directory to generate segmentation maps for the scene images using the trained model.
- Use segMapUtils/cropPredsObjectWise.py to obtain cropped object images from the scene segmentation maps.
- Run loadmodel_inference.py again (using the same trained model) to generate object segmentation maps for all the cropped object images.
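As a point of reference for the binary-to-raw conversion step above, here is a minimal sketch that turns binary foreground maps into single-channel annotations whose pixel values are class labels. The directory paths and the single foreground label value (1) are illustrative assumptions and are not taken from segMapUtils/convertToRaw.py.

```python
# Minimal sketch of converting binary segmentation maps into single-channel
# "raw" annotations whose pixel values are class labels (0 = background,
# 1 = foreground). Paths and the label value are illustrative assumptions,
# not taken from segMapUtils/convertToRaw.py.
import glob
import os

import numpy as np
from PIL import Image

SRC_DIR = "datasets/uwis/data/foreground_livingroom"  # binary maps (assumed path)
DST_DIR = "datasets/uwis/data/raw_livingroom"         # raw annotations (assumed path)
os.makedirs(DST_DIR, exist_ok=True)

for path in glob.glob(os.path.join(SRC_DIR, "*.png")):
    mask = np.array(Image.open(path).convert("L"))   # grayscale binary map
    raw = (mask > 127).astype(np.uint8)              # threshold to {0, 1}
    Image.fromarray(raw, mode="L").save(os.path.join(DST_DIR, os.path.basename(path)))
```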
At this stage, the object segmentation maps have the filename structure <sceneImageName>_<cropId>_cropped.png. Before moving to the next step, every object segmentation map must be labeled with the appropriate object class ID for training the recognition networks. The persistent feature extraction and recognition steps assume the following filename structure for object segmentation maps: <sceneImageName>_<cropId>_cropped_obj<classId>.png. A sketch of one way to perform this renaming follows this paragraph.
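The relabeling itself can be done by renaming the files once the class IDs are known. The sketch below assumes a hypothetical labels.csv file (rows of filename,classId) produced by whatever labeling procedure is used; the file name and format are illustrative and not part of this repository.

```python
# Minimal sketch of appending object class IDs to the object segmentation map
# filenames, i.e., <sceneImageName>_<cropId>_cropped.png ->
# <sceneImageName>_<cropId>_cropped_obj<classId>.png.
# labels.csv (rows: filename,classId) is a hypothetical stand-in for the
# user's own labeling output.
import csv
import os

MAPS_DIR = "objectSegMaps"  # assumed folder containing the object segmentation maps

with open("labels.csv") as f:
    for filename, class_id in csv.reader(f):
        stem, ext = os.path.splitext(filename)       # e.g., "scene001_2_cropped", ".png"
        new_name = f"{stem}_obj{class_id}{ext}"      # e.g., "scene001_2_cropped_obj7.png"
        os.rename(os.path.join(MAPS_DIR, filename),
                  os.path.join(MAPS_DIR, new_name))
```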
Pipeline for object recognition using sparse PI features
All the steps below refer to code files under the persistentFeatRecognit folder in this repository.
- Generate persistence diagrams for the object segmentation maps using generatePDs.py.
- To generate sparse PI features from the persistence diagrams, run generatePIs.py to obtain persistence images (PIs), followed by sparseSamplingPIs.py. The script sparseSamplingPIs.py generates optimal pixel locations for the PIs, which can then be used to obtain sparse PIs. To generate amplitude features, use generateAmplitude.py. (Illustrative sketches of this feature extraction and of a minimal recognition network appear after this list.)
- Use trainRecognitSparsePI.py to train a recognition network using sparse PI features. The script loads the generated PIs and obtains sparse PIs using the optimal pixel locations generated in the previous step. To train a recognition network using amplitude features, use trainRecognitAmplitude.py.
- To test the performance of the recognition networks in the same environment that they are trained on (i.e., the living room in the default case), use predictFromSparsePIs_test_trainEnv.py or predictFromAmplitude_test_trainEnv.py, as appropriate.
- To test the performance of the recognition networks in unseen environments (i.e., the mock warehouse in the default case), first generate object segmentation maps from the warehouse images as described above. Then obtain persistence diagrams from the object segmentation maps, and from those diagrams generate PIs and amplitude features.
  - To test the sparse PI recognition network's performance, use predictFromSparsePIs_test_testEnv.py. It obtains sparse PIs using the same optimal pixel locations computed at training time and makes predictions using the trained model.
  - To test the amplitude recognition network's performance, use predictFromAmplitude_test_testEnv.py.
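For orientation, here is a minimal sketch of the kind of feature extraction these scripts carry out, using the giotto-tda and persim versions listed in the requirements: cubical persistence diagrams from the object segmentation maps, persistence images from the H1 points, sparse PIs obtained by indexing the flattened PIs at a given set of pixel locations, and amplitude features from the same diagrams. The PI resolution and spread, the homology dimension kept, the amplitude metric, and the assumption that the optimal pixel locations are simply supplied as indices are all illustrative choices, not the parameters or selection method used in the repository scripts.

```python
# Illustrative feature-extraction sketch (not the repository scripts).
# seg_maps: (n_samples, H, W) array of binary object segmentation maps.
# pixel_locations: precomputed indices into a flattened PI (e.g., the output
# of sparseSamplingPIs.py); how they are chosen is not shown here.
import numpy as np
from gtda.homology import CubicalPersistence
from gtda.diagrams import Amplitude
from persim import PersImage

def extract_features(seg_maps, pixel_locations, pi_size=20):
    # Persistence diagrams (H0 and H1) via cubical persistence.
    cp = CubicalPersistence(homology_dimensions=(0, 1), n_jobs=-1)
    diagrams = cp.fit_transform(seg_maps.astype(float))       # (n, n_points, 3)

    # Persistence images from the H1 (birth, death) pairs using persim.
    pim = PersImage(pixels=[pi_size, pi_size], spread=0.05)
    h1 = [d[d[:, 2] == 1][:, :2] for d in diagrams]
    pis = np.stack([pim.transform(d) if len(d) else np.zeros((pi_size, pi_size))
                    for d in h1])

    # Sparse PIs: keep only the PI values at the precomputed pixel locations.
    sparse_pis = pis.reshape(len(pis), -1)[:, pixel_locations]

    # Amplitude features computed directly from the diagrams.
    amplitudes = Amplitude(metric="wasserstein", n_jobs=-1).fit_transform(diagrams)
    return sparse_pis, amplitudes
```

The recognition networks themselves are defined in trainRecognitSparsePI.py and trainRecognitAmplitude.py; the Keras model below is only a placeholder with assumed layer sizes and hyperparameters. Only the fourteen-class output reflects the UW-IS object set.

```python
# Placeholder Keras classifier over sparse PI (or amplitude) feature vectors;
# the architecture and hyperparameters are assumptions, not those of
# trainRecognitSparsePI.py.
from keras.models import Sequential
from keras.layers import Dense

def build_recognition_net(n_features, n_classes=14):
    model = Sequential([
        Dense(64, activation="relu", input_shape=(n_features,)),
        Dense(32, activation="relu"),
        Dense(n_classes, activation="softmax"),   # one of the fourteen UW-IS objects
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Example usage with integer class labels:
# model = build_recognition_net(sparse_pis.shape[1])
# model.fit(sparse_pis, labels, epochs=50, batch_size=32, validation_split=0.2)
```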