If you use this benchmark in your research, please cite our FPGA'17 paper:
@inproceedings{srivastava-facedetect-fpga2017,
title = {Accelerating Face Detection on Programmable SoC Using
C-Based Synthesis},
author = {Nitish Srivastava and Steve Dai and Rajit Manohar and
Zhiru Zhang},
booktitle = {25\textsuperscript{th} ACM/SIGDA International
Symposium on Field-Programmable Gate Arrays},
month = {Feb},
year = {2017},
doi = {10.1145/3020078.3021753},
}
NOTE: If you face any issues while running anything in this repo, shoot me an email at [email protected]
facedetect-fpga is an open-source implementation of the Viola-Jones face detection algorithm suitable for C-based synthesis. It was introduced at FPGA'17 (the 25th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays) in February 2017.
This design explores the flow from a software-based implementation to an optimized C/C++ design suitable for a High-Level Synthesis (HLS) flow. The face detection system is optimized for performance at the C/C++ level and is synthesizable with SDSoC, a full-system compiler from Xilinx. The design is suitable for real-time face detection applications and achieves a frame rate of 30 fps. It has been tested using Vivado HLS 2016 and SDSoC 2016 on a ZC-706 board with a Xilinx Zynq-7000 XC7Z045 FPGA and an ARM Cortex-A9 CPU. The RTL code (SystemC, VHDL and Verilog) generated by the HLS compilers can potentially be used for other purposes as well.
You will need to install Vivado-HLS and SDSoC to compile and run this project.
The face detection benchmark has been written in C. The current version also supports C simulation, so the functionality of the algorithm can be tested without going through the painful process of bitstream generation.
To run the C simulation of the design, enter the following command in the main directory:
% make csim
This will create an hls.prj directory containing all the logs for the C simulation. The output of the face detection is written to the file hls.prj/solution/csim/build/Output.pgm. Make sure that all the faces have been detected and marked with rectangles.
Always run the C simulation before synthesizing the design: it is fast and catches trivial syntactic and algorithmic errors.
To synthesize the design for the FPGA, type the following command in the main directory:
% make
This will produce two folders:
./_sds
./sd_card
The _sds folder will have all the reports generated by SDSoC and Vivado-HLS. The sd_card folder will have the bitstream and the associated files which will be used to port the face-detection accelerator onto the FPGA.
The timing and utilization reports generated by Vivado-HLS can be found in the following directory:
_sds/vhls/haar/solution/syn/report
The log generated by Vivado-HLS can be found at:
_sds/vhls/vivado_hls.log
The area, utilization and timing reports generated by SDSoC can be found in the following directory:
_sds/reports
For reference I have uploaded these two folders from my own compilation here: http://people.ece.cornell.edu/nks45/face-detect-compiled/
To port the face-detection accelerator onto the FPGA, copy all the files in the sd_card folder onto the SD card of the FPGA board. Then reboot the board and connect it to your computer using UART. Go to the mnt folder and run the face.elf binary:
$ cd /mnt
$ ./face.elf
This will print the coordinates of all the detected rectangles to the screen, along with the time taken to detect the faces. Wait for a while (4-5 seconds), then remove the SD card and insert it into your computer. You will see an Output.pgm file with rectangles drawn around all the faces. The wait of 4-5 seconds makes sure that this image has been written properly.
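To make the Output.pgm step concrete, here is a minimal sketch of how detected rectangles could be outlined in an 8-bit grayscale buffer and serialized as a PGM image. It is not the repository's code: the `Rect` struct, the function names, the one-pixel white outline and the choice of the binary (P5) PGM variant are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical rectangle type; the repository's own struct may differ.
struct Rect {
    int x, y, width, height;
};

// Draw a one-pixel outline of r into a row-major 8-bit image.
// Pixels that fall outside the image are skipped.
void draw_rect(std::vector<std::uint8_t>& img, int img_w, int img_h,
               const Rect& r, std::uint8_t value = 255) {
    assert(img.size() == static_cast<std::size_t>(img_w) * img_h);
    const int bottom = r.y + r.height - 1;
    const int right = r.x + r.width - 1;
    for (int x = r.x; x <= right; ++x) {  // top and bottom edges
        if (x < 0 || x >= img_w) continue;
        if (r.y >= 0 && r.y < img_h) img[r.y * img_w + x] = value;
        if (bottom >= 0 && bottom < img_h) img[bottom * img_w + x] = value;
    }
    for (int y = r.y; y <= bottom; ++y) {  // left and right edges
        if (y < 0 || y >= img_h) continue;
        if (r.x >= 0 && r.x < img_w) img[y * img_w + r.x] = value;
        if (right >= 0 && right < img_w) img[y * img_w + right] = value;
    }
}

// Serialize the buffer as a binary PGM (magic number P5, maxval 255).
std::string to_pgm(const std::vector<std::uint8_t>& img, int img_w, int img_h) {
    assert(img.size() == static_cast<std::size_t>(img_w) * img_h);
    std::ostringstream out;
    out << "P5\n" << img_w << " " << img_h << "\n255\n";
    out.write(reinterpret_cast<const char*>(img.data()),
              static_cast<std::streamsize>(img.size()));
    return out.str();
}
```

Calling `draw_rect` once per reported rectangle on a 320 x 240 buffer and writing the string returned by `to_pgm` to a file opened in binary mode would give an image that any PGM viewer can open.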
The face-detection algorithm can only process images in which each pixel is represented as an 8-bit number. The gen_dataset folder has the files needed to generate the hex images from images in pgm format. To add a new 320 X 240 image (pgm format), copy it to the gen_dataset folder. Edit the input and output file names in the main function in gen_dataset/gen_image.cpp, then compile and run it:
% cd gen_dataset
% cp /path/to/image/test-image.pgm .
Compile and run the code:
% g++ gen_image.cpp -o gen_image
% ./gen_image test-image.pgm test-image.h
This will produce the .h file for the image with the image pixels in the hex format. Note that the original pgm image must be 320 X 240 pixels. Copy this .h file to the main directory:
% cp test-image.h ../.
Then edit main.cpp so that it includes the new header in place of the default image header, i.e. replace the line
#include "image0_320_240.h"
with
#include "test-image.h"
Now you have added a new image. Do the C-simulation as described above. Make sure the C-simulation is working and producing the right results. Generate the bitstream and run it on the FPGA.
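The conversion from a pgm image to a hex header described above can be sketched as follows. This is an illustrative approximation, not a copy of gen_dataset/gen_image.cpp: the `GrayImage` struct, the function names, the flat array layout and the array name passed by the caller are assumptions, and the parser only handles binary (P5) files without header comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

// Parse a binary (P5) PGM held in memory. Header comments ('#') are not handled.
GrayImage parse_pgm_p5(const std::string& data) {
    std::istringstream in(data);
    std::string magic;
    int w = 0, h = 0, maxval = 0;
    in >> magic >> w >> h >> maxval;
    if (!in || magic != "P5") throw std::runtime_error("not a binary PGM");
    if (w <= 0 || h <= 0) throw std::runtime_error("invalid dimensions");
    if (maxval <= 0 || maxval > 255) throw std::runtime_error("pixels are not 8-bit");
    in.get();  // consume the single whitespace byte that follows maxval
    GrayImage img;
    img.width = w;
    img.height = h;
    img.pixels.resize(static_cast<std::size_t>(w) * h);
    const std::streamsize n = static_cast<std::streamsize>(img.pixels.size());
    in.read(reinterpret_cast<char*>(img.pixels.data()), n);
    if (in.gcount() != n) throw std::runtime_error("truncated pixel data");
    return img;
}

// Emit a C header that stores the pixels as a flat array of hex literals,
// one image row per line.
std::string to_hex_header(const GrayImage& img, const std::string& array_name) {
    assert(img.pixels.size() == static_cast<std::size_t>(img.width) * img.height);
    std::ostringstream out;
    out << "// " << img.width << " x " << img.height << " 8-bit grayscale image\n";
    out << "static const unsigned char " << array_name << "["
        << img.pixels.size() << "] = {\n";
    out << std::hex << std::setfill('0');  // set after the decimal sizes are printed
    for (int y = 0; y < img.height; ++y) {
        out << "  ";
        for (int x = 0; x < img.width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * img.width + x;
            out << "0x" << std::setw(2) << static_cast<int>(img.pixels[i]) << ", ";
        }
        out << "\n";
    }
    out << "};\n";
    return out.str();
}
```

A caller that needs the 320 X 240 restriction enforced can compare `width` and `height` on the parsed image before generating the header. The maxval check is what rejects 16-bit pgm files, which the 8-bit pipeline cannot process.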
The source code for the different stages of optimization mentioned in our paper is also included in the repository so that users can get a feel for the effect of each optimization. There are 3 extra folders in the repo:
Baseline : This folder has the baseline implementation, which replaces all the non-synthesizable constructs.
Pipelined : This folder has the code in which all the classifiers are pipelined.
Parallel_and_Pipeline : This folder has the code in which the classifiers in the first 3 stages are evaluated in parallel and the ones in the other 22 stages are pipelined.
The main folder has the code which combines all these optimizations.
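For readers unfamiliar with the structure these folders optimize, the following plain C++ sketch shows a staged classifier cascade with early rejection. It is only a software model under simplifying assumptions: the `WeakClassifier` and `Stage` structs, the precomputed `features` vector and the vote values are hypothetical, and the repository's data layout and HLS directives are not reproduced here. The comment in `stage_sum` marks the loop where a parallel or pipelined schedule would be chosen in an HLS flow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical weak classifier: compares one precomputed feature value
// against a threshold and contributes one of two votes.
struct WeakClassifier {
    std::size_t feature_index;
    int threshold;
    int pass_vote;
    int fail_vote;
};

// A stage is a group of weak classifiers plus the vote total needed to pass.
struct Stage {
    std::vector<WeakClassifier> classifiers;
    int stage_threshold;
};

// Sum the votes of all weak classifiers in one stage.
int stage_sum(const Stage& stage, const std::vector<int>& features) {
    int sum = 0;
    // In an HLS flow, this is the loop that would be unrolled (classifiers
    // evaluated in parallel) or pipelined. Plain C++ simply runs it in order.
    for (const WeakClassifier& c : stage.classifiers) {
        assert(c.feature_index < features.size());
        sum += (features[c.feature_index] >= c.threshold) ? c.pass_vote
                                                          : c.fail_vote;
    }
    return sum;
}

// A window is reported as a face only if it passes every stage.
// The first failing stage rejects it, so later stages are never evaluated.
bool window_is_face(const std::vector<Stage>& stages,
                    const std::vector<int>& features) {
    for (const Stage& stage : stages) {
        if (stage_sum(stage, features) < stage.stage_threshold) return false;
    }
    return true;
}
```

Because most windows are rejected in the first few stages, those stages are evaluated far more often than the later ones, which is one plausible reason to treat the first 3 stages differently from the remaining 22.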
The resource usage at each stage of optimization is given below:
face-detection benchmark is offered under the terms of the Open Source Initiative BSD 3-Clause License. More information about this license can be found here: