This example demonstrates OpenVINO™ toolkit integration with facial detection, using basic depth information to approximate distance.
This sample makes use of OpenCV.
A helper namespace openvino_helpers is used, with a helper class face_detection encapsulating much of the OpenVINO details:
openvino_helpers::face_detection faceDetector(
"face-detection-adas-0001.xml",
0.5 // Probability threshold -- anything with less confidence will be thrown out
);
There are two trained model Intermediate Representation files (face-detection-adas-0001.xml and .bin) that need to be loaded. Pointing to the .xml is enough. These are automatically installed into your build's wrappers/openvino/face directory.
The face_detection class checks that the model includes the required input/output layers, so feel free to substitute different models.
Each detection has a confidence score. The probability threshold passed to the constructor (0.5 above) sets how confident a detection must be to be kept.
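Thresholding of this kind can be sketched as follows. This is a minimal illustration only: the detection struct and filter_by_confidence function are hypothetical names, not part of openvino_helpers, and the comparison (keep scores at or above the threshold) is an assumption based on the comment in the constructor call.

```cpp
#include <vector>

// Hypothetical detection record; the sample's own result type may differ.
struct detection
{
    float confidence;         // score in [0, 1] reported by the model
    int x, y, width, height;  // bounding box in image pixels
};

// Keep only the detections whose confidence meets the threshold;
// anything with less confidence is thrown out.
inline std::vector<detection> filter_by_confidence(
    const std::vector<detection>& all, float threshold )
{
    std::vector<detection> kept;
    for( const auto& d : all )
        if( d.confidence >= threshold )
            kept.push_back( d );
    return kept;
}
```

Raising the threshold trades missed faces for fewer false positives; lowering it does the opposite.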
Asynchronous detection takes place by queueing a frame and only processing its results when the next frame is available:
// Wait for the results of the previous frame we enqueued: we're going to process these
faceDetector.wait();
auto results = faceDetector.fetch_results();
// Enqueue the current frame so we'd get the results when the next frame comes along!
faceDetector.enqueue( image );
faceDetector.submit_request();
// Process the results...
Detected faces are placed into a container and assigned IDs. Some basic effort is made to keep the creation of new faces to a minimum: previous faces are compared with new detections to see whether each detection is simply a new position for an existing face. An "intersection over union" (IoU) ratio is calculated and, if it exceeds a threshold, the existing face is moved rather than a new face being created.
rect = rect & cv::Rect( 0, 0, image.cols, image.rows );
auto face_ptr = openvino_helpers::find_face( rect, prev_faces );
if( !face_ptr )
    // New face
    face_ptr = std::make_shared< openvino_helpers::detected_face >( id++, rect );
else
    // Existing face; just update its parameters
    face_ptr->move( rect );
Depth is arrived at very simplistically: the center coordinates of each face on the color frame are converted to fractions of the frame's width and height, and then recalculated in terms of the depth frame's width and height.
This naïve way is OK for basic estimation, but the frames should ideally be aligned if proper correspondence is required. See the rs-face-dlib example.
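The proportional mapping described above can be sketched as follows. The pixel struct and color_to_depth_pixel function are hypothetical names used for illustration, and clamping to the last valid pixel is an added assumption rather than something the sample is known to do.

```cpp
#include <algorithm>

// Hypothetical pixel coordinate pair.
struct pixel { int x, y; };

// Map a pixel on the color frame to the proportionally equivalent pixel on
// the depth frame. This ignores the physical offset and differing field of
// view between the two sensors, so it is only an approximation.
inline pixel color_to_depth_pixel( pixel c,
                                   int color_w, int color_h,
                                   int depth_w, int depth_h )
{
    // Position as a fraction of the color frame's dimensions
    float fx = float( c.x ) / float( color_w );
    float fy = float( c.y ) / float( color_h );

    // Same fraction of the depth frame, clamped to stay inside it
    int dx = std::min( int( fx * depth_w ), depth_w - 1 );
    int dy = std::min( int( fy * depth_h ), depth_h - 1 );
    return { dx, dy };
}
```

With a 1280x720 color frame and a 640x480 depth frame, a face centered at (640, 360) maps to (320, 240). The resulting coordinates could then be passed to rs2::depth_frame::get_distance to read the distance in meters.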