YOLO

Pretrained YOLO (PASCAL VOC 2012)

When the node is launched with the post_proc:=YOLO argument, YOLO-specific post-processing is applied. The node then publishes the DNN output as a standard Image message: a 2D, single-channel "image" of size WxHx1 (encoding == 32FC1), where W is fixed at 6 and H equals the number of detected objects. For example, if the DNN has detected 2 objects, the output is a 6x2 image. Each row describes one detected object with the following 6 values:

0  : label (class) of the detected object (e.g. a person or a dog).
1  : probability of this object.
2,3: x and y coordinates of the top left corner of the object in image coordinates.
4,5: width and height of the object in image coordinates.

All values are 32-bit floats, including the label. Label indices correspond to the 20 classes of the PASCAL VOC 2012 dataset.

For example, if the DNN detected a person (label: 14) and a dog (label: 12) in an image with dimensions 320x180, the output might look something like this:

Label	Prob	X	Y	Width	Height
14.0	0.5	120.0	80.0	30.0	60.0
12.0	0.4	160.0	115.0	40.0	20.0
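
A subscriber has to turn this 6xH float image back into per-object records. The following is a minimal sketch of that step, not the node's own code. It assumes the message's width, height and data fields are passed in directly, that rows are tightly packed (no padding, so step == 6 * 4 bytes), and that the floats are in the host's native byte order. The function name decode_yolo_output is hypothetical.

```python
import numpy as np

def decode_yolo_output(width, height, data):
    """Turn a raw 32FC1 buffer (6 floats per row) into a list of detection dicts.

    Assumes tightly packed rows and native byte order (both are assumptions,
    not something the source states).
    """
    if width != 6:
        raise ValueError("expected 6 values per detection, got %d" % width)
    # Reinterpret the byte buffer as float32 and shape it as H rows of 6 values.
    rows = np.frombuffer(data, dtype=np.float32).reshape(height, width)
    detections = []
    for label, prob, x, y, w, h in rows:
        detections.append({
            "label": int(label),   # the label is published as a float; cast it back
            "prob": float(prob),
            "x": float(x),         # top left corner, image coordinates
            "y": float(y),
            "width": float(w),
            "height": float(h),
        })
    return detections

# The two-object example from the table above, packed into a byte buffer.
example = np.array([[14.0, 0.5, 120.0, 80.0, 30.0, 60.0],
                    [12.0, 0.4, 160.0, 115.0, 40.0, 20.0]], dtype=np.float32)
detections = decode_yolo_output(6, 2, example.tobytes())
```

In a ROS callback, the same function could be called as decode_yolo_output(msg.width, msg.height, msg.data), provided the assumptions above hold for the received message.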
