Instance segmentation inference testing with MaskRCNN

This demo runs an instance segmentation algorithm on frames from the COCO dataset. The demo consists of four parts:

  • CVNodeManager - manages the testing scenario and the data flow between the data provider and the tested MaskRCNN node.
  • CVNodeManagerGUI - visualizes input data and results of inference testing.
  • Kenning - provides images to the MaskRCNN node and collects inference results.
  • MaskRCNN - runs inference on input images and returns results.

Necessary dependencies

This demo requires:

  • A CUDA-enabled NVIDIA GPU for inference acceleration
  • repo tool to clone all necessary repositories
  • Docker to use a prepared environment
  • nvidia-container-toolkit to provide access to the GPU in the Docker container.

All the necessary build, runtime and development dependencies are provided in the Dockerfile. The image contains:

To build the Docker image containing all necessary dependencies, run:

sudo ./

For more details regarding the base image, refer to the ROS2 GuiNode.

Preparing the environment

First, create a workspace directory to store the downloaded repositories:

mkdir cvnode && cd cvnode

Download all dependencies using the repo tool:

repo init -u -m examples/manifest.xml -b main

repo sync -j"$(nproc)"

It downloads the following repositories:

Starting the Docker environment

If you are using the Docker container, allow non-network local connections to X11 so that the GUI can be started from the Docker container:

xhost +local:

Then, run a Docker container under the previously created cvnode workspace directory:


NOTE: If you built the image manually, e.g. under the name ros2-humble-cuda-torch, run:

DOCKER_IMAGE=ros2-humble-cuda-torch ./src/cvnode_base/examples/mask_rcnn/

This script starts the image with:

  • -v $(pwd):/data - mounts the current (cvnode) directory at /data in the container
  • -v /tmp/.X11-unix/:/tmp/.X11-unix/ - passes the X11 socket directory to the container (to allow running GUI applications)
  • -e DISPLAY=$DISPLAY, -e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR - adds X11-related environment variables
  • --gpus='all,"capabilities=compute,utility,graphics,display"' - adds GPUs to the container's context for compute and display purposes
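
Put together, the flags above correspond to a docker run invocation along the following lines. This is a minimal sketch, not the contents of the actual start script: the run_container helper, the DOCKER_CMD variable, and the -it and --rm options are assumptions made for illustration. By default the helper only prints the command so it can be inspected first.

```shell
# Hypothetical helper: assemble the docker invocation implied by the flags above.
# $1 - workspace directory on the host, mounted as /data
# $2 - image name
# DOCKER_CMD defaults to "echo docker" (dry run); set DOCKER_CMD=docker to execute.
run_container() (
    workspace="$1"
    image="$2"
    # -it and --rm are assumptions, not taken from the original script
    ${DOCKER_CMD:-echo docker} run -it --rm \
        -v "${workspace}:/data" \
        -v /tmp/.X11-unix/:/tmp/.X11-unix/ \
        -e "DISPLAY=${DISPLAY:-}" \
        -e "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-}" \
        --gpus='all,"capabilities=compute,utility,graphics,display"' \
        "${image}"
)

# Dry run from the cvnode workspace directory
run_container "$(pwd)" ros2-humble-cuda-torch
```

Printing first makes it easy to compare the resulting command with what the provided script actually runs before relying on it.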

Then, in the Docker container, install the NVIDIA graphics libraries that match your host's driver. To check the NVIDIA driver version, run:


Then check the Driver Version field in the output.

For example, for driver version 530.41.03, install the package matching its major version (530) in the container:

apt-get update && apt-get install libnvidia-gl-530
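
The package name follows from the major component of the driver version. A small sketch of that mapping is shown below; the gl_package_for helper is hypothetical, and the nvidia-smi query in the comment is one way to read the version on the host, assumed here and not taken from this demo.

```shell
# Hypothetical helper: map a driver version such as 530.41.03
# to the matching libnvidia-gl package name.
gl_package_for() {
    # ${1%%.*} drops everything from the first dot onward, leaving the major version
    echo "libnvidia-gl-${1%%.*}"
}

gl_package_for 530.41.03   # prints libnvidia-gl-530

# On the host (assumes nvidia-smi is available there):
# gl_package_for "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
```

The printed name can then be passed to apt-get install inside the container.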

Then, go to the workspace directory in the container:

cd /data

Finally, install Kenning:

pip install kenning/

Exporting MaskRCNN to TorchScript

The MaskRCNN model can be exported to TorchScript with the export script. The script takes the following arguments:

  • --image - path to the image to run inference on
  • --output - path to the directory where the exported model will be stored
  • --method - export method, one of: onnx, torchscript
  • --num-classes - optional number of classes to use in the model architecture
  • --weights - optional path to the file storing the weights. By default, COCO pre-trained weights are fetched from the model zoo.

For example, to export the model to TorchScript and place it in the config directory, run:

curl --output image.jpg

/data/src/cvnode_base/examples/mask_rcnn/ \
    --image image.jpg \
    --output /data/src/cvnode_base/examples/config \
    --method torchscript

This will download an image from the COCO dataset and export the model to the config directory. Later, the model can be loaded with the launch file.
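
Before moving on, it can help to confirm that the export actually produced a model file. The sketch below assumes the exported TorchScript file is named model.ts, as suggested by the model_path value used in the C++ launch example in this document; the check_model helper is hypothetical.

```shell
# Hypothetical helper: verify that an exported model file exists.
check_model() {
    if [ -f "$1" ]; then
        echo "found model: $1"
    else
        echo "missing model: $1" >&2
        return 1
    fi
}

# The file name model.ts is an assumption based on the launch example
check_model /data/src/cvnode_base/examples/config/model.ts \
    || echo "export the model before launching the C++ backend"
```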

Building the MaskRCNN demo

First, source the ROS2 environment:

source /opt/ros/

Then, build the demo packages:

colcon build --base-path=src/ --packages-select \
    kenning_computer_vision_msgs \
    cvnode_base \
    cvnode_manager \
    --cmake-args -DBUILD_GUI=ON -DBUILD_TORCHVISION=ON

Here, the --cmake-args are:

  • -DBUILD_GUI=ON - builds the GUI for CVNodeManager
  • -DBUILD_TORCHVISION=ON - builds the TorchVision library needed for MaskRCNN

Source the build targets with:

source install/

Running the MaskRCNN demo

CVNode provides two launch scripts for running the demo:

  • - runs the MaskRCNN node with Python Detectron2 backend
  • - runs the MaskRCNN node with C++ TorchScript backend

To run a sample launch with the Python backend, use:

ros2 launch cvnode_base \
    class_names_path:=/data/src/cvnode_base/examples/config/coco_classes.csv \
    inference_configuration:=/data/src/cvnode_base/examples/config/coco_inference.json \
    publish_visualizations:=True \
    preserve_output:=False \
    scenario:=real_world_last \
    inference_timeout_ms:=100 \
    measurements:=/data/build/ros2_detectron_measurements.json \
    report_path:=/data/build/reports/detectron_real_world_last/

For a C++ backend, run:

ros2 launch cvnode_base \
    model_path:=/data/src/cvnode_base/examples/config/model.ts \
    class_names_path:=/data/src/cvnode_base/examples/config/coco_classes.csv \
    inference_configuration:=/data/src/cvnode_base/examples/config/coco_inference.json \
    publish_visualizations:=True \
    preserve_output:=False \
    scenario:=real_world_last \
    inference_timeout_ms:=100 \
    measurements:=/data/build/ros2_torchscript_measurements.json \
    report_path:=/data/build/reports/torchscript_real_world_last/

Here, the parameters are:

  • model_path - path to a TorchScript model
  • class_names_path - path to a CSV file with class names
  • inference_configuration - path to a JSON file with Kenning's inference configuration
  • publish_visualizations - whether to publish visualizations for the GUI
  • preserve_output - whether to preserve the output of the last inference if the timeout is reached
  • scenario - scenario for running the demo, one of:
    • real_world_last - tries to process the last received frame within the timeout
    • real_world_first - tries to process the first received frame
    • synthetic - ignores the timeout and processes frames as fast as possible
  • inference_timeout_ms - timeout for inference in milliseconds. Used only by the real_world scenarios
  • measurements - path to the file where inference measurements will be stored
  • report_path - path where the rendered report will be stored
  • log_level - log level for running the demo.
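
As a rough mental model of how the three scenarios differ, consider the frames that arrive while one inference is still running. The sketch below is illustrative only: the select_frames helper and the frame labels are hypothetical, and it does not describe how CVNodeManager implements the scenarios internally.

```shell
# Hypothetical model: given a scenario and the frames received during one
# inference window, print the frame(s) that would be handed to the node.
select_frames() (
    scenario="$1"
    shift
    last=""
    case "$scenario" in
        real_world_last)
            # keep only the most recently received frame
            for frame in "$@"; do last="$frame"; done
            echo "$last"
            ;;
        real_world_first)
            # keep only the first received frame
            echo "${1:-}"
            ;;
        synthetic)
            # no timeout: every frame is processed, in order
            echo "$*"
            ;;
        *)
            echo "unknown scenario: $scenario" >&2
            exit 1
            ;;
    esac
)

select_frames real_world_last  f1 f2 f3   # prints f3
select_frames real_world_first f1 f2 f3   # prints f1
select_frames synthetic        f1 f2 f3   # prints f1 f2 f3
```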

The produced reports can later be found in the /data/build/reports directory.
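
The measurement and report paths in both launch examples follow the same naming pattern, so the shared arguments can be generated per backend and scenario when comparing several runs. The following is a sketch under that assumption; the launch_args helper and its tag argument are hypothetical and not part of cvnode_base.

```shell
# Hypothetical helper: print the launch arguments shared by both backends.
# $1 - tag used in the output paths (detectron or torchscript in the examples)
# $2 - scenario (real_world_last, real_world_first or synthetic)
launch_args() (
    tag="$1"
    scenario="$2"
    config=/data/src/cvnode_base/examples/config
    printf '%s\n' \
        "class_names_path:=${config}/coco_classes.csv" \
        "inference_configuration:=${config}/coco_inference.json" \
        "publish_visualizations:=True" \
        "preserve_output:=False" \
        "scenario:=${scenario}" \
        "inference_timeout_ms:=100" \
        "measurements:=/data/build/ros2_${tag}_measurements.json" \
        "report_path:=/data/build/reports/${tag}_${scenario}/"
)

launch_args torchscript synthetic
```

Since none of the values contain spaces, the output can be appended unquoted to a ros2 launch command through command substitution. The C++ backend additionally needs model_path, which this helper leaves out.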