Merge pull request #48 from NVIDIA-ISAAC-ROS/release-3.2
Isaac ROS 3.2
jaiveersinghNV authored Dec 11, 2024
2 parents 4c47edf + 46da6f1 commit 06a676b
Showing 45 changed files with 670 additions and 212 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -63,10 +63,10 @@ This package is powered by [NVIDIA Isaac Transport for ROS (NITROS)](https://dev

## Performance

| Sample Graph<br/><br/> | Input Size<br/><br/> | AGX Orin<br/><br/> | Orin NX<br/><br/> | Orin Nano 8GB<br/><br/> | x86_64 w/ RTX 4060 Ti<br/><br/> | x86_64 w/ RTX 4090<br/><br/> |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [RT-DETR Object Detection Graph](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/benchmarks/isaac_ros_rtdetr_benchmark/scripts/isaac_ros_rtdetr_graph.py)<br/><br/><br/>SyntheticaDETR<br/><br/> | 720p<br/><br/><br/><br/> | [71.9 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-agx_orin.json)<br/><br/><br/>24 ms @ 30Hz<br/><br/> | [30.8 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-orin_nx.json)<br/><br/><br/>41 ms @ 30Hz<br/><br/> | [21.3 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-orin_nano.json)<br/><br/><br/>61 ms @ 30Hz<br/><br/> | [205 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-nuc_4060ti.json)<br/><br/><br/>8.7 ms @ 30Hz<br/><br/> | [400 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-x86_4090.json)<br/><br/><br/>6.3 ms @ 30Hz<br/><br/> |
| [DetectNet Object Detection Graph](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/benchmarks/isaac_ros_detectnet_benchmark/scripts/isaac_ros_detectnet_graph.py)<br/><br/><br/><br/> | 544p<br/><br/><br/><br/> | [165 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-agx_orin.json)<br/><br/><br/>20 ms @ 30Hz<br/><br/> | [115 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-orin_nx.json)<br/><br/><br/>26 ms @ 30Hz<br/><br/> | [63.2 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-orin_nano.json)<br/><br/><br/>36 ms @ 30Hz<br/><br/> | [488 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-nuc_4060ti.json)<br/><br/><br/>10 ms @ 30Hz<br/><br/> | [589 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-x86_4090.json)<br/><br/><br/>10 ms @ 30Hz<br/><br/> |
| Sample Graph<br/><br/> | Input Size<br/><br/> | AGX Orin<br/><br/> | Orin NX<br/><br/> | Orin Nano 8GB<br/><br/> | x86_64 w/ RTX 4090<br/><br/> |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [RT-DETR Object Detection Graph](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/benchmarks/isaac_ros_rtdetr_benchmark/scripts/isaac_ros_rtdetr_graph.py)<br/><br/><br/>SyntheticaDETR<br/><br/> | 720p<br/><br/><br/><br/> | [56.5 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-agx_orin.json)<br/><br/><br/>30 ms @ 30Hz<br/><br/> | [33.8 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-orin_nx.json)<br/><br/><br/>39 ms @ 30Hz<br/><br/> | [24.1 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-orin_nano.json)<br/><br/><br/>53 ms @ 30Hz<br/><br/> | [490 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_rtdetr_graph-x86-4090.json)<br/><br/><br/>7.1 ms @ 30Hz<br/><br/> |
| [DetectNet Object Detection Graph](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/benchmarks/isaac_ros_detectnet_benchmark/scripts/isaac_ros_detectnet_graph.py)<br/><br/><br/><br/> | 544p<br/><br/><br/><br/> | [70.5 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-agx_orin.json)<br/><br/><br/>26 ms @ 30Hz<br/><br/> | [30.1 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-orin_nx.json)<br/><br/><br/>46 ms @ 30Hz<br/><br/> | [22.9 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-orin_nano.json)<br/><br/><br/>57 ms @ 30Hz<br/><br/> | [254 fps](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_benchmark/blob/main/results/isaac_ros_detectnet_graph-x86-4090.json)<br/><br/><br/>11 ms @ 30Hz<br/><br/> |

---

@@ -97,4 +97,4 @@ Please visit the [Isaac ROS Documentation](https://nvidia-isaac-ros.github.io/re

## Latest

-Update 2024-09-26: Update for ZED compatibility
+Update 2024-12-10: Update to be compatible with JetPack 6.1
6 changes: 6 additions & 0 deletions gxf_isaac_detectnet/CMakeLists.txt
@@ -63,4 +63,10 @@ set_target_properties(${PROJECT_NAME} PROPERTIES
# Install the binary file
install(TARGETS ${PROJECT_NAME} DESTINATION share/${PROJECT_NAME}/gxf/lib)

+
+# Embed versioning information into installed files
+ament_index_get_resource(ISAAC_ROS_COMMON_CMAKE_PATH isaac_ros_common_cmake_path isaac_ros_common)
+include("${ISAAC_ROS_COMMON_CMAKE_PATH}/isaac_ros_common-version-info.cmake")
+generate_version_info(${PROJECT_NAME})
+
ament_auto_package(INSTALL_TO_SHARE)
4 changes: 3 additions & 1 deletion gxf_isaac_detectnet/gxf/detectnet/detectnet.cpp
@@ -14,11 +14,13 @@
// limitations under the License.
//
// SPDX-License-Identifier: Apache-2.0

#include <string>
#include <vector>

#include "detectnet/detectnet_decoder.hpp"
#include "gxf/core/gxf.h"
#include "gxf/std/extension_factory_helper.hpp"
#include "detectnet/detectnet_decoder.hpp"

extern "C" {

63 changes: 33 additions & 30 deletions gxf_isaac_detectnet/gxf/detectnet/detectnet_decoder.cpp
@@ -15,18 +15,21 @@
//
// SPDX-License-Identifier: Apache-2.0

#include "detectnet_decoder.hpp"
#include "detection2_d_array_message.hpp"

#include <string>
#include <climits>
#include <memory>
#include <string>
#include <vector>

#include "./detection2_d_array_message.hpp"
#include "./detectnet_decoder.hpp"

#include "cuda.h"
#include "cuda_runtime.h"

#include "gxf/core/parameter_parser_std.hpp"
#include "gxf/multimedia/camera.hpp"
#include "gxf/multimedia/video.hpp"
#include "gxf/core/parameter_parser_std.hpp"
#include "gxf/std/timestamp.hpp"
#include "cuda.h"
#include "cuda_runtime.h"


namespace nvidia
@@ -67,10 +70,10 @@ NvDsInferObjectDetectionInfo GetNewDetectionInfo(
}

void FillMessage(
-Detection2DParts &message_parts,
-const std::vector<NvDsInferObjectDetectionInfo> &detection_info_vector,
+Detection2DParts& message_parts,
+const std::vector<NvDsInferObjectDetectionInfo>& detection_info_vector,
gxf::Handle<nvidia::gxf::Timestamp> tensorlist_timestamp,
-size_t num_detections, const std::vector<std::string> &label_list)
+size_t num_detections, const std::vector<std::string>& label_list)
{
for (uint32_t i = 0; i < num_detections; i++) {
NvDsInferObjectDetectionInfo detection_info = detection_info_vector[i];
@@ -89,7 +92,7 @@ void FillMessage(
}
*(message_parts.timestamp) = *tensorlist_timestamp;
}
-} // anonymous namespace
+}  // anonymous namespace


gxf_result_t DetectnetDecoder::registerInterface(gxf::Registrar * registrar) noexcept
@@ -106,8 +109,8 @@ gxf_result_t DetectnetDecoder::registerInterface(gxf::Registrar * registrar) noe

result &= registrar->parameter(
label_list_, "label_list", "List of network labels",
-"List of labels corresponding to the int labels received from the tensors", {"person", "bag",
-"face"});
+"List of labels corresponding to the int labels received from the tensors",
+{"person", "bag", "face"});

result &= registrar->parameter(
enable_confidence_threshold_, "enable_confidence_threshold", "Enable Confidence Threshold",
@@ -131,27 +134,27 @@ gxf_result_t DetectnetDecoder::registerInterface(gxf::Registrar * registrar) noe

result &= registrar->parameter(
dbscan_confidence_threshold_, "dbscan_confidence_threshold", "Dbscan Confidence Threshold",
-"Minimum score in a cluster for the cluster to be considered an object \
-during grouping. Different clustering may cause the algorithm \
-to use different scores.",
+"Minimum score in a cluster for the cluster to be considered an object "
+"during grouping. Different clustering may cause the algorithm "
+"to use different scores.",
0.6);

result &= registrar->parameter(
dbscan_eps_, "dbscan_eps", "Dbscan Epsilon",
-"Holds the epsilon to control merging of overlapping boxes. \
-Refer to OpenCV groupRectangles and DBSCAN documentation for more information on epsilon. ",
+"Holds the epsilon to control merging of overlapping boxes. "
+"Refer to OpenCV groupRectangles and DBSCAN documentation for more information on epsilon. ",
0.01);

result &= registrar->parameter(
dbscan_min_boxes_, "dbscan_min_boxes", "Dbscan Minimum Boxes",
-"Holds the minimum number of boxes in a cluster to be considered \
-an object during grouping using DBSCAN",
+"Holds the minimum number of boxes in a cluster to be considered "
+"an object during grouping using DBSCAN",
1);

result &= registrar->parameter(
dbscan_enable_athr_filter_, "dbscan_enable_athr_filter", "Dbscan Enable Athr Filter",
-"true enables the area-to-hit ratio (ATHR) filter. \
-The ATHR is calculated as: ATHR = sqrt(clusterArea) / nObjectsInCluster.",
+"true enables the area-to-hit ratio (ATHR) filter. "
+"The ATHR is calculated as: ATHR = sqrt(clusterArea) / nObjectsInCluster.",
0);

result &= registrar->parameter(
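
The ATHR formula quoted in the `dbscan_enable_athr_filter` description can be written out as a standalone function. This is an illustrative sketch only: `AreaToHitRatio` and its parameter names are hypothetical and are not taken from the DeepStream DBSCAN sources, which may compute or apply the ratio differently.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the formula in the parameter description:
//   ATHR = sqrt(clusterArea) / nObjectsInCluster
double AreaToHitRatio(double cluster_area, int n_objects_in_cluster) {
  // A cluster needs a non-negative area and at least one member box.
  assert(cluster_area >= 0.0);
  assert(n_objects_in_cluster > 0);
  return std::sqrt(cluster_area) / n_objects_in_cluster;
}
```

Under this formula, a cluster of area 400 that contains 4 boxes has an ATHR of 5.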
@@ -196,7 +199,6 @@ gxf_result_t DetectnetDecoder::start() noexcept

gxf_result_t DetectnetDecoder::tick() noexcept
{
-
gxf::Expected<void> result;

// Receive disparity image and left/right camera info
@@ -272,9 +274,10 @@ gxf_result_t DetectnetDecoder::tick() noexcept
return GXF_FAILURE;
}

-float bbox_tensor_arr[bbox_tensor->size() / sizeof(float)]; // since data in tensor is kFloat32
+// data in tensor is kFloat32
+std::vector<float> bbox_tensor_arr(bbox_tensor->size() / sizeof(float));
const cudaError_t cuda_error_bbox_tensor = cudaMemcpy(
-&bbox_tensor_arr, bbox_tensor->pointer(),
+bbox_tensor_arr.data(), bbox_tensor->pointer(),
bbox_tensor->size(), cudaMemcpyDeviceToHost);
if (cuda_error_bbox_tensor != cudaSuccess) {
GXF_LOG_ERROR("Error while copying kernel: %s", cudaGetErrorString(cuda_error_bbox_tensor));
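
The hunk above swaps a variable-length stack array for a `std::vector<float>` and passes `.data()` as the copy destination. Variable-length arrays are a compiler extension in C++ rather than standard, and `&vec` on a vector would point at the vector object instead of its elements. A minimal host-only sketch of the same pattern follows; `CopyTensorToHost` is a hypothetical name, and `std::memcpy` stands in for `cudaMemcpy(..., cudaMemcpyDeviceToHost)` so that the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `size_bytes` of kFloat32 data from `src` into a
// heap-allocated host buffer. std::memcpy is a stand-in for cudaMemcpy here.
std::vector<float> CopyTensorToHost(const void* src, std::size_t size_bytes) {
  assert(src != nullptr);
  // One element per whole float in the source buffer.
  std::vector<float> host(size_bytes / sizeof(float));
  if (!host.empty()) {
    // .data() is the address of the first element, which is what a copy
    // routine expects as its destination pointer.
    std::memcpy(host.data(), src, host.size() * sizeof(float));
  }
  return host;
}
```

Because the buffer lives on the heap, its size is not limited by the stack, which is one plausible reason for the change when tensors are large.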
@@ -315,8 +318,8 @@ gxf_result_t DetectnetDecoder::tick() noexcept
float coverage = cov_tensor_arr[cov_pos];

// Center of the grid in pixels
-float grid_center_y = (row + bounding_box_offset_ ) * kStride;
-float grid_center_x = (col + bounding_box_offset_ ) * kStride;
+float grid_center_y = (row + bounding_box_offset_) * kStride;
+float grid_center_x = (col + bounding_box_offset_) * kStride;

// Get each element of the bounding box
float bbox[kBoundingBoxParams];
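
The grid-center computation in this hunk maps a cell index in the network's output grid back to pixel coordinates. A small sketch of that mapping, assuming a hypothetical `ComputeGridCenter` helper that takes the offset and stride as plain arguments (in the decoder they come from `bounding_box_offset_` and `kStride`):

```cpp
#include <cassert>

// Pixel coordinates of the center of one output-grid cell.
struct GridCenter {
  float x;
  float y;
};

// Hypothetical helper: same arithmetic as the hunk above,
//   center = (index + offset) * stride
GridCenter ComputeGridCenter(int row, int col, float bounding_box_offset,
                             float stride) {
  assert(stride > 0.0f);
  return GridCenter{(col + bounding_box_offset) * stride,
                    (row + bounding_box_offset) * stride};
}
```

With an illustrative offset of 0.5 and stride of 16 (both assumed values for this example), the cell at row 2, column 3 has its center at pixel (56, 40).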
@@ -342,7 +345,8 @@ gxf_result_t DetectnetDecoder::tick() noexcept
// check if object_class is out of range for label_list_
if (static_cast<size_t>(object_class) >= label_list_.get().size()) {
GXF_LOG_ERROR(
-"[DetectNet Decoder] object_class %i is out of range for provided label_list_ of size %lu", object_class,
+"[DetectNet Decoder] object_class %i is out of range for provided "
+"label_list_ of size %lu", object_class,
label_list_.get().size());
return GXF_FAILURE;
}
@@ -360,7 +364,7 @@ gxf_result_t DetectnetDecoder::tick() noexcept

size_t num_detections = detection_info_vector.size();
if (enable_dbscan_clustering_) {
-NvDsInferObjectDetectionInfo * detection_info_pointer = &detection_info_vector[0];
+NvDsInferObjectDetectionInfo* detection_info_pointer = &detection_info_vector[0];
NvDsInferDBScanHandle dbscan_hdl = NvDsInferDBScanCreate();
if (dbscan_clustering_algorithm_ == kDbscanCluster) {
NvDsInferDBScanCluster(dbscan_hdl, &params_, detection_info_pointer, &num_detections);
@@ -386,7 +390,6 @@ gxf_result_t DetectnetDecoder::tick() noexcept
num_detections, label_list_);
return detections_transmitter_->publish(message_parts.message);
}));
-
}
} // namespace isaac_ros
} // namespace nvidia
10 changes: 7 additions & 3 deletions gxf_isaac_detectnet/gxf/detectnet/detectnet_decoder.hpp
@@ -18,15 +18,19 @@
#ifndef NVIDIA_ISAAC_ROS_EXTENSIONS_DETECTNET_DECODER_HPP_
#define NVIDIA_ISAAC_ROS_EXTENSIONS_DETECTNET_DECODER_HPP_

+#include <string>
+#include <vector>
+
+#include "./detection2_d_array_message.hpp"
+#include "deepstream_utils/nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp"
+
#include "gxf/core/entity.hpp"
#include "gxf/core/gxf.h"
#include "gxf/core/parameter.hpp"
-#include "gxf/std/codelet.hpp"
#include "gxf/core/parameter_parser_std.hpp"
+#include "gxf/std/codelet.hpp"
#include "gxf/std/receiver.hpp"
#include "gxf/std/transmitter.hpp"
-#include "detection2_d_array_message.hpp"
-#include "deepstream_utils/nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp"


namespace nvidia
2 changes: 1 addition & 1 deletion gxf_isaac_detectnet/package.xml
@@ -21,7 +21,7 @@ SPDX-License-Identifier: Apache-2.0
<?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
<package format="3">
<name>gxf_isaac_detectnet</name>
-<version>3.1.0</version>
+<version>3.2.0</version>
<description>Detectnet GXF extension.</description>

<maintainer email="[email protected]">Isaac ROS Maintainers</maintainer>
8 changes: 7 additions & 1 deletion isaac_ros_detectnet/CMakeLists.txt
@@ -47,7 +47,7 @@ if(BUILD_TESTING)
endif()

find_package(launch_testing_ament_cmake REQUIRED)
-add_launch_test(test/isaac_ros_detectnet_pol_test.py TIMEOUT "600")
+add_launch_test(test/isaac_ros_detectnet_pol_test.py TIMEOUT "900")
endif()

# Visualizer python scripts
@@ -63,4 +63,10 @@ install(DIRECTORY
DESTINATION share/${PROJECT_NAME}
)

+
+# Embed versioning information into installed files
+ament_index_get_resource(ISAAC_ROS_COMMON_CMAKE_PATH isaac_ros_common_cmake_path isaac_ros_common)
+include("${ISAAC_ROS_COMMON_CMAKE_PATH}/isaac_ros_common-version-info.cmake")
+generate_version_info(${PROJECT_NAME})
+
ament_auto_package(INSTALL_TO_SHARE config launch)
6 changes: 3 additions & 3 deletions isaac_ros_detectnet/config/hawk_config.pbtxt
@@ -3,20 +3,20 @@ platform: "tensorrt_plan"
max_batch_size: 16
input [
{
-name: "input_1"
+name: "input_1:0"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [ 3, 1200, 1920 ]
}
]
output [
{
-name: "output_bbox/BiasAdd"
+name: "output_bbox/BiasAdd:0"
data_type: TYPE_FP32
dims: [ 12, 75, 120]
},
{
-name: "output_cov/Sigmoid"
+name: "output_cov/Sigmoid:0"
data_type: TYPE_FP32
dims: [ 3, 75, 120]
}
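
The `dims` values in this config are consistent with a simple relationship between the input resolution and the two output tensors: the grid is the input height and width divided by 16 (rounded up), the coverage tensor has one channel per class, and the bounding-box tensor has four channels per class. The sketch below encodes that relationship; the stride of 16, the four box parameters per class, and the name `ExpectedOutputDims` are assumptions inferred from the numbers in these config files, not values confirmed by the commit.

```cpp
#include <array>
#include <cassert>

// Expected output shapes, in {channels, grid_h, grid_w} order.
struct OutputDims {
  std::array<int, 3> bbox;  // {num_classes * box_params, grid_h, grid_w}
  std::array<int, 3> cov;   // {num_classes, grid_h, grid_w}
};

// Hypothetical helper. The defaults (stride 16, 4 box parameters per class)
// are assumptions inferred from the config values, not documented constants.
OutputDims ExpectedOutputDims(int num_classes, int height, int width,
                              int stride = 16, int box_params = 4) {
  assert(num_classes > 0 && height > 0 && width > 0);
  assert(stride > 0 && box_params > 0);
  // Ceiling division, so that a 632-pixel height maps to 40 grid rows.
  const int grid_h = (height + stride - 1) / stride;
  const int grid_w = (width + stride - 1) / stride;
  return OutputDims{{num_classes * box_params, grid_h, grid_w},
                    {num_classes, grid_h, grid_w}};
}
```

For the 3-class, 1200 x 1920 input in this file, the sketch yields `[12, 75, 120]` and `[3, 75, 120]`, matching the `output_bbox` and `output_cov` entries. A check like this can help catch a mismatched config when the input resolution changes.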
6 changes: 3 additions & 3 deletions isaac_ros_detectnet/config/isaac_sim_config.pbtxt
@@ -3,20 +3,20 @@ platform: "tensorrt_plan"
max_batch_size: 16
input [
{
-name: "input_1"
+name: "input_1:0"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [ 3, 720, 1280 ]
}
]
output [
{
-name: "output_bbox/BiasAdd"
+name: "output_bbox/BiasAdd:0"
data_type: TYPE_FP32
dims: [ 12, 45, 80]
},
{
-name: "output_cov/Sigmoid"
+name: "output_cov/Sigmoid:0"
data_type: TYPE_FP32
dims: [ 3, 45, 80]
}
12 changes: 6 additions & 6 deletions isaac_ros_detectnet/config/peoplenet_config.pbtxt
@@ -3,22 +3,22 @@ platform: "tensorrt_plan"
max_batch_size: 16
input [
{
-name: "input_1"
+name: "input_1:0"
data_type: TYPE_FP32
format: FORMAT_NCHW
-dims: [ 3, 544, 960 ]
+dims: [ 3, 544, 960]
}
]
output [
{
-name: "output_bbox/BiasAdd"
+name: "output_bbox/BiasAdd:0"
data_type: TYPE_FP32
-dims: [ 12, 34, 60 ]
+dims: [ 12, 34, 60]
},
{
-name: "output_cov/Sigmoid"
+name: "output_cov/Sigmoid:0"
data_type: TYPE_FP32
-dims: [ 3, 34, 60 ]
+dims: [ 3, 34, 60]
}
]
dynamic_batching { }
12 changes: 6 additions & 6 deletions isaac_ros_detectnet/config/quickstart_config.pbtxt
@@ -3,22 +3,22 @@ platform: "tensorrt_plan"
max_batch_size: 16
input [
{
-name: "input_1"
+name: "input_1:0"
data_type: TYPE_FP32
format: FORMAT_NCHW
-dims: [ 3, 632, 1200 ]
+dims: [ 3, 544, 960]
}
]
output [
{
-name: "output_bbox/BiasAdd"
+name: "output_bbox/BiasAdd:0"
data_type: TYPE_FP32
-dims: [ 12, 40, 75]
+dims: [ 12, 34, 60]
},
{
-name: "output_cov/Sigmoid"
+name: "output_cov/Sigmoid:0"
data_type: TYPE_FP32
-dims: [ 3, 40, 75]
+dims: [ 3, 34, 60]
}
]
dynamic_batching { }
6 changes: 3 additions & 3 deletions isaac_ros_detectnet/config/realsense_config.pbtxt
@@ -3,20 +3,20 @@ platform: "tensorrt_plan"
max_batch_size: 16
input [
{
-name: "input_1"
+name: "input_1:0"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [ 3, 480, 640 ]
}
]
output [
{
-name: "output_bbox/BiasAdd"
+name: "output_bbox/BiasAdd:0"
data_type: TYPE_FP32
dims: [ 12, 30, 40]
},
{
-name: "output_cov/Sigmoid"
+name: "output_cov/Sigmoid:0"
data_type: TYPE_FP32
dims: [ 3, 30, 40]
}