ROS2 Jazzy fails to build because of OpenCV #702

0Unkn0wn · 2024-10-28T01:28:14Z

I tried building an image with ROS2 Jazzy and JAX, but OpenCV fails to install, which stops the image from being created.
I also tried Humble, and it fails at the same step when the Python version is set to 3.11. Any idea what is going wrong?

Command:
PYTHON_VERSION=3.11 jetson-containers build --name=ros2jazzy_jax_base ros:jazzy-desktop jax:0.4.32
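
To make the log easier to follow, here is a minimal sketch of how the intermediate image tags appear to be named. The naming rule and the `stage_tag` helper are hypothetical: they are inferred from the tags printed in the log below, not taken from the jetson-containers source, and the tags for stages that do not appear in the log (such as `opencv`) are only a guess.

```python
# Sketch: reconstruct the per-stage image tags seen in the build log.
# ASSUMPTION: the rule "<name>:r<L4T>-cp<pyver>-<package with ':' -> '_'>"
# is inferred from the log output, not from the jetson-containers code.

def stage_tag(name: str, l4t_version: str, python_version: str, package: str) -> str:
    """Return the tag a build stage appears to receive."""
    cp = "cp" + python_version.replace(".", "")   # "3.11" -> "cp311"
    suffix = package.replace(":", "_")            # "cuda:12.6" -> "cuda_12.6"
    return f"{name}:r{l4t_version}-{cp}-{suffix}"

# Build order as printed on the "-- Building containers" line of the log.
packages = [
    "build-essential", "pip_cache:cu126", "cuda:12.6", "cudnn:9.4",
    "python", "tensorrt", "numpy", "opencv", "cmake",
    "ros:jazzy-desktop", "jax:0.4.32",
]

tags = [stage_tag("ros2jazzy_jax_base", "36.4.0", "3.11", p) for p in packages]

for tag in tags:
    print(tag)
```

If the rule holds, comparing this list against the output of `docker images` should show which stage was the last to be tagged, and therefore which one failed next.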

Build log:

jetson@ubuntu ~/r/jetson-containers (master)> PYTHON_VERSION=3.11 jetson-containers build --name=ros2jazzy_jax_base ros:jazzy-desktop jax:0.4.32
Namespace(packages=['ros:jazzy-desktop', 'jax:0.4.32'], name='ros2jazzy_jax_base', base='', multiple=False, build_flags='', build_args='', package_dirs=[''], list_packages=False, show_packages=False, skip_packages=[''], skip_errors=False, skip_tests=[''], test_only=[''], simulate=False, push='', logs='', verbose=False, no_github_api=False)
-- L4T_VERSION=36.4.0
-- JETPACK_VERSION=6.1
-- CUDA_VERSION=12.6
-- PYTHON_VERSION=3.11
-- LSB_RELEASE=22.04 (jammy)
-- Building containers  ['build-essential', 'pip_cache:cu126', 'cuda:12.6', 'cudnn:9.4', 'python', 'tensorrt', 'numpy', 'opencv', 'cmake', 'ros:jazzy-desktop', 'jax:0.4.32']
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-build-essential

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-build-essential \
--file /home/jetson/repos/jetson-containers/packages/build/build-essential/Dockerfile \
--build-arg BASE_IMAGE=ubuntu:22.04 \
/home/jetson/repos/jetson-containers/packages/build/build-essential \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-build-essential.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  19.97kB
Step 1/5 : ARG BASE_IMAGE
Step 2/5 : FROM ${BASE_IMAGE}
 ---> 981912c48e9a
Step 3/5 : ENV DEBIAN_FRONTEND=noninteractive     LANGUAGE=en_US:en     LANG=en_US.UTF-8     LC_ALL=en_US.UTF-8
 ---> Using cache
 ---> 978f6db217f5
Step 4/5 : RUN set -ex     && apt-get update     && apt-get install -y --no-install-recommends         locales         locales-all         tzdata     && locale-gen en_US $LANG     && update-locale LC_ALL=$LC_ALL LANG=$LANG     && locale         && apt-get install -y --no-install-recommends         build-essential         software-properties-common         apt-transport-https         ca-certificates         lsb-release         pkg-config         gnupg         git         gdb         wget         curl         nano         zip         unzip         time         sshpass         ssh-client     && apt-get clean     && rm -rf /var/lib/apt/lists/*         && gcc --version     && g++ --version
 ---> Using cache
 ---> bdeb5fdb0222
Step 5/5 : COPY tarpack /usr/local/bin/
 ---> Using cache
 ---> d87978b51f05
Successfully built d87978b51f05
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-build-essential
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-pip_cache_cu126

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-pip_cache_cu126 \
--file /home/jetson/repos/jetson-containers/packages/cuda/cuda/Dockerfile.pip \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-build-essential \
--build-arg TAR_INDEX_URL="http://jetson.webredirect.org:8000/jp6/cu126" \
--build-arg PIP_INDEX_REPO="http://jetson.webredirect.org/jp6/cu126" \
--build-arg PIP_TRUSTED_HOSTS="jetson.webredirect.org" \
--build-arg PIP_UPLOAD_REPO="http://localhost/jp6/cu126" \
--build-arg PIP_UPLOAD_USER="jp6" \
--build-arg PIP_UPLOAD_PASS="none" \
--build-arg SCP_UPLOAD_URL="localhost:/dist/jp6/cu126" \
--build-arg SCP_UPLOAD_USER="None" \
--build-arg SCP_UPLOAD_PASS="None" \
/home/jetson/repos/jetson-containers/packages/cuda/cuda \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-pip_cache_cu126.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  49.66kB
Step 1/4 : ARG BASE_IMAGE
Step 2/4 : FROM ${BASE_IMAGE}
 ---> d87978b51f05
Step 3/4 : ARG PIP_INDEX_REPO     PIP_UPLOAD_REPO     PIP_UPLOAD_USER     PIP_UPLOAD_PASS     PIP_TRUSTED_HOSTS     TAR_INDEX_URL     SCP_UPLOAD_URL     SCP_UPLOAD_USER     SCP_UPLOAD_PASS
 ---> Using cache
 ---> 6582b01182d2
Step 4/4 : ENV TAR_INDEX_URL=${TAR_INDEX_URL}     PIP_INDEX_URL=${PIP_INDEX_REPO}     PIP_TRUSTED_HOST=${PIP_TRUSTED_HOSTS}     TWINE_REPOSITORY_URL=${PIP_UPLOAD_REPO}     TWINE_USERNAME=${PIP_UPLOAD_USER}     TWINE_PASSWORD=${PIP_UPLOAD_PASS}     SCP_UPLOAD_URL=${SCP_UPLOAD_URL}     SCP_UPLOAD_USER=${SCP_UPLOAD_USER}     SCP_UPLOAD_PASS=${SCP_UPLOAD_PASS}
 ---> Using cache
 ---> 9dd566839576
Successfully built 9dd566839576
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-pip_cache_cu126
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6 \
--file /home/jetson/repos/jetson-containers/packages/cuda/cuda/Dockerfile \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-pip_cache_cu126 \
--build-arg CUDA_URL="https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda-tegra-repo-ubuntu2204-12-6-local_12.6.2-1_arm64.deb" \
--build-arg CUDA_DEB="cuda-tegra-repo-ubuntu2204-12-6-local" \
--build-arg CUDA_PACKAGES="cuda-toolkit*" \
--build-arg CUDA_ARCH_LIST="87" \
--build-arg DISTRO="ubuntu2204" \
/home/jetson/repos/jetson-containers/packages/cuda/cuda \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-cuda_12.6.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  49.66kB
Step 1/9 : ARG BASE_IMAGE
Step 2/9 : FROM ${BASE_IMAGE}
 ---> 9dd566839576
Step 3/9 : ARG CUDA_URL     CUDA_DEB     CUDA_PACKAGES     CUDA_ARCH_LIST     DISTRO="ubuntu2004"
 ---> Using cache
 ---> b99ece155c1e
Step 4/9 : COPY install.sh /tmp/install_cuda.sh
 ---> Using cache
 ---> caddd9124125
Step 5/9 : RUN /tmp/install_cuda.sh
 ---> Using cache
 ---> 5116adb3d258
Step 6/9 : ENV CUDA_HOME="/usr/local/cuda"
 ---> Using cache
 ---> da5a6f088226
Step 7/9 : ENV NVCC_PATH="$CUDA_HOME/bin/nvcc"
 ---> Using cache
 ---> 12e7f58b81a4
Step 8/9 : ENV NVIDIA_VISIBLE_DEVICES=all     NVIDIA_DRIVER_CAPABILITIES=all     CUDAARCHS=${CUDA_ARCH_LIST}     CUDA_ARCHITECTURES=${CUDA_ARCH_LIST}     CUDA_HOME="/usr/local/cuda"     CUDNN_LIB_PATH="/usr/lib/aarch64-linux-gnu"     CUDNN_LIB_INCLUDE_PATH="/usr/include"     CMAKE_CUDA_COMPILER=${NVCC_PATH}     CUDA_NVCC_EXECUTABLE=${NVCC_PATH}     CUDACXX=${NVCC_PATH}     TORCH_NVCC_FLAGS="-Xfatbin -compress-all"     CUDA_BIN_PATH="${CUDA_HOME}/bin"     CUDA_TOOLKIT_ROOT_DIR="${CUDA_HOME}"     PATH="$CUDA_HOME/bin:${PATH}"     LD_LIBRARY_PATH="${CUDA_HOME}/compat:${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"     DEBIAN_FRONTEND=noninteractive
 ---> Using cache
 ---> 2096d1917c5b
Step 9/9 : WORKDIR /
 ---> Using cache
 ---> 7ca03698e7f1
Successfully built 7ca03698e7f1
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6
-- Testing container ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6 (cuda:12.6/test.sh)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/cuda/cuda:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6 \
/bin/bash -c '/bin/bash test.sh' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-cuda_12.6_test.sh.txt; exit ${PIPESTATUS[0]}

{
   "cuda" : {
      "name" : "CUDA SDK",
      "version" : "12.6.2"
   },
   "cuda_cccl" : {
      "name" : "CUDA C++ Core Compute Libraries",
      "version" : "12.6.77"
   },
   "cuda_compat" : {
      "name" : "CUDA Specific Libraries",
      "version" : "12.6.36890662"
   },
   "cuda_cudart" : {
      "name" : "CUDA Runtime (cudart)",
      "version" : "12.6.77"
   },
   "cuda_cuobjdump" : {
      "name" : "cuobjdump",
      "version" : "12.6.77"
   },
   "cuda_cupti" : {
      "name" : "CUPTI",
      "version" : "12.6.80"
   },
   "cuda_cuxxfilt" : {
      "name" : "CUDA cu++ filt",
      "version" : "12.6.77"
   },
   "cuda_gdb" : {
      "name" : "CUDA GDB",
      "version" : "12.6.77"
   },
   "cuda_nvcc" : {
      "name" : "CUDA NVCC",
      "version" : "12.6.77"
   },
   "cuda_nvdisasm" : {
      "name" : "CUDA nvdisasm",
      "version" : "12.6.77"
   },
   "cuda_nvml_dev" : {
      "name" : "CUDA NVML Headers",
      "version" : "12.6.77"
   },
   "cuda_nvprune" : {
      "name" : "CUDA nvprune",
      "version" : "12.6.77"
   },
   "cuda_nvrtc" : {
      "name" : "CUDA NVRTC",
      "version" : "12.6.77"
   },
   "cuda_nvtx" : {
      "name" : "CUDA NVTX",
      "version" : "12.6.77"
   },
   "cuda_sanitizer_api" : {
      "name" : "CUDA Compute Sanitizer API",
      "version" : "12.6.77"
   },
   "libcublas" : {
      "name" : "CUDA cuBLAS",
      "version" : "12.6.3.3"
   },
   "libcudla" : {
      "name" : "CUDA cuDLA",
      "version" : "12.6.77"
   },
   "libcufft" : {
      "name" : "CUDA cuFFT",
      "version" : "11.3.0.4"
   },
   "libcufile" : {
      "name" : "GPUDirect Storage (cufile)",
      "version" : "1.11.1.6"
   },
   "libcurand" : {
      "name" : "CUDA cuRAND",
      "version" : "10.3.7.77"
   },
   "libcusolver" : {
      "name" : "CUDA cuSOLVER",
      "version" : "11.7.1.2"
   },
   "libcusparse" : {
      "name" : "CUDA cuSPARSE",
      "version" : "12.5.4.2"
   },
   "libnpp" : {
      "name" : "CUDA NPP",
      "version" : "12.3.1.54"
   },
   "libnvfatbin" : {
      "name" : "Fatbin interaction library",
      "version" : "12.6.77"
   },
   "libnvjitlink" : {
      "name" : "JIT Linker Library",
      "version" : "12.6.77"
   },
   "libnvjpeg" : {
      "name" : "CUDA nvJPEG",
      "version" : "12.3.3.54"
   },
   "nsight_compute" : {
      "name" : "Nsight Compute",
      "version" : "2024.3.2.3"
   },
   "nvidia_fs" : {
      "name" : "NVIDIA file-system",
      "version" : "2.22.3"
   }
}
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4 \
--file /home/jetson/repos/jetson-containers/packages/cuda/cudnn/Dockerfile \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-cuda_12.6 \
--build-arg CUDNN_URL="https://developer.download.nvidia.com/compute/cudnn/9.4.0/local_installers/cudnn-local-tegra-repo-ubuntu2204-9.4.0_1.0-1_arm64.deb" \
--build-arg CUDNN_DEB="cudnn-local-tegra-repo-ubuntu2204-9.4.0" \
--build-arg CUDNN_PACKAGES="libcudnn*-dev libcudnn*-samples" \
/home/jetson/repos/jetson-containers/packages/cuda/cudnn \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-cudnn_9.4.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  26.62kB
Step 1/7 : ARG BASE_IMAGE
Step 2/7 : FROM ${BASE_IMAGE}
 ---> 7ca03698e7f1
Step 3/7 : ARG CUDNN_URL
 ---> Using cache
 ---> f11be134f3fe
Step 4/7 : ARG CUDNN_DEB
 ---> Using cache
 ---> 7b20471d0094
Step 5/7 : ARG CUDNN_PACKAGES
 ---> Using cache
 ---> 46fcc13d156b
Step 6/7 : RUN echo "Downloading ${CUDNN_DEB}" &&     mkdir /tmp/cudnn && cd /tmp/cudnn &&     wget --quiet --show-progress --progress=bar:force:noscroll ${CUDNN_URL} &&     dpkg -i *.deb &&     cp /var/cudnn-local-tegra-repo-*/cudnn-local-tegra-*-keyring.gpg /usr/share/keyrings/ &&     apt-get update &&     apt-cache search cudnn &&     apt-get install -y --no-install-recommends ${CUDNN_PACKAGES} &&     rm -rf /var/lib/apt/lists/* &&     apt-get clean &&     dpkg --list | grep cudnn &&     dpkg -P ${CUDNN_DEB} &&     rm -rf /tmp/cudnn
 ---> Using cache
 ---> 8897d95ac092
Step 7/7 : RUN cd /usr/src/cudnn_samples_v*/conv_sample/ &&     make -j$(nproc)
 ---> Using cache
 ---> ef1ca53b40a5
Successfully built ef1ca53b40a5
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4
-- Testing container ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4 (cudnn:9.4/test.sh)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/cuda/cudnn:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4 \
/bin/bash -c '/bin/bash test.sh' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-cudnn_9.4_test.sh.txt; exit ${PIPESTATUS[0]}

#define CUDNN_MAJOR 9
#define CUDNN_MINOR 4
#define CUDNN_VERSION (CUDNN_MAJOR * 10000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#define CUDNN_MAX_SM_MAJOR_NUMBER 9
#define CUDNN_MAX_SM_MINOR_NUMBER 0
#define CUDNN_MAX_DEVICE_VERSION (CUDNN_MAX_SM_MAJOR_NUMBER * 100 + CUDNN_MAX_SM_MINOR_NUMBER * 10)
Executing: conv_sample
Using format CUDNN_TENSOR_NCHW (for INT8x4 and INT8x32 tests use CUDNN_TENSOR_NCHW_VECT_C)
Testing single precision
====USER DIMENSIONS====
input dims are 1, 32, 4, 4
filter dims are 32, 32, 1, 1
output dims are 1, 32, 4, 4
====PADDING DIMENSIONS====
padded input dims are 1, 32, 4, 4
padded filter dims are 32, 32, 1, 1
padded output dims are 1, 32, 4, 4
Testing conv
^^^^ CUDA : elapsed = 0.000690849 sec,
Test PASSED
Testing half precision (math in single precision)
====USER DIMENSIONS====
input dims are 1, 32, 4, 4
filter dims are 32, 32, 1, 1
output dims are 1, 32, 4, 4
====PADDING DIMENSIONS====
padded input dims are 1, 32, 4, 4
padded filter dims are 32, 32, 1, 1
padded output dims are 1, 32, 4, 4
Testing conv
^^^^ CUDA : elapsed = 0.0216496 sec,
Test PASSED
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-python

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-python \
--file /home/jetson/repos/jetson-containers/packages/build/python/Dockerfile \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-cudnn_9.4 \
--build-arg PYTHON_VERSION_ARG="3.11" \
/home/jetson/repos/jetson-containers/packages/build/python \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-python.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  24.06kB
Step 1/6 : ARG BASE_IMAGE
Step 2/6 : FROM ${BASE_IMAGE}
 ---> ef1ca53b40a5
Step 3/6 : ARG PYTHON_VERSION_ARG
 ---> Using cache
 ---> 4b56afd8fdc9
Step 4/6 : ENV PYTHON_VERSION=${PYTHON_VERSION_ARG}     PIP_DISABLE_PIP_VERSION_CHECK=on     PIP_DEFAULT_TIMEOUT=100     PYTHONFAULTHANDLER=1     PYTHONUNBUFFERED=1     PYTHONIOENCODING=utf-8     PYTHONHASHSEED=random     PIP_NO_CACHE_DIR=off     PIP_CACHE_PURGE=true     PIP_ROOT_USER_ACTION=ignore     TWINE_NON_INTERACTIVE=1     DEBIAN_FRONTEND=noninteractive
 ---> Using cache
 ---> 5e2122453663
Step 5/6 : COPY install.sh /tmp/install_python.sh
 ---> Using cache
 ---> f119428db85b
Step 6/6 : RUN /tmp/install_python.sh
 ---> Using cache
 ---> 20f1d1d0b9b1
Successfully built 20f1d1d0b9b1
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-python
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-tensorrt

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-tensorrt \
--file /home/jetson/repos/jetson-containers/packages/tensorrt/Dockerfile.tar \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-python \
--build-arg TENSORRT_URL="https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.4.0/tars/TensorRT-10.4.0.26.l4t.aarch64-gnu.cuda-12.6.tar.gz" \
/home/jetson/repos/jetson-containers/packages/tensorrt \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-tensorrt.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  27.14kB
Step 1/5 : ARG BASE_IMAGE
Step 2/5 : FROM ${BASE_IMAGE}
 ---> 20f1d1d0b9b1
Step 3/5 : ARG TENSORRT_URL
 ---> Using cache
 ---> c1bbc2a043d6
Step 4/5 : RUN set -ex &&     echo "Downloading ${TENSORRT_URL}" &&     mkdir -p /tmp/tensorrt &&     cd /tmp/tensorrt &&     wget --quiet --show-progress --progress=bar:force:noscroll ${TENSORRT_URL} -O TensorRT.tar &&     tar -xvf TensorRT.tar -C /usr/src &&     mv /usr/src/TensorRT-* /usr/src/tensorrt
 ---> Using cache
 ---> 99020ef997a0
Step 5/5 : RUN cd /tmp/tensorrt &&     cp -r /usr/src/tensorrt/lib/* /usr/lib/$(uname -m)-linux-gnu/ &&     cp -r /usr/src/tensorrt/include/* /usr/include/$(uname -m)-linux-gnu/ &&     PY_VERSION=$(python3 -c 'import sys; print(f"{sys.version_info.major}{sys.version_info.minor}")') &&     pip3 install --verbose --no-cache-dir /usr/src/tensorrt/python/tensorrt-*-cp${PY_VERSION}-*.whl &&     rm -rf /tmp/tensorrt
 ---> Using cache
 ---> 25b507649d40
Successfully built 25b507649d40
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-tensorrt
-- Testing container ros2jazzy_jax_base:r36.4.0-cp311-tensorrt (tensorrt:10.4/test.sh)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/tensorrt:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
ros2jazzy_jax_base:r36.4.0-cp311-tensorrt \
/bin/bash -c '/bin/bash test.sh' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-tensorrt_test.sh.txt; exit ${PIPESTATUS[0]}

&&&& RUNNING TensorRT.trtexec [TensorRT v100400] [b26] # /usr/src/tensorrt/bin/trtexec --help
=== Model Options ===
  --onnx=<file>               ONNX model

=== Build Options ===
  --minShapes=spec                   Build with dynamic shapes using a profile with the min shapes provided
  --optShapes=spec                   Build with dynamic shapes using a profile with the opt shapes provided
  --maxShapes=spec                   Build with dynamic shapes using a profile with the max shapes provided
  --minShapesCalib=spec              Calibrate with dynamic shapes using a profile with the min shapes provided
  --optShapesCalib=spec              Calibrate with dynamic shapes using a profile with the opt shapes provided
  --maxShapesCalib=spec              Calibrate with dynamic shapes using a profile with the max shapes provided
                                     Note: All three of min, opt and max shapes must be supplied.
                                           However, if only opt shapes is supplied then it will be expanded so
                                           that min shapes and max shapes are set to the same values as opt shapes.
                                           Input names can be wrapped with escaped single quotes (ex: 'Input:0').
                                     Example input shapes spec: input0:1x3x256x256,input1:1x3x128x128
                                     For scalars (0-D shapes), use input0:scalar or simply input0: with nothing after the colon.
                                     Each input shape is supplied as a key-value pair where key is the input name and
                                     value is the dimensions (including the batch dimension) to be used for that input.
                                     Each key-value pair has the key and value separated using a colon (:).
                                     Multiple input shapes can be provided via comma-separated key-value pairs, and each input name can
                                     contain at most one wildcard ('*') character.
  --inputIOFormats=spec              Type and format of each of the input tensors (default = all inputs in fp32:chw)
                                     See --outputIOFormats help for the grammar of type and format list.
                                     Note: If this option is specified, please set comma-separated types and formats for all
                                           inputs following the same order as network inputs ID (even if only one input
                                           needs specifying IO format) or set the type and format once for broadcasting.
  --outputIOFormats=spec             Type and format of each of the output tensors (default = all outputs in fp32:chw)
                                     Note: If this option is specified, please set comma-separated types and formats for all
                                           outputs following the same order as network outputs ID (even if only one output
                                           needs specifying IO format) or set the type and format once for broadcasting.
                                     IO Formats: spec  ::= IOfmt[","spec]
                                                 IOfmt ::= type:fmt
                                               type  ::= "fp32"|"fp16"|"bf16"|"int32"|"int64"|"int8"|"uint8"|"bool"
                                               fmt   ::= ("chw"|"chw2"|"chw4"|"hwc8"|"chw16"|"chw32"|"dhwc8"|
                                                          "cdhw32"|"hwc"|"dla_linear"|"dla_hwc4")["+"fmt]
  --memPoolSize=poolspec             Specify the size constraints of the designated memory pool(s)
                                     Supports the following base-2 suffixes: B (Bytes), G (Gibibytes), K (Kibibytes), M (Mebibytes).
                                     If none of suffixes is appended, the defualt unit is in MiB.
                                     Note: Also accepts decimal sizes, e.g. 0.25M. Will be rounded down to the nearest integer bytes.
                                     In particular, for dlaSRAM the bytes will be rounded down to the nearest power of 2.
                                   Pool constraint: poolspec ::= poolfmt[","poolspec]
                                                      poolfmt ::= pool:size
                                                    pool ::= "workspace"|"dlaSRAM"|"dlaLocalDRAM"|"dlaGlobalDRAM"|"tacticSharedMem"
  --profilingVerbosity=mode          Specify profiling verbosity. mode ::= layer_names_only|detailed|none (default = layer_names_only).
                                     Please only assign once.
  --avgTiming=M                      Set the number of times averaged in each iteration for kernel selection (default = 8)
  --refit                            Mark the engine as refittable. This will allow the inspection of refittable layers
                                     and weights within the engine.
  --stripWeights                     Strip weights from plan. This flag works with either refit or refit with identical weights. Default
                                     to latter, but you can switch to the former by enabling both --stripWeights and --refit at the same
                                     time.
  --stripAllWeights                  Alias for combining the --refit and --stripWeights options. It marks all weights as refittable,
                                     disregarding any performance impact. Additionally, it strips all refittable weights after the
                                     engine is built.
  --weightless                       [Deprecated] this knob has been deprecated. Please use --stripWeights
  --versionCompatible, --vc          Mark the engine as version compatible. This allows the engine to be used with newer versions
                                     of TensorRT on the same host OS, as well as TensorRT's dispatch and lean runtimes.
  --pluginInstanceNorm, --pi         Set `kNATIVE_INSTANCENORM` to false in the ONNX parser. This will cause the ONNX parser to use
                                     a plugin InstanceNorm implementation over the native implementation when parsing.
  --useRuntime=runtime               TensorRT runtime to execute engine. "lean" and "dispatch" require loading VC engine and do
                                     not support building an engine.
                                           runtime::= "full"|"lean"|"dispatch"
  --leanDLLPath=<file>               External lean runtime DLL to use in version compatiable mode.
  --excludeLeanRuntime               When --versionCompatible is enabled, this flag indicates that the generated engine should
                                     not include an embedded lean runtime. If this is set, the user must explicitly specify a
                                     valid lean runtime to use when loading the engine.
  --sparsity=spec                    Control sparsity (default = disabled).
                                   Sparsity: spec ::= "disable", "enable", "force"
                                     Note: Description about each of these options is as below
                                           disable = do not enable sparse tactics in the builder (this is the default)
                                           enable  = enable sparse tactics in the builder (but these tactics will only be
                                                     considered if the weights have the right sparsity pattern)
                                           force   = enable sparse tactics in the builder and force-overwrite the weights to have
                                                     a sparsity pattern (even if you loaded a model yourself)
                                                     [Deprecated] this knob has been deprecated.
                                                     Please use <polygraphy surgeon prune> to rewrite the weights.
  --noTF32                           Disable tf32 precision (default is to enable tf32, in addition to fp32)
  --fp16                             Enable fp16 precision, in addition to fp32 (default = disabled)
  --bf16                             Enable bf16 precision, in addition to fp32 (default = disabled)
  --int8                             Enable int8 precision, in addition to fp32 (default = disabled)
  --fp8                              Enable fp8 precision, in addition to fp32 (default = disabled)
  --int4                             Enable int4 precision, in addition to fp32 (default = disabled)
  --best                             Enable all precisions to achieve the best performance (default = disabled)
  --stronglyTyped                    Create a strongly typed network. (default = disabled)
  --directIO                         Avoid reformatting at network boundaries. (default = disabled)
  --precisionConstraints=spec        Control precision constraint setting. (default = none)
                                       Precision Constraints: spec ::= "none" | "obey" | "prefer"
                                         none = no constraints
                                         prefer = meet precision constraints set by --layerPrecisions/--layerOutputTypes if possible
                                         obey = meet precision constraints set by --layerPrecisions/--layerOutputTypes or fail
                                                otherwise
  --layerPrecisions=spec             Control per-layer precision constraints. Effective only when precisionConstraints is set to
                                   "obey" or "prefer". (default = none)
                                   The specs are read left-to-right, and later ones override earlier ones. Each layer name can
                                     contain at most one wildcard ('*') character.
                                   Per-layer precision spec ::= layerPrecision[","spec]
                                                       layerPrecision ::= layerName":"precision
                                                       precision ::= "fp32"|"fp16"|"bf16"|"int32"|"int8"
  --layerOutputTypes=spec            Control per-layer output type constraints. Effective only when precisionConstraints is set to
                                   "obey" or "prefer". (default = none
                                   The specs are read left-to-right, and later ones override earlier ones. Each layer name can
                                     contain at most one wildcard ('*') character. If a layer has more than
                                   one output, then multiple types separated by "+" can be provided for this layer.
                                   Per-layer output type spec ::= layerOutputTypes[","spec]
                                                         layerOutputTypes ::= layerName":"type
                                                         type ::= "fp32"|"fp16"|"bf16"|"int32"|"int8"["+"type]
  --layerDeviceTypes=spec            Specify layer-specific device type.
                                     The specs are read left-to-right, and later ones override earlier ones. If a layer does not have
                                     a device type specified, the layer will opt for the default device type.
                                   Per-layer device type spec ::= layerDeviceTypePair[","spec]
                                                         layerDeviceTypePair ::= layerName":"deviceType
                                                           deviceType ::= "GPU"|"DLA"
  --calib=<file>                     Read INT8 calibration cache file
  --safe                             Enable build safety certified engine, if DLA is enable, --buildDLAStandalone will be specified
                                     automatically (default = disabled)
  --buildDLAStandalone               Enable build DLA standalone loadable which can be loaded by cuDLA, when this option is enabled,
                                     --allowGPUFallback is disallowed and --skipInference is enabled by default. Additionally,
                                     specifying --inputIOFormats and --outputIOFormats restricts I/O data type and memory layout
                                     (default = disabled)
  --allowGPUFallback                 When DLA is enabled, allow GPU fallback for unsupported layers (default = disabled)
  --restricted                       Enable safety scope checking with kSAFETY_SCOPE build flag
  --saveEngine=<file>                Save the serialized engine
  --loadEngine=<file>                Load a serialized engine
  --getPlanVersionOnly               Print TensorRT version when loaded plan was created. Works without deserialization of the plan.
                                     Use together with --loadEngine. Supported only for engines created with 8.6 and forward.
  --tacticSources=tactics            Specify the tactics to be used by adding (+) or removing (-) tactics from the default
                                     tactic sources (default = all available tactics).
                                     Note: Currently only cuDNN, cuBLAS, cuBLAS-LT, and edge mask convolutions are listed as optional
                                           tactics.
                                   Tactic Sources: tactics ::= [","tactic]
                                                     tactic  ::= (+|-)lib
                                                   lib     ::= "CUBLAS"|"CUBLAS_LT"|"CUDNN"|"EDGE_MASK_CONVOLUTIONS"
                                                               |"JIT_CONVOLUTIONS"
                                     For example, to disable cudnn and enable cublas: --tacticSources=-CUDNN,+CUBLAS
  --noBuilderCache                   Disable timing cache in builder (default is to enable timing cache)
  --noCompilationCache               Disable Compilation cache in builder, and the cache is part of timing cache (default is to enable compilation cache)
  --errorOnTimingCacheMiss           Emit error when a tactic being timed is not present in the timing cache (default = false)
  --timingCacheFile=<file>           Save/load the serialized global timing cache
  --preview=features                 Specify preview feature to be used by adding (+) or removing (-) preview features from the default
                                   Preview Features: features ::= [","feature]
                                                       feature  ::= (+|-)flag
                                                     flag     ::= "aliasedPluginIO1003"
                                                                  |"profileSharing0806"
  --builderOptimizationLevel         Set the builder optimization level. (default is 3)
                                     Higher level allows TensorRT to spend more building time for more optimization options.
                                     Valid values include integers from 0 to the maximum optimization level, which is currently 5.
  --maxTactics                       Set the maximum number of tactics to time when there is a choice of tactics. (default is -1)
                                     Larger number of tactics allow TensorRT to spend more building time on evaluating tactics.
                                     Default value -1 means TensorRT can decide the number of tactics based on its own heuristic.
  --hardwareCompatibilityLevel=mode  Make the engine file compatible with other GPU architectures. (default = none)
                                   Hardware Compatibility Level: mode ::= "none" | "ampere+"
                                         none = no compatibility
                                         ampere+ = compatible with Ampere and newer GPUs
  --runtimePlatform=platform         Set the target platform for runtime execution. (default = SameAsBuild)
                                     When this option is enabled, --skipInference is enabled by default.
                                   RuntimePlatfrom: platform ::= "SameAsBuild" | "WindowsAMD64"
                                         SameAsBuild = no requirement for cross-platform compatibility.
                                         WindowsAMD64 = set the target platform for engine execution as Windows AMD64 system
  --tempdir=<dir>                    Overrides the default temporary directory TensorRT will use when creating temporary files.
                                     See IRuntime::setTemporaryDirectory API documentation for more information.
  --tempfileControls=controls        Controls what TensorRT is allowed to use when creating temporary executable files.
                                     Should be a comma-separated list with entries in the format (in_memory|temporary):(allow|deny).
                                     in_memory: Controls whether TensorRT is allowed to create temporary in-memory executable files.
                                     temporary: Controls whether TensorRT is allowed to create temporary executable files in the
                                                filesystem (in the directory given by --tempdir).
                                     For example, to allow in-memory files and disallow temporary files:
                                         --tempfileControls=in_memory:allow,temporary:deny
                                     If a flag is unspecified, the default behavior is "allow".
  --maxAuxStreams=N                  Set maximum number of auxiliary streams per inference stream that TRT is allowed to use to run
                                     kernels in parallel if the network contains ops that can run in parallel, with the cost of more
                                     memory usage. Set this to 0 for optimal memory usage. (default = using heuristics)
  --profile                          Build with dynamic shapes using a profile with the min/max/opt shapes provided. Can be specified
                                         multiple times to create multiple profiles with contiguous index.
                                     (ex: --profile=0 --minShapes=<spec> --optShapes=<spec> --maxShapes=<spec> --profile=1 ...)
  --calibProfile                     Select the optimization profile to calibrate by index. (default = 0)
  --allowWeightStreaming             Enable a weight streaming engine. Must be specified with --stronglyTyped. TensorRT will disable
                                     weight streaming at runtime unless --weightStreamingBudget is specified.
  --markDebug                        Specify list of names of tensors to be marked as debug tensors. Separate names with a comma

=== Inference Options ===
  --shapes=spec               Set input shapes for dynamic shapes inference inputs.
                              Note: Input names can be wrapped with escaped single quotes (ex: 'Input:0').
                              Example input shapes spec: input0:1x3x256x256, input1:1x3x128x128
                              For scalars (0-D shapes), use input0:scalar or simply input0: with nothing after the colon.
                              Each input shape is supplied as a key-value pair where key is the input name and
                              value is the dimensions (including the batch dimension) to be used for that input.
                              Each key-value pair has the key and value separated using a colon (:).
                              Multiple input shapes can be provided via comma-separated key-value pairs, and each input
                              name can contain at most one wildcard ('*') character.
  --loadInputs=spec           Load input values from files (default = generate random inputs). Input names can be wrapped with single quotes (ex: 'Input:0')
                            Input values spec ::= Ival[","spec]
                                         Ival ::= name":"file
                              Consult the README for more information on generating files for custom inputs.
  --iterations=N              Run at least N inference iterations (default = 10)
  --warmUp=N                  Run for N milliseconds to warmup before measuring performance (default = 200)
  --duration=N                Run performance measurements for at least N seconds wallclock time (default = 3)
                              If -1 is specified, inference will keep running unless stopped manually
  --sleepTime=N               Delay inference start with a gap of N milliseconds between launch and compute (default = 0)
  --idleTime=N                Sleep N milliseconds between two continuous iterations(default = 0)
  --infStreams=N              Instantiate N execution contexts to run inference concurrently (default = 1)
  --exposeDMA                 Serialize DMA transfers to and from device (default = disabled).
  --noDataTransfers           Disable DMA transfers to and from device (default = enabled).
  --useManagedMemory          Use managed memory instead of separate host and device allocations (default = disabled).
  --useSpinWait               Actively synchronize on GPU events. This option may decrease synchronization time but increase CPU usage and power (default = disabled)
  --threads                   Enable multithreading to drive engines with independent threads or speed up refitting (default = disabled)
  --useCudaGraph              Use CUDA graph to capture engine execution and then launch inference (default = disabled).
                              This flag may be ignored if the graph capture fails.
  --timeDeserialize           Time the amount of time it takes to deserialize the network and exit.
  --timeRefit                 Time the amount of time it takes to refit the engine before inference.
  --separateProfileRun        Do not attach the profiler in the benchmark run; if profiling is enabled, a second profile run will be executed (default = disabled)
  --skipInference             Exit after the engine has been built and skip inference perf measurement (default = disabled)
  --persistentCacheRatio      Set the persistentCacheLimit in ratio, 0.5 represent half of max persistent L2 size (default = 0)
  --useProfile                Set the optimization profile for the inference context (default = 0 ).
  --allocationStrategy=spec   Specify how the internal device memory for inference is allocated.
                            Strategy: spec ::= "static", "profile", "runtime"
                                  static = Allocate device memory based on max size across all profiles.
                                  profile = Allocate device memory based on max size of the current profile.
                                  runtime = Allocate device memory based on the actual input shapes.
  --saveDebugTensors          Specify list of names of tensors to turn on the debug state
                              and filename to save raw outputs to.
                              These tensors must be specified as debug tensors during build time.
                            Input values spec ::= Ival[","spec]
                                         Ival ::= name":"file
  --weightStreamingBudget     Set the maximum amount of GPU memory TensorRT is allowed to use for weights.
                              It can take on the following values:
                                -2: (default) Disable weight streaming at runtime.
                                -1: TensorRT will automatically decide the budget.
                                 0-100%: Percentage of streamable weights that reside on the GPU.
                                         0% saves the most memory but will have the worst performance.
                                         Requires the % character.
                                >=0B: The exact amount of streamable weights that reside on the GPU. Supports the
                                     following base-2 suffixes: B (Bytes), G (Gibibytes), K (Kibibytes), M (Mebibytes).

=== Reporting Options ===
  --verbose                   Use verbose logging (default = false)
  --avgRuns=N                 Report performance measurements averaged over N consecutive iterations (default = 10)
  --percentile=P1,P2,P3,...   Report performance for the P1,P2,P3,... percentages (0<=P_i<=100, 0 representing max perf, and 100 representing min perf; (default = 90,95,99%)
  --dumpRefit                 Print the refittable layers and weights from a refittable engine
  --dumpOutput                Print the output tensor(s) of the last inference iteration (default = disabled)
  --dumpRawBindingsToFile     Print the input/output tensor(s) of the last inference iteration to file(default = disabled)
  --dumpProfile               Print profile information per layer (default = disabled)
  --dumpLayerInfo             Print layer information of the engine to console (default = disabled)
  --dumpOptimizationProfile   Print the optimization profile(s) information (default = disabled)
  --exportTimes=<file>        Write the timing results in a json file (default = disabled)
  --exportOutput=<file>       Write the output tensors to a json file (default = disabled)
  --exportProfile=<file>      Write the profile information per layer in a json file (default = disabled)
  --exportLayerInfo=<file>    Write the layer information of the engine in a json file (default = disabled)

=== System Options ===
  --device=N                  Select cuda device N (default = 0)
  --useDLACore=N              Select DLA core N for layers that support DLA (default = none)
  --staticPlugins             Plugin library (.so) to load statically (can be specified multiple times)
  --dynamicPlugins            Plugin library (.so) to load dynamically and may be serialized with the engine if they are included in --setPluginsToSerialize (can be specified multiple times)
  --setPluginsToSerialize     Plugin library (.so) to be serialized with the engine (can be specified multiple times)
  --ignoreParsedPluginLibs    By default, when building a version-compatible engine, plugin libraries specified by the ONNX parser
                              are implicitly serialized with the engine (unless --excludeLeanRuntime is specified) and loaded dynamically.
                              Enable this flag to ignore these plugin libraries instead.

=== Help ===
  --help, -h                  Print this message
TensorRT version: 10.4.0
-- Building container ros2jazzy_jax_base:r36.4.0-cp311-numpy

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-numpy \
--file /home/jetson/repos/jetson-containers/packages/numeric/numpy/Dockerfile \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-tensorrt \
/home/jetson/repos/jetson-containers/packages/numeric/numpy \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-numpy.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  15.87kB
Step 1/4 : ARG BASE_IMAGE
Step 2/4 : FROM ${BASE_IMAGE}
 ---> 25b507649d40
Step 3/4 : ENV OPENBLAS_CORETYPE=ARMV8
 ---> Using cache
 ---> c846d3421100
Step 4/4 : RUN pip3 install --upgrade --force-reinstall --no-cache-dir --verbose 'numpy<2' &&     pip3 show numpy && python3 -c 'import numpy; print(numpy.__version__)'
 ---> Using cache
 ---> 51b3f8e817ee
Successfully built 51b3f8e817ee
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-numpy
-- Testing container ros2jazzy_jax_base:r36.4.0-cp311-numpy (numpy/test.py)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/numeric/numpy:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
ros2jazzy_jax_base:r36.4.0-cp311-numpy \
/bin/bash -c 'python3 test.py' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-numpy_test.py.txt; exit ${PIPESTATUS[0]}

testing numpy...
numpy version: 1.26.4
/usr/local/lib/python3.11/dist-packages/numpy/__config__.py:155: UserWarning: Install `pyyaml` for better output
  warnings.warn("Install `pyyaml` for better output", stacklevel=1)
{
  "Compilers": {
    "c": {
      "name": "gcc",
      "linker": "ld.bfd",
      "version": "10.2.1",
      "commands": "cc",
      "args": "-fno-strict-aliasing",
      "linker args": "-Wl,--strip-debug, -fno-strict-aliasing"
    },
    "cython": {
      "name": "cython",
      "linker": "cython",
      "version": "3.0.8",
      "commands": "cython"
    },
    "c++": {
      "name": "gcc",
      "linker": "ld.bfd",
      "version": "10.2.1",
      "commands": "c++",
      "linker args": "-Wl,--strip-debug"
    }
  },
  "Machine Information": {
    "host": {
      "cpu": "aarch64",
      "family": "aarch64",
      "endian": "little",
      "system": "linux"
    },
    "build": {
      "cpu": "aarch64",
      "family": "aarch64",
      "endian": "little",
      "system": "linux"
    }
  },
  "Build Dependencies": {
    "blas": {
      "name": "openblas64",
      "found": true,
      "version": "0.3.23.dev",
      "detection method": "pkgconfig",
      "include directory": "/usr/local/include",
      "lib directory": "/usr/local/lib",
      "openblas configuration": "USE_64BITINT=1 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS= NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= NEOVERSEN1 MAX_THREADS=80",
      "pc file directory": "/usr/local/lib/pkgconfig"
    },
    "lapack": {
      "name": "dep281472816241488",
      "found": true,
      "version": "1.26.4",
      "detection method": "internal",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    }
  },
  "Python Information": {
    "path": "/opt/python/cp311-cp311/bin/python",
    "version": "3.11"
  },
  "SIMD Extensions": {
    "baseline": [
      "NEON",
      "NEON_FP16",
      "NEON_VFPV4",
      "ASIMD"
    ],
    "found": [
      "ASIMDHP"
    ],
    "not found": [
      "ASIMDFHM"
    ]
  }
}
None
numpy OK

-- Building container ros2jazzy_jax_base:r36.4.0-cp311-opencv

DOCKER_BUILDKIT=0 docker build --network=host --tag ros2jazzy_jax_base:r36.4.0-cp311-opencv \
--file /home/jetson/repos/jetson-containers/packages/opencv/Dockerfile \
--build-arg BASE_IMAGE=ros2jazzy_jax_base:r36.4.0-cp311-numpy \
--build-arg OPENCV_VERSION="4.10.0" \
--build-arg OPENCV_PYTHON="4.x" \
--build-arg CUDA_ARCH_BIN="8.7" \
/home/jetson/repos/jetson-containers/packages/opencv \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/build/ros2jazzy_jax_base_r36.4.0-cp311-opencv.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon   38.4kB
Step 1/6 : ARG BASE_IMAGE
Step 2/6 : FROM ${BASE_IMAGE}
 ---> 51b3f8e817ee
Step 3/6 : ARG OPENCV_VERSION     OPENCV_PYTHON     OPENCV_URL     CUDA_ARCH_BIN     FORCE_BUILD=off
 ---> Using cache
 ---> 8648d0ff99e2
Step 4/6 : ENV OPENCV_VERSION=${OPENCV_VERSION}     OPENCV_URL=${OPENCV_URL}
 ---> Using cache
 ---> 639eab04054d
Step 5/6 : COPY build.sh      install.sh      install_deps.sh      install_deb.sh      patches.diff      /tmp/opencv/
 ---> Using cache
 ---> 608369c493be
Step 6/6 : RUN cd /tmp/opencv && ./install.sh || ./build.sh || echo "BUILD FAILED (OpenCV ${OPENCV_VERSION})"
 ---> Using cache
 ---> 4dcbf4786d7b
Successfully built 4dcbf4786d7b
Successfully tagged ros2jazzy_jax_base:r36.4.0-cp311-opencv
-- Testing container ros2jazzy_jax_base:r36.4.0-cp311-opencv (opencv:4.10.0/test.py)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/opencv:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
ros2jazzy_jax_base:r36.4.0-cp311-opencv \
/bin/bash -c 'python3 test.py' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-opencv_test.py.txt; exit ${PIPESTATUS[0]}

testing OpenCV...
Traceback (most recent call last):
  File "/test/test.py", line 4, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/jetson/repos/jetson-containers/jetson_containers/build.py", line 112, in <module>
    build_container(args.name, args.packages, args.base, args.build_flags, args.build_args, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api, args.skip_packages)
  File "/home/jetson/repos/jetson-containers/jetson_containers/container.py", line 154, in build_container
    test_container(container_name, pkg, simulate)
  File "/home/jetson/repos/jetson-containers/jetson_containers/container.py", line 327, in test_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run -t --rm --runtime=nvidia --network=host --volume /home/jetson/repos/jetson-containers/packages/opencv:/test --volume /home/jetson/repos/jetson-containers/data:/data --workdir /test ros2jazzy_jax_base:r36.4.0-cp311-opencv /bin/bash -c 'python3 test.py' 2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_012330/test/ros2jazzy_jax_base_r36.4.0-cp311-opencv_test.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.
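
The traceback above shows `python3` inside the container failing on `import cv2` even though the image build step reported success. As a quick check, a small script like the one below could be run inside the container to see which interpreter `python3` resolves to and whether that interpreter can locate `cv2`. This is only a diagnostic sketch: `module_report` is a hypothetical helper name, not part of jetson-containers, and it uses nothing beyond the standard library.

```python
import importlib.util
import sys


def module_report(name: str) -> dict:
    """Report whether `name` is importable by the interpreter running this script."""
    # find_spec returns None when a top-level module cannot be located on sys.path.
    spec = importlib.util.find_spec(name)
    return {
        "module": name,
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "executable": sys.executable,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any.
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    # numpy passed its test in the log above, so it serves as a known-good reference.
    for mod in ("numpy", "cv2"):
        print(module_report(mod))
```

If `numpy` is reported as found and `cv2` is not, that would suggest the OpenCV Python bindings were never installed for this interpreter, rather than a problem with the interpreter's search path.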

Jtop log

--------------------- PLATFORM -------------------------
Machine: aarch64
System: Linux
Distribution: Ubuntu 22.04 Jammy Jellyfish
Release: 5.15.148-tegra
Python: 3.10.12
-------------------- JETSON RAW OUTPUT -----------------
------------------
Path: /etc/nv_tegra_release
# R36 (release), REVISION: 4.0, GCID: 37537400, BOARD: generic, EABI: aarch64, DATE: Fri Sep 13 04:36:44 UTC 2024
------------------
Path: /sys/firmware/devicetree/base/model
NVIDIA Jetson Orin Nano Developer Kit
------------------
Path: /proc/device-tree/nvidia,boardids
No such file or directory
------------------
Path: /proc/device-tree/compatible
nvidia,p3768-0000+p3767-0005nvidia,p3767-0005nvidia,tegra234
------------------
Path: /proc/device-tree/nvidia,dtsfilename
No such file or directory
------------------

GPU info

jetson@ubuntu ~/r/jetson-containers (master)> nvidia-smi
Mon Oct 28 02:11:29 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0                Driver Version: 540.4.0      CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Storage info

jetson@ubuntu ~/r/jetson-containers (master)> df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/nvme0n1p1   915G   35G  834G   5% /
tmpfs            3.8G  136K  3.8G   1% /dev/shm
tmpfs            1.5G   35M  1.5G   3% /run
tmpfs            5.0M  4.0K  5.0M   1% /run/lock
/dev/nvme0n1p10   63M  118K   63M   1% /boot/efi
tmpfs            762M   92K  762M   1% /run/user/128
tmpfs            762M   76K  762M   1% /run/user/1000


0Unkn0wn · 2024-10-28T12:44:01Z

I also tried building opencv:4.8.1 on its own to see whether it would build, and it failed as well.

Command:
PYTHON_VERSION=3.11 jetson-containers build --name=opencv-p311 opencv:4.8.1

Build log (trimmed a bit so it would fit)

...

  -- General configuration for OpenCV 4.8.1 =====================================
  --   Version control:               4.8.1-dirty
  --
  --   Extra modules:
  --     Location (extra):            /opt/opencv-python/opencv_contrib/modules
  --     Version control (extra):     4.8.1
  --
  --   Platform:
  --     Timestamp:                   2024-10-28T12:21:43Z
  --     Host:                        Linux 5.15.148-tegra aarch64
  --     CMake:                       3.30.5
  --     CMake generator:             Unix Makefiles
  --     CMake build tool:            /usr/bin/gmake
  --     Configuration:               RELEASE
  --
  --   CPU/HW features:
  --     Baseline:                    NEON FP16
  --       required:                  NEON
  --
  --   C/C++:
  --     Built as dynamic libs?:      NO
  --     C++ standard:                11
  --     C++ Compiler:                /usr/bin/c++  (ver 11.4.0)
  --     C++ flags (Release):         -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
  --     C++ flags (Debug):           -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
  --     C Compiler:                  /usr/bin/cc
  --     C flags (Release):           -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
  --     C flags (Debug):             -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
  --     Linker flags (Release):      -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined
  --     Linker flags (Debug):        -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined
  --     ccache:                      NO
  --     Precompiled headers:         NO
  --     Extra dependencies:          /usr/lib/aarch64-linux-gnu/liblapack.so /usr/lib/aarch64-linux-gnu/libcblas.so /usr/lib/aarch64-linux-gnu/libatlas.so /usr/lib/aarch64-linux-gnu/libjpeg.so /usr/lib/aarch64-linux-gnu/libpng.so /usr/lib/aarch64-linux-gnu/libz.so Iconv::Iconv m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda/lib64 -L/usr/lib/aarch64-linux-gnu
  --     3rdparty dependencies:       libprotobuf ade ittnotify libwebp libtiff libopenjp2 IlmImf quirc tegra_hal
  --
  --   OpenCV modules:
  --     To be built:                 alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
  --     Disabled:                    freetype world
  --     Disabled by dependency:      -
  --     Unavailable:                 cvv hdf java julia matlab ovis python2 sfm ts viz
  --     Applications:                -
  --     Documentation:               NO
  --     Non-free algorithms:         YES
  --
  --   GUI:                           GTK3
  --     GTK+:                        YES (ver 3.24.33)
  --       GThread :                  YES (ver 2.72.4)
  --       GtkGlExt:                  NO
  --     VTK support:                 NO
  --
  --   Media I/O:
  --     ZLib:                        /usr/lib/aarch64-linux-gnu/libz.so (ver 1.2.11)
  --     JPEG:                        /usr/lib/aarch64-linux-gnu/libjpeg.so (ver 80)
  --     WEBP:                        build (ver encoder: 0x020f)
  --     PNG:                         /usr/lib/aarch64-linux-gnu/libpng.so (ver 1.6.37)
  --     TIFF:                        build (ver 42 - 4.2.0)
  --     JPEG 2000:                   build (ver 2.5.0)
  --     OpenEXR:                     build (ver 2.3.0)
  --     HDR:                         YES
  --     SUNRASTER:                   YES
  --     PXM:                         YES
  --     PFM:                         YES
  --
  --   Video I/O:
  --     DC1394:                      NO
  --     FFMPEG:                      YES
  --       avcodec:                   YES (58.134.100)
  --       avformat:                  YES (58.76.100)
  --       avutil:                    YES (56.70.100)
  --       swscale:                   YES (5.9.100)
  --       avresample:                NO
  --     GStreamer:                   YES (1.20.3)
  --     v4l/v4l2:                    YES (linux/videodev2.h)
  --
  --   Parallel framework:            TBB (ver 2021.5 interface 12050)
  --
  --   Trace:                         YES (with Intel ITT)
  --
  --   Other third-party libraries:
  --     Lapack:                      YES (/usr/lib/aarch64-linux-gnu/liblapack.so /usr/lib/aarch64-linux-gnu/libcblas.so /usr/lib/aarch64-linux-gnu/libatlas.so)
  --     Eigen:                       YES (ver 3.4.0)
  --     Custom HAL:                  YES (carotene (ver 0.0.1))
  --     Protobuf:                    build (3.19.1)
  --     Flatbuffers:                 builtin/3rdparty (23.5.9)
  --
  --   NVIDIA CUDA:                   YES (ver 12.6, CUFFT CUBLAS FAST_MATH)
  --     NVIDIA GPU arch:             87
  --     NVIDIA PTX archs:
  --
  --   cuDNN:                         YES (ver 9.4.0)
  --
  --   Python 3:
  --     Interpreter:                 /usr/bin/python3.11 (ver 3.11)
  --     Libraries:                   /usr/lib/aarch64-linux-gnu/libpython3.11.so (ver 3.11.0rc1)
  --     numpy:                       /tmp/pip-build-env-8g0rej1b/overlay/local/lib/python3.11/dist-packages/numpy/_core/include (ver 2.1.2)
  --     install path:                python/cv2/python-3
  --
  --   Python (for build):            /usr/bin/python3.11
  --
  --   Java:
  --     ant:                         NO
  --     Java:                        NO
  --     JNI:                         NO
  --     Java wrappers:               NO
  --     Java tests:                  NO
  --
  --   Install to:                    /opt/opencv-python/_skbuild/linux-aarch64-3.11/cmake-install
  -- -----------------------------------------------------------------
  --
  -- Configuring done (43.2s)
  -- Generating done (1.9s)
  -- Build files have been written to: /opt/opencv-python/_skbuild/linux-aarch64-3.11/cmake-build
  [  0%] Built target opencv_highgui_plugins
  [  0%] Generate opencv4.pc
  [  0%] Built target opencv_dnn_plugins
  [  0%] Building C object 3rdparty/openjpeg/openjp2/CMakeFiles/libopenjp2.dir/thread.c.o
  [  0%] Built target opencv_videoio_plugins
  [  0%] Building CXX object 3rdparty/carotene/hal/carotene/CMakeFiles/carotene_objs.dir/src/absdiff.cpp.o
  CMake Deprecation Warning at /opt/opencv-python/opencv/cmake/OpenCVGenPkgconfig.cmake:113 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.


 
  [  4%] Building C object 3rdparty/libwebp/CMakeFiles/libwebp.dir/sharpyuv/sharpyuv_neon.c.o
  In file included from /usr/include/string.h:535,
                   from /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/stubs/port.h:39,
                   from /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/stubs/common.h:48,
                   from /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/message_lite.h:45,
                   from /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/message_lite.cc:36:
  In function ‘void* memcpy(void*, const void*, size_t)’,
      inlined from ‘uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/io/coded_stream.h:706:16,
      inlined from ‘virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
      inlined from ‘bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/message_lite.cc:412:30,
      inlined from ‘bool google::protobuf::MessageLite::SerializeToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/message_lite.cc:396:42:
  /usr/include/aarch64-linux-gnu/bits/string_fortified.h:29:33: warning: ‘void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)’ specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
     29 |   return __builtin___memcpy_chk (__dest, __src, __len,
        |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
     30 |                                  __glibc_objsize0 (__dest));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
  In function ‘void* memcpy(void*, const void*, size_t)’,
      inlined from ‘uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/io/coded_stream.h:706:16,
      inlined from ‘virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
      inlined from ‘bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const’ at /opt/opencv-python/opencv/3rdparty/protobuf/src/google/protobuf/message_lite.cc:412:30:
  /usr/include/aarch64-linux-gnu/bits/string_fortified.h:29:33: warning: ‘void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)’ specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
     29 |   return __builtin___memcpy_chk (__dest, __src, __len,
        |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
     30 |                                  __glibc_objsize0 (__dest));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
  [  4%] Building C object 3rdparty/libtiff/CMakeFiles/libtiff.dir/tif_lzma.c.o
  [  4%] Building C object 3rdparty/libtiff/CMakeFiles/libtiff.dir/tif_lzw.c.o
  [  4%] Building CXX object 3rdparty/protobuf/CMakeFiles/libprotobuf.dir/src/google/protobuf/parse_context.cc.o
  [  4%] Building C object 3rdparty/libwebp/CMakeFiles/libwebp.dir/sharpyuv/sharpyuv_sse2.c.o
  
  [ 17%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfFloatVectorAttribute.cpp.o
  [ 17%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfFrameBuffer.cpp.o
  Note: Class cv::Feature2D has more than 1 base class (not supported by Python C extensions)
        Bases:  cv::Algorithm, cv::class, cv::Feature2D, cv::Algorithm
        Only the first base class will be used
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfFramesPerSecond.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfGenericInputFile.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfGenericOutputFile.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfHeader.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfHuf.cpp.o
  Note: Class cv::detail::GraphCutSeamFinder has more than 1 base class (not supported by Python C extensions)
        Bases:  cv::detail::GraphCutSeamFinderBase, cv::detail::SeamFinder
        Only the first base class will be used
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfIO.cpp.o
  [ 18%] Building CXX object 3rdparty/protobuf/CMakeFiles/libprotobuf.dir/src/google/protobuf/descriptor.pb.cc.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfInputFile.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfInputPart.cpp.o
  [ 18%] Building CXX object 3rdparty/openexr/CMakeFiles/IlmImf.dir/IlmImf/ImfInputPartData.cpp.o
  /opt/opencv-python/opencv/modules/python/src2/typing_stubs_generator.py:52: UserWarning: Typing stubs generation has failed.
  Traceback (most recent call last):
    File "/opt/opencv-python/opencv/modules/python/src2/typing_stubs_generator.py", line 49, in wrapped_func
      ret_type = func(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^
    File "/opt/opencv-python/opencv/modules/python/src2/typing_stubs_generator.py", line 148, in _generate
      generate_typing_stubs(self.cv_root, output_path)
    File "/opt/opencv-python/opencv/modules/python/src2/typing_stubs_generation/generation.py", line 86, in generate_typing_stubs
      root.resolve_type_nodes()
    File "/opt/opencv-python/opencv/modules/python/src2/typing_stubs_generation/nodes/namespace_node.py", line 106, in resolve_type_nodes
      raise TypeResolutionError(
  typing_stubs_generation.nodes.type_node.TypeResolutionError: Failed to resolve "cv2" namespace against "None". Errors: ['Failed to resolve "cv2.cuda" namespace against "cv2". Errors: [\'Failed to resolve "cv2.cuda.NvidiaOpticalFlow_1_0" class against "cv2". Errors: [\\\'Failed to resolve "cv2.cuda.NvidiaOpticalFlow_1_0.create" function against "cv2". Errors: [0]: Failed to resolve "perfPreset" argument: Failed to resolve "cuda_NvidiaOpticalFlow_1_0_NVIDIA_OF_PERF_LEVEL" exposed as "cuda_NvidiaOpticalFlow_1_0_NVIDIA_OF_PERF_LEVEL"\\\']\', \'Failed to resolve "cv2.cuda.NvidiaOpticalFlow_2_0" class against "cv2". Errors: [\\\'Failed to resolve "cv2.cuda.NvidiaOpticalFlow_2_0.create" function against "cv2". Errors: [0]: Failed to resolve "perfPreset" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_PERF_LEVEL" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_PERF_LEVEL", [1]: Failed to resolve "outputGridSize" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_OUTPUT_VECTOR_GRID_SIZE" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_OUTPUT_VECTOR_GRID_SIZE", [2]: Failed to resolve "hintGridSize" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_HINT_VECTOR_GRID_SIZE" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_HINT_VECTOR_GRID_SIZE", [3]: Failed to resolve "perfPreset" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_PERF_LEVEL" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_PERF_LEVEL", [4]: Failed to resolve "outputGridSize" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_OUTPUT_VECTOR_GRID_SIZE" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_OUTPUT_VECTOR_GRID_SIZE", [5]: Failed to resolve "hintGridSize" argument: Failed to resolve "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_HINT_VECTOR_GRID_SIZE" exposed as "cuda_NvidiaOpticalFlow_2_0_NVIDIA_OF_HINT_VECTOR_GRID_SIZE"\\\']\']', 'Failed to resolve "cv2.cudacodec" namespace against "cv2". Errors: [\'Failed to resolve "cv2.cudacodec.createVideoWriter" function against "cv2". Errors: [0]: Failed to resolve "stream" argument: Failed to resolve "Stream" exposed as "Stream", [1]: Failed to resolve "stream" argument: Failed to resolve "Stream" exposed as "Stream"\', \'Failed to resolve "cv2.cudacodec.EncodeQp" class against "cv2". Errors: [\\\'Failed to resolve "qpInterP" property\\\', \\\'Failed to resolve "qpInterB" property\\\', \\\'Failed to resolve "qpIntra" property\\\']\', \'Failed to resolve "cv2.cudacodec.EncoderParams" class against "cv2". Errors: [\\\'Failed to resolve "targetQuality" property\\\']\', \'Failed to resolve "cv2.cudacodec.VideoReader" class against "cv2". Errors: [\\\'Failed to resolve "cv2.cudacodec.VideoReader.nextFrame" function against "cv2". Errors: [0]: Failed to resolve "frame" argument: Failed to resolve one of "GpuMat | None" items. Errors: [\\\\\\\'Failed to resolve "GpuMat" exposed as "GpuMat"\\\\\\\'], [1]: Failed to resolve "stream" argument: Failed to resolve "Stream" exposed as "Stream", [2]: Failed to resolve return type: Failed to resolve one of "tuple[bool, GpuMat]" items. Errors: [\\\\\\\'Failed to resolve "GpuMat" exposed as "GpuMat"\\\\\\\']\\\', \\\'Failed to resolve "cv2.cudacodec.VideoReader.grab" function against "cv2". Errors: [0]: Failed to resolve "stream" argument: Failed to resolve "Stream" exposed as "Stream"\\\', \\\'Failed to resolve "cv2.cudacodec.VideoReader.retrieve" function against "cv2". Errors: [0]: Failed to resolve "frame" argument: Failed to resolve one of "GpuMat | None" items. Errors: [\\\\\\\'Failed to resolve "GpuMat" exposed as "GpuMat"\\\\\\\'], [1]: Failed to resolve return type: Failed to resolve one of "tuple[bool, GpuMat]" items. Errors: [\\\\\\\'Failed to resolve "GpuMat" exposed as "GpuMat"\\\\\\\']\\\']\']']

    warnings.warn(
  [ 18%] Built target gen_opencv_python_source

  [ 36%] Linking CXX static library ../../lib/libopencv_imgproc.a
  [ 36%] Built target opencv_imgproc
  [ 36%] Building NVCC (Device) object modules/cudaarithm/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_mul_scalar.cu.o
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=int, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, int>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, int>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=uchar, R=int]" at line 92 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=schar, work_type=int, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, schar, int>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, schar, int>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=schar, R=int]" at line 93 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=ushort, work_type=int, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, ushort, int>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, ushort, int>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=ushort, R=int]" at line 94 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=short, work_type=int, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, short, int>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, short, int>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<short>, ResType=int]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<short>, ResType=int]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=short, R=int]" at line 95 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=int, work_type=int, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, int, int>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, int, int>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<int>, ResType=int]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<int>, ResType=int]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=int, R=int]" at line 96 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile float *, volatile float *>, thrust::THRUST_200500_870_NS::tuple<float &, float &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<float>, cv::cudev::maximum<float>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=float, work_type=float, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, float, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, float, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=float, R=float]" at line 97 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile double *, volatile double *>, thrust::THRUST_200500_870_NS::tuple<double &, double &>, int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::minimum<double>, cv::cudev::maximum<double>>)
                blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp));
                ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(72): note #3327-D: candidate function template "cv::cudev::blockReduce<N,P0,P1,P2,P3,P4,P5,P6,P7,P8,P9,R0,R1,R2,R3,R4,R5,R6,R7,R8,R9,Op0,Op1,Op2,Op3,Op4,Op5,Op6,Op7,Op8,Op9>(const thrust::THRUST_200500_870_NS::tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9> &, const thrust::THRUST_200500_870_NS::tuple<R0, R1, R2, R3, R4, R5, R6, R7, R8, R9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Op0, Op1, Op2, Op3, Op4, Op5, Op6, Op7, Op8, Op9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(const tuple<P0, P1, P2, P3, P4, P5, P6, P7, P8, P9>& smem,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(63): note #3327-D: candidate function template "cv::cudev::blockReduce<N,T,Op>(volatile T *, T &, uint, const Op &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduce(volatile T* smem, T& val, uint tid, const Op& op)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=double, work_type=double, BLOCK_SIZE=256]" at line 412
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, double, double>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 421
              instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, double, double>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 460
              instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 206 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 349 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 68 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu
              instantiation of "void <unnamed>::minMaxImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=double, R=double]" at line 98 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu

  7 errors detected in the compilation of "/opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmax.cu".
  CMake Error at cuda_compile_1_generated_minmax.cu.o.RELEASE.cmake:282 (message):
    Error generating file
    /opt/opencv-python/_skbuild/linux-aarch64-3.11/cmake-build/modules/cudaarithm/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_minmax.cu.o


  gmake[2]: *** [modules/cudaarithm/CMakeFiles/opencv_cudaarithm.dir/build.make:196: modules/cudaarithm/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_minmax.cu.o] Error 1
  gmake[2]: *** Waiting for unfinished jobs....
  [ 36%] Building CXX object modules/intensity_transform/CMakeFiles/opencv_intensity_transform.dir/src/bimef.cpp.o
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=uchar, R=int]" at line 80 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(131): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const unsigned int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_2<BLOCK_SIZE,T>(T *, T *, int *, int *, int) [with BLOCK_SIZE=256, T=int]" at line 167
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=uchar, R=int]" at line 80 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=uchar, R=int]" at line 80 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=schar, R=int]" at line 81 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<schar>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<schar>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=schar, R=int]" at line 81 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=ushort, R=int]" at line 82 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<ushort>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<ushort>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=ushort, R=int]" at line 82 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<short>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<short>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=short, R=int]" at line 83 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<short>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<short>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<short>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=short, R=int]" at line 83 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<int>, ResType=int]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<int>, ResType=int]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=int, R=int]" at line 84 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<int>, cv::cudev::greater<int>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<int>, ResType=int, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<int>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<int>, ResType=int, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=int, R=int]" at line 84 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile float *, volatile float *>, thrust::THRUST_200500_870_NS::tuple<float &, float &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<float>, cv::cudev::greater<float>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=float, R=float]" at line 85 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(131): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile float *, volatile float *>, thrust::THRUST_200500_870_NS::tuple<float &, float &>, thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const unsigned int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<float>, cv::cudev::greater<float>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_2<BLOCK_SIZE,T>(T *, T *, int *, int *, int) [with BLOCK_SIZE=256, T=float]" at line 167
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<float>, ResType=float]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=float, R=float]" at line 85 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile float *, volatile float *>, thrust::THRUST_200500_870_NS::tuple<float &, float &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<float>, cv::cudev::greater<float>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<float>, ResType=float, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<float>, ResType=float, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<float>, ResType=float, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=float, R=float]" at line 85 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile double *, volatile double *>, thrust::THRUST_200500_870_NS::tuple<double &, double &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<double>, cv::cudev::greater<double>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=double, R=double]" at line 86 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(131): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile double *, volatile double *>, thrust::THRUST_200500_870_NS::tuple<double &, double &>, thrust::THRUST_200500_870_NS::tuple<volatile int *, volatile int *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const unsigned int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<double>, cv::cudev::greater<double>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_2<BLOCK_SIZE,T>(T *, T *, int *, int *, int) [with BLOCK_SIZE=256, T=double]" at line 167
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::WithOutMask]" at line 246 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 361 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<double>, ResType=double]" at line 69 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=double, R=double]" at line 86 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/detail/minmaxloc.hpp(100): error: no instance of overloaded function "cv::cudev::blockReduceKeyVal" matches the argument list
              argument types are: (thrust::THRUST_200500_870_NS::tuple<volatile double *, volatile double *>, thrust::THRUST_200500_870_NS::tuple<double &, double &>, thrust::THRUST_200500_870_NS::tuple<volatile uint *, volatile uint *>, thrust::THRUST_200500_870_NS::tuple<int &, int &>, const int, thrust::THRUST_200500_870_NS::tuple<cv::cudev::less<double>, cv::cudev::greater<double>>)
            blockReduceKeyVal<BLOCK_SIZE>(smem_tuple(sMinVal, sMaxVal), tie(myMin, myMax),
            ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(113): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,KP0,KP1,KP2,KP3,KP4,KP5,KP6,KP7,KP8,KP9,KR0,KR1,KR2,KR3,KR4,KR5,KR6,KR7,KR8,KR9,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp0,Cmp1,Cmp2,Cmp3,Cmp4,Cmp5,Cmp6,Cmp7,Cmp8,Cmp9>(const thrust::THRUST_200500_870_NS::tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9> &, const thrust::THRUST_200500_870_NS::tuple<KR0, KR1, KR2, KR3, KR4, KR5, KR6, KR7, KR8, KR9> &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const thrust::THRUST_200500_870_NS::tuple<Cmp0, Cmp1, Cmp2, Cmp3, Cmp4, Cmp5, Cmp6, Cmp7, Cmp8, Cmp9> &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(const tuple<KP0, KP1, KP2, KP3, KP4, KP5, KP6, KP7, KP8, KP9>& skeys,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(96): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,VP0,VP1,VP2,VP3,VP4,VP5,VP6,VP7,VP8,VP9,VR0,VR1,VR2,VR3,VR4,VR5,VR6,VR7,VR8,VR9,Cmp>(volatile K *, K &, const thrust::THRUST_200500_870_NS::tuple<VP0, VP1, VP2, VP3, VP4, VP5, VP6, VP7, VP8, VP9> &, const thrust::THRUST_200500_870_NS::tuple<VR0, VR1, VR2, VR3, VR4, VR5, VR6, VR7, VR8, VR9> &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key,
                                                                           ^
  /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp(86): note #3327-D: candidate function template "cv::cudev::blockReduceKeyVal<N,K,V,Cmp>(volatile K *, K &, volatile V *, V &, uint, const Cmp &)" failed deduction
    __attribute__((device)) __inline__ __attribute__((always_inline)) void blockReduceKeyVal(volatile K* skeys, K& key, volatile V* svals, V& val, uint tid, const Cmp& cmp)
                                                                           ^
            detected during:
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc_pass_1<BLOCK_SIZE,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, ResType *, int *, int *, MaskPtr, int, int, int, int) [with BLOCK_SIZE=256, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 164
              instantiation of "void cv::cudev::grid_minmaxloc_detail::minMaxLoc<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, ResType *, int *, int *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<double>, ResType=double, MaskPtr=cv::cudev::GlobPtr<uchar>]" at line 227 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc_<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<double>, ResType=double, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 355 of /opt/opencv-python/opencv_contrib/modules/cudev/include/opencv2/cudev/grid/reduce.hpp
              instantiation of "void cv::cudev::gridMinMaxLoc(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cudev::GpuMat_<int> &, const MaskPtr &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<double>, ResType=double, MaskPtr=cv::cudev::GlobPtrSz<uchar>]" at line 71 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu
              instantiation of "void <unnamed>::minMaxLocImpl<T,R>(const cv::cuda::GpuMat &, const cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::GpuMat &, cv::cuda::Stream &) [with T=double, R=double]" at line 86 of /opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu

  17 errors detected in the compilation of "/opt/opencv-python/opencv_contrib/modules/cudaarithm/src/cuda/minmaxloc.cu".
  CMake Error at cuda_compile_1_generated_minmaxloc.cu.o.RELEASE.cmake:282 (message):
    Error generating file
    /opt/opencv-python/_skbuild/linux-aarch64-3.11/cmake-build/modules/cudaarithm/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_minmaxloc.cu.o


  gmake[2]: *** [modules/cudaarithm/CMakeFiles/opencv_cudaarithm.dir/build.make:210: modules/cudaarithm/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_minmaxloc.cu.o] Error 1
  [ 36%] Building CXX object modules/intensity_transform/CMakeFiles/opencv_intensity_transform.dir/src/intensity_transform.cpp.o
  [ 36%] Building CXX object modules/phase_unwrapping/CMakeFiles/opencv_phase_unwrapping.dir/src/histogramphaseunwrapping.cpp.o
  
  [ 39%] Built target opencv_reg
  [ 39%] Linking CXX static library ../../lib/libopencv_alphamat.a
  [ 39%] Built target opencv_alphamat
  gmake: *** [Makefile:166: all] Error 2
  Traceback (most recent call last):
    File "/tmp/pip-build-env-8g0rej1b/overlay/local/lib/python3.11/dist-packages/skbuild/setuptools_wrap.py", line 668, in setup
      cmkr.make(make_args, install_target=cmake_install_target, env=env)
    File "/tmp/pip-build-env-8g0rej1b/overlay/local/lib/python3.11/dist-packages/skbuild/cmaker.py", line 696, in make
      self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
    File "/tmp/pip-build-env-8g0rej1b/overlay/local/lib/python3.11/dist-packages/skbuild/cmaker.py", line 741, in make_impl
      raise SKBuildError(msg)

  An error occurred while building with CMake.
    Command:
      /tmp/pip-build-env-8g0rej1b/overlay/local/lib/python3.11/dist-packages/cmake/data/bin/cmake --build . --target install --config RELEASE --
    Install target:
      install
    Source directory:
      /opt/opencv-python
    Working directory:
      /opt/opencv-python/_skbuild/linux-aarch64-3.11/cmake-build
  Please check the install target is valid and see CMake's output for more information.

  error: subprocess-exited-with-error

  × Building wheel for opencv-contrib-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /usr/bin/python3.11 /usr/local/lib/python3.11/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpa4xilxl0
  cwd: /opt/opencv-python
  Building wheel for opencv-contrib-python (pyproject.toml): finished with status 'error'
  ERROR: Failed building wheel for opencv-contrib-python
Failed to build opencv-contrib-python
ERROR: Failed to build one or more wheels
BUILD FAILED (OpenCV 4.8.1)
 ---> Removed intermediate container 6be0a0c73851
 ---> e8d617d36bbc
Successfully built e8d617d36bbc
Successfully tagged opencv-p311:r36.4.0-cp311-opencv_4.8.1
-- Testing container opencv-p311:r36.4.0-cp311-opencv_4.8.1 (opencv:4.8.1/test.py)

docker run -t --rm --runtime=nvidia --network=host \
--volume /home/jetson/repos/jetson-containers/packages/opencv:/test \
--volume /home/jetson/repos/jetson-containers/data:/data \
--workdir /test \
opencv-p311:r36.4.0-cp311-opencv_4.8.1 \
/bin/bash -c 'python3 test.py' \
2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_130807/test/opencv-p311_r36.4.0-cp311-opencv_4.8.1_test.py.txt; exit ${PIPESTATUS[0]}

testing OpenCV...
Traceback (most recent call last):
  File "/test/test.py", line 4, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/jetson/repos/jetson-containers/jetson_containers/build.py", line 112, in <module>
    build_container(args.name, args.packages, args.base, args.build_flags, args.build_args, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api, args.skip_packages)
  File "/home/jetson/repos/jetson-containers/jetson_containers/container.py", line 154, in build_container
    test_container(container_name, pkg, simulate)
  File "/home/jetson/repos/jetson-containers/jetson_containers/container.py", line 327, in test_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jetson/.pyenv/versions/3.11.10/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run -t --rm --runtime=nvidia --network=host --volume /home/jetson/repos/jetson-containers/packages/opencv:/test --volume /home/jetson/repos/jetson-containers/data:/data --workdir /test opencv-p311:r36.4.0-cp311-opencv_4.8.1 /bin/bash -c 'python3 test.py' 2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241028_130807/test/opencv-p311_r36.4.0-cp311-opencv_4.8.1_test.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

dusty-nv · 2024-10-31T12:52:08Z

Hi @0Unkn0wn, thanks for the logs. I just tried jetson-containers build ros:jazzy-desktop on JetPack 6.1 and hit some OpenCV errors in the imageproc package (different from yours, I think?).

I'm also going back and trying it with OpenCV 4.8.1 now. However, older OpenCV versions may hit errors with the newer versions of CUDA, which had to be fixed in later OpenCV releases. So we could end up in a situation where it's not possible to build ROS and OpenCV with CUDA together. In that case, either ROS would need to be patched for the newer OpenCV, or you would use the vanilla OpenCV built without CUDA.

0Unkn0wn · 2024-10-31T13:09:27Z

Hi @dusty-nv, thank you for the updates. Just wondering, would it be possible to skip OpenCV for my build with ROS2 Jazzy and Jax? I’m not planning to use it right away, so if there’s a way to leave it out or any workaround you’d suggest, that’d be great. If it works with OpenCV 4.8.1, that would work for me as well. Thanks again!

dusty-nv · 2024-10-31T16:36:09Z

Hi @0Unkn0wn, you can try removing opencv from the depends list here:

jetson-containers/packages/ros/config.py

Line 10 in 20e7292

template['depends'] = ['cuda', 'cudnn', 'tensorrt', 'opencv', 'cmake']

And then remove opencv from these skip_keys:

jetson-containers/packages/ros/ros2_build.sh

Line 109 in 20e7292

    
           SKIP_KEYS="libopencv-dev libopencv-contrib-dev libopencv-imgproc-dev python-opencv python3-opencv"
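The two edits above can be sketched in Python. This is only an illustration of the intended result, not code from jetson-containers: `drop_opencv` is a hypothetical helper, and the two values are copied from the lines quoted above. It assumes that "remove opencv" means dropping every OpenCV-related entry, so that the skip list ends up empty and rosdep is free to resolve those keys itself.

```python
# Values copied from the two lines quoted above (config.py and ros2_build.sh).
depends = ['cuda', 'cudnn', 'tensorrt', 'opencv', 'cmake']
skip_keys = ("libopencv-dev libopencv-contrib-dev libopencv-imgproc-dev "
             "python-opencv python3-opencv")


def drop_opencv(items):
    """Return the items with every OpenCV-related entry removed.

    Hypothetical helper for illustration only; it is not part of
    jetson-containers.
    """
    return [item for item in items if 'opencv' not in item]


# What template['depends'] would look like after the edit.
new_depends = drop_opencv(depends)

# What SKIP_KEYS would look like after the edit (assumption: all five
# quoted entries are OpenCV-related, so nothing is left to skip).
new_skip_keys = " ".join(drop_opencv(skip_keys.split()))

print(new_depends)
print(repr(new_skip_keys))
```

In practice the change is made by hand in the two files; the sketch only shows which entries go away.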

0Unkn0wn · 2024-11-03T22:24:58Z

Hi @dusty-nv, I managed to build the container with Jazzy and Jax by removing OpenCV as you showed me, so thank you very much!

For future reference, this is the build that worked: CUDA_VERSION=12.6 jetson-containers build --skip-tests=all --name=ros_jax ros:jazzy-ros-base jax:0.4.32

Just one last question, any idea why it only works when I don’t set the Python version to 3.11 or 3.12?

Error log

#All required rosdeps installed successfully
Traceback (most recent call last):
  File "/usr/bin/colcon", line 33, in <module>
    sys.exit(load_entry_point('colcon-core==0.18.1', 'console_scripts', 'colcon')())
  File "/usr/lib/python3/dist-packages/colcon_core/command.py", line 128, in main
    return _main(
  File "/usr/lib/python3/dist-packages/colcon_core/command.py", line 186, in _main
    args = parser.parse_args(args=argv)
  File "/usr/lib/python3/dist-packages/colcon_defaults/argument_parser/defaults.py", line 166, in parse_args
    return self._parser.parse_args(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/colcon_argcomplete/argument_parser/argcomplete/__init__.py", line 85, in parse_args
    from argcomplete import autocomplete
ModuleNotFoundError: No module named 'argcomplete'
The command '/bin/bash -c /tmp/ros2_build.sh' returned a non-zero code: 1
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/jetson/repos/jetson-containers/jetson_containers/build.py", line 112, in <module>
    build_container(args.name, args.packages, args.base, args.build_flags, args.build_args, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api, args.skip_packages)
  File "/home/jetson/repos/jetson-containers/jetson_containers/container.py", line 147, in build_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'DOCKER_BUILDKIT=0 docker build --network=host --tag ros_jax:r36.4.0-cu126-cp311-ros_jazzy-ros-base --file /home/jetson/repos/jetson-containers/packages/ros/Dockerfile.ros2 --build-arg BASE_IMAGE=ros_jax:r36.4.0-cu126-cp311-cmake --build-arg ROS_VERSION="jazzy" --build-arg ROS_PACKAGE="ros_base" /home/jetson/repos/jetson-containers/packages/ros 2>&1 | tee /home/jetson/repos/jetson-containers/logs/20241103_203530/build/ros_jax_r36.4.0-cu126-cp311-ros_jazzy-ros-base.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

dusty-nv · 2024-11-04T16:13:25Z

Just one last question, any idea why it only works when I don’t set the Python version to 3.11 or 3.12?
ModuleNotFoundError: No module named 'argcomplete'

Considering that the error is coming from colcon, you could try adding a pip3 install argcomplete beforehand. However, building ROS against a version of Python different from the one the core OS ships with may also be untested upstream. For Jazzy, Ubuntu 24.04 is tier-0, but these base images for JetPack 6 are on Ubuntu 22.04 for compatibility with CUDA. If you change PYTHON_VERSION, lots of other packages (like PyTorch, etc.) will also need to be rebuilt during the container builds, as I only build/serve the wheels for the default version of Python (so 3.10 for JP6). Changing the Python version isn't impossible, but it is error-prone, as you have found, due to the number of packages that need to be rebuilt (assuming you are using the AI/ML stack).
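Before adding the pip3 install, it may help to confirm which modules the active interpreter can actually see. The following is a minimal sketch, assuming it is run with the same python3 that invokes colcon inside the container; `module_available` is a hypothetical helper, not part of colcon or jetson-containers.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` is importable by the
    interpreter running this script (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


# Report the interpreter version, since the failure only appears when
# PYTHON_VERSION differs from the OS default.
print("interpreter:", sys.version.split()[0])

# argcomplete is the module named in the traceback; colcon_core is the
# package that the colcon entry point loads.
for name in ("argcomplete", "colcon_core"):
    status = "available" if module_available(name) else "MISSING"
    print(f"{name}: {status}")
```

If argcomplete is reported as missing under Python 3.11 but present under the default 3.10, that would be consistent with the module only being installed for the OS default interpreter.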

martincerven · 2024-11-21T17:10:31Z

There is some API breakage in ROS's image_proc with higher OpenCV versions (4.6.0 is the default OpenCV shipped with 24.04).

Also, for CUDA 12.6 support in OpenCV you need to use the latest branch, but that breaks some packages in ROS such as image_proc.

But for those of us who just want to run Jazzy with GPU support and don't care about CUDA or OpenCV versions this is still amazing! Thanks @dusty-nv @0Unkn0wn

dusty-nv · 2024-11-21T19:45:45Z

OK, thanks @martincerven, I can go back and build OpenCV 4.6.0 for JetPack 6.1. Am I correct in understanding that this may allow the build to continue?

martincerven · 2024-11-21T20:19:16Z

I think so. I built both base and desktop (although I just skipped CUDA OpenCV and used the one that comes with Ubuntu).
