
CPU GPU utilization of ffio


CPU Usage for Decoding

My test involves pulling a single original video stream, doing some intermediate processing (consuming about 50% of CPU), then re-encoding and pushing it to an RTMP server (about 12% of CPU, using GPU encoding). I then watch the final processed live stream.

Here's my setup:

  • CPU: Intel Xeon Gold 5118 2.30GHz x8
  • GPU: Nvidia Tesla V100 32GB
  • Origin Video: 1080p 24fps, 4Mbps h264-baseline

The CPU usage figures below are not exact measurements; they are merely my intuitive perception from observing htop.
| Decoding Scenario      | Min | Avg | Max |
| ---------------------- | --- | --- | --- |
| Single stream with CPU | 18% | 22% | 30% |
| Single stream with GPU | 20% | 23% | 25% |

When it comes to 6-stream parallel decoding:

  • With GPU: CPU usage stabilizes at nearly 100% across all 8 cores, and the resulting video stream is almost smooth.
  • With CPU: The total CPU usage fluctuates between 40% and 80%, but the video stutters more than with the GPU. In my case, I would opt for the GPU solution as it appears more stable, although it is not so friendly to energy efficiency.
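For reference, below is a rough sketch of the single-stream pipeline described at the top of this page (decode, process, re-encode, push to RTMP). Only the hw_enabled flag comes from this page; the constructor and method names (ffio.FFIO, decode_one_frame, encode_one_frame) and the URLs are assumptions for illustration and may not match the ffio version you have installed, so check the project README.

```python
import ffio  # https://github.com/dongrixinyu/ffio

# NOTE: everything below except the hw_enabled flag is an illustrative,
# assumed API -- verify the names against the ffio README before using.
IN_URL  = 'rtmp://example.com/live/origin'     # hypothetical source stream
OUT_URL = 'rtmp://example.com/live/processed'  # hypothetical target stream

def process(frame):
    # placeholder for the intermediate processing step
    # (~50% of one CPU core in my test)
    return frame

decoder = ffio.FFIO(IN_URL,  hw_enabled=True)   # GPU-accelerated decoding
encoder = ffio.FFIO(OUT_URL, hw_enabled=True)   # GPU-accelerated (nvenc) encoding
# an encoder will likely also need codec parameters (size, fps, bitrate) -- omitted here

while True:
    frame = decoder.decode_one_frame()
    if frame is None:   # end of stream / decode failure (exact sentinel may differ)
        break
    encoder.encode_one_frame(process(frame))
```

For the 6-stream comparison, several such workers can simply be run in parallel, e.g. one process per stream.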

GPU usage statistics

The GPU is used in two steps: decoding and encoding the video stream. Within each step, two parts can run on the GPU: the h264 codec and the pixel format conversion.

Here is a table describing the GPU (Nvidia) usage under different conditions (the frame rate is 25fps in all cases).

| image-size | hw_decoding | hw_yuv->rgb | hw_encoding | hw_rgb->yuv | GPU usage | CPU usage |
| ---------- | ----------- | ----------- | ----------- | ----------- | --------- | --------- |
| 1280*720   |             |             |             |             | 407M      |           |
| 1280*720   |             |             |             |             | 234M      | 13% core  |
| 1280*720   |             |             |             |             | 131M      | 28% core  |
| 1280*720   |             |             |             |             | 0         | 47% core  |
| 1280*720   |             |             |             |             | 103M      | 47% core  |
| 1920*1080  |             |             |             |             | 487M      |           |
| 1920*1080  |             |             |             |             | 268M      | 23% core  |
| 1920*1080  |             |             |             |             | 161M      | 44% core  |
| 1920*1080  |             |             |             |             | 0         | 83% core  |
| 1920*1080  |             |             |             |             | 107M      | 83% core  |
  • When hw_enabled=True and pix_fmt_hw_enabled=True are set, the pixel format conversion is greatly accelerated.
  • When hw_enabled=False and pix_fmt_hw_enabled=True are set, the CPU consumption appears higher than in the pure-CPU case. This MAY be caused by the cudaMalloc call. A rough sketch of these flag combinations follows this list.
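To reproduce the comparison above, one can open the same stream with each flag combination and watch GPU memory (e.g. via nvidia-smi) and CPU usage (via htop) while it decodes. This is only a sketch: the hw_enabled and pix_fmt_hw_enabled flags are the options discussed on this page, but the constructor and decode call are assumed names and may differ from the actual ffio API.

```python
import time
import ffio  # https://github.com/dongrixinyu/ffio

URL = 'rtmp://example.com/live/origin'  # hypothetical test stream

# NOTE: the constructor and method names are illustrative assumptions;
# only hw_enabled and pix_fmt_hw_enabled are the options discussed above.
configs = [
    ('pure CPU',                dict(hw_enabled=False, pix_fmt_hw_enabled=False)),
    ('GPU codec + CPU pix_fmt', dict(hw_enabled=True,  pix_fmt_hw_enabled=False)),
    ('GPU codec + GPU pix_fmt', dict(hw_enabled=True,  pix_fmt_hw_enabled=True)),
    ('CPU codec + GPU pix_fmt', dict(hw_enabled=False, pix_fmt_hw_enabled=True)),
]

N_FRAMES = 500  # decode a fixed number of frames per configuration

for name, flags in configs:
    decoder = ffio.FFIO(URL, **flags)
    start = time.time()
    for _ in range(N_FRAMES):
        if decoder.decode_one_frame() is None:  # sentinel depends on ffio version
            break
    print(f'{name}: {time.time() - start:.1f}s for {N_FRAMES} frames')
```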