doca udp frame builder performance improvement #12

masaruito110 · 2024-04-19T15:16:45Z

#Purpose

Based on 7de28d5, I measured the performance of docagpunetio.

Current server structure

|doca flow| ---------> |frame builder|

frame builder structure

|receive_tcp|<--semaphore-->|makeframe|<--semaphore-->|notify frame built|
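
As a rough illustration of the three-stage structure above, here is a minimal host-side sketch in which each stage hands work to the next through a pair of semaphores. It is only an analogy: the real stages presumably run on the GPU with DOCA primitives, and every name here (`Semaphore`, `Handoff`, `run_pipeline`) is hypothetical, as is the "multiply by 10" stand-in for building a frame.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (C++11); a stand-in for the real primitive.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void release() {
        { std::lock_guard<std::mutex> lock(m_); ++count_; }
        cv_.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// One-slot handoff between two stages:
// `full` signals "data ready", `empty` signals "slot free".
struct Handoff {
    Semaphore full{0};
    Semaphore empty{1};
    int slot = 0;
};

// Runs receive -> makeframe -> notify for n_frames items and returns
// the values seen by the notify stage, in order.
std::vector<int> run_pipeline(int n_frames) {
    Handoff recv_to_make, make_to_notify;
    std::vector<int> notified;

    std::thread receive([&] {
        for (int i = 0; i < n_frames; ++i) {
            recv_to_make.empty.acquire();
            recv_to_make.slot = i;               // pretend packets arrived
            recv_to_make.full.release();
        }
    });
    std::thread make_frame([&] {
        for (int i = 0; i < n_frames; ++i) {
            recv_to_make.full.acquire();
            int frame = recv_to_make.slot * 10;  // pretend to build a frame
            recv_to_make.empty.release();
            make_to_notify.empty.acquire();
            make_to_notify.slot = frame;
            make_to_notify.full.release();
        }
    });
    std::thread notify([&] {
        for (int i = 0; i < n_frames; ++i) {
            make_to_notify.full.acquire();
            notified.push_back(make_to_notify.slot);
            make_to_notify.empty.release();
        }
    });

    receive.join();
    make_frame.join();
    notify.join();
    return notified;
}
```

Calling `run_pipeline(3)` pushes three items through all stages in order. With one-slot handoffs a slow stage stalls its neighbours, which is one way a throughput drop in a single stage can surface as a drop for the whole session.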

#Environment
The environment is the same as in #10.

#Result

Unlike #10, the chunk size cannot be set here because UDP does not check ACKs, so only the maximum performance is shown. The trend looks the same as in #10.

env	process	session/process	Gbps/session
1	1	1	43
2	2	1	18
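
To compare the two rows as totals, the per-session figure can be multiplied out. The sketch below assumes that all sessions run concurrently and that the Gbps/session column is an average per session. The `Row` struct and `aggregate_gbps` are hypothetical helpers, not part of the measured code.

```cpp
// One row of the result table above.
struct Row {
    int env;
    int processes;
    int sessions_per_process;
    double gbps_per_session;
};

// Total throughput across all sessions, assuming they run concurrently.
double aggregate_gbps(const Row& r) {
    return r.processes * r.sessions_per_process * r.gbps_per_session;
}
```

Under that assumption, env 1 totals 43 Gbps and env 2 totals 2 x 1 x 18 = 36 Gbps, so adding a second process lowers the aggregate rather than only splitting it.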


masaruito110 · 2024-04-19T16:30:56Z

I ran a simple app that copies memory heavily while the frame builder was running.

The result is similar to that of the multiple-session case.

env	process	session/process	Gbps/session
1	1	1	18

DOCA performance seems to be affected by other copy-heavy kernels running concurrently.

The simple app is below.

#include <cuda_runtime.h>
#include <stdint.h>
#include <stdio.h>

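// Device-side cudaMemcpyAsync is part of the CUDA device runtime, so this
// typically needs relocatable device code (e.g. nvcc -rdc=true ... -lcudadevrt).
__global__ void heavy_memcpy(uint8_t* dst, uint8_t* src, size_t chunk, size_t frame_size)
{
    size_t cnt = 0;
    // Runs forever on purpose: keeps issuing copies until the process is killed.
    while (true) {
        cnt++;
        // Report progress once every 1000 passes, from one thread only.
        if (cnt % 1000 == 0 && threadIdx.x == 0) {
            printf("copying %llu\n", (unsigned long long)cnt);
        }
        // Each thread copies every blockDim.x-th chunk of the frame.
        for (size_t i = threadIdx.x; i < frame_size / chunk - 1; i += blockDim.x) {
            cudaMemcpyAsync(dst + i * chunk, src + i * chunk, chunk, cudaMemcpyDeviceToDevice);
        }
    }
}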

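void heavy_memcpy_cpu()
{
    uint8_t* dst;
    uint8_t* src;

    size_t frame_size = (size_t)4 * 1024 * 1024 * 1024; // 4 GiB per buffer
    size_t chunk = 8000;                                 // bytes per copy

    cudaMalloc((void**)&dst, frame_size);
    cudaMalloc((void**)&src, frame_size);

    // One block of 1024 threads. The kernel never returns, so the
    // synchronize call blocks until the process is terminated.
    heavy_memcpy<<<1, 1024>>>(dst, src, chunk, frame_size);
    cudaDeviceSynchronize();
}

int main()
{
    heavy_memcpy_cpu();
    return 0;
}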
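
To get a feel for how much work one pass of the copy loop issues, the loop bounds can be evaluated on the host. This is a small sketch that only mirrors the arithmetic of `for (i = threadIdx.x; i < frame_size / chunk - 1; i += blockDim.x)`. The helper names are hypothetical.

```cpp
#include <cstddef>

// Number of chunks copied in one pass of the loop (matches the loop bound
// frame_size / chunk - 1 used in the kernel).
std::size_t chunks_per_pass(std::size_t frame_size, std::size_t chunk) {
    return frame_size / chunk - 1;
}

// Number of loop iterations executed by thread `tid` in one pass when
// n_threads threads stride over n_chunks chunks.
std::size_t chunks_for_thread(std::size_t tid, std::size_t n_chunks,
                              std::size_t n_threads) {
    if (tid >= n_chunks) return 0;
    return (n_chunks - tid + n_threads - 1) / n_threads;
}
```

With the values from the app (4 GiB frame, 8000-byte chunks, 1024 threads), one pass covers 536869 chunks, and each thread issues 524 or 525 copies per pass.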

masaruito110 mentioned this issue Apr 19, 2024

doca tcp frame builder performance improvement #10

Open
