K510 Multimedia Developer Guide

Document version: V1.0.0

Published: 2022-03-09

Disclaimer: The products, services or features you purchase are subject to the commercial contracts and terms of Beijing Canaan Jiesi Information Technology Co., Ltd. ("the Company", the same hereinafter), and all or part of the products, services or features described in this document may not be within the scope of your purchase or use. Except as otherwise agreed in the contract, the Company makes no representations or warranties, express or implied, as to the accuracy, reliability, completeness, merchantability, fitness for a particular purpose or non-infringement of any statements, information, or content of this document. Unless otherwise agreed, this document is provided as a guide for use only. Due to product version upgrades or other reasons, the contents of this document may be updated or modified from time to time without notice.

Trademark Notices

"", "Canaan" icon, Canaan and other trademarks of Canaan and other trademarks of Canaan are trademarks of Beijing Canaan Jiesi Information Technology Co., Ltd. All other trademarks or registered trademarks that may be mentioned in this document are owned by their respective owners.

Copyright © 2022 Beijing Canaan Jiesi Information Technology Co., Ltd. This document applies only to development and design on the K510 platform. Without the written permission of the Company, no unit or individual may disseminate part or all of the contents of this document in any form.

Beijing Canaan Jiesi Information Technology Co., Ltd.
URL: canaan-creative.com
Business enquiries: [email protected]

Preface

Document purpose

This document describes the K510 multimedia application examples.

Target audience

This document is intended for:

  • Software developers
  • Technical support personnel

Revision history

| Version | Modified by | Date | Revision notes |
| --- | --- | --- | --- |
| v1.0.0 | System software group | 2022-03-09 | SDK V1.5 released |
**Contents**

[TOC]

1 Encoder API

1.1 Header File Description

k510_buildroot/package/encode_app/enc_interface.h

1.2 API function descriptions

1.2.1 VideoEncoder_Create

【Description】

Create a video encoder

【Syntax】

EncoderHandle* VideoEncoder_Create(EncSettings *pCfg)

【Parameters】

pCfg: encoder configuration parameters (input)

| Parameter name | Description | Value range | Applicable encoding module |
| --- | --- | --- | --- |
| channel | Channel number; up to 8 encoding channels are supported | [0,7] | jpeg, avc |
| width | Width of the encoded image | avc: [128,2048], multiple of 8; jpeg: up to 8192, multiple of 16 | jpeg, avc |
| height | Height of the encoded image | avc: [64,2048], multiple of 8; jpeg: up to 8192, multiple of 2 | jpeg, avc |
| FrameRate | Frame rate; only a fixed set of values can be configured | (25, 30, 50, 60, 75) | jpeg, avc |
| rcMode | Bitrate control mode. 0: CONST_QP, 1: CBR, 2: VBR; jpeg is fixed to CONST_QP | See RateCtrlMode | jpeg, avc |
| BitRate | Target bitrate in CBR mode, or lowest bitrate in VBR mode | [10,20000000] | avc |
| MaxBitRate | Highest bitrate in VBR mode | [10,20000000] | avc |
| SliceQP | Initial QP value; -1 means automatic | avc: -1 or [0,51]; jpeg: [1,100] | jpeg, avc |
| MinQP | Minimum QP value | [0,SliceQP] | avc |
| MaxQP | Maximum QP value | [SliceQP,54] | avc |
| profile | profile_idc parameter in the SPS. 0: baseline, 1: main, 2: high, 3: jpeg | [0,3] | jpeg, avc |
| level | level_idc parameter in the SPS | [10,42] | avc |
| AspectRatio | Display aspect ratio | See AVC_AspectRatio | avc |
| FreqIDR | Interval between two IDR frames | [1,1000] | avc |
| gopLen | Group of pictures: interval between two I frames | [1,1000] | avc |
| bEnableGDR | Whether intra refresh is enabled | [true,false] | avc |
| gdrMode | GDR refresh mode. 0: vertical refresh, 1: horizontal refresh | See GDRCtrlMode | avc |
| bEnableLTR | Whether long-term reference frames are enabled | [true,false] | avc |
| roiCtrlMode | ROI control mode. 0: ROI not used, 1: relative QP, 2: absolute QP | See ROICtrlMode | avc |
| EncSliceSplitCfg | Slice split configuration | | avc |
| bSplitEnable | Whether slice splitting is enabled | [true,false] | avc |
| u32SplitMode | Slice split mode. 0: split by bytes, 1: split by macroblock rows | [0,1] | avc |
| u32SliceSize | u32SplitMode=0: number of bytes per slice; u32SplitMode=1: number of macroblock rows per slice | u32SplitMode=0: [100,65535]; u32SplitMode=1: [1, (image height + 15)/16] | avc |
| entropyMode | Entropy coding mode | See EncEntropyMode | avc |
| encDblkCfg | Deblocking filter configuration | | avc |
| disable_deblocking_filter_idc | Default value is 0, as defined by the H.264 standard | [0,2] | avc |
| slice_alpha_c0_offset_div2 | Default value is 0, as defined by the H.264 standard | [-6,6] | avc |
| slice_beta_offset_div2 | Default value is 0, as defined by the H.264 standard | [-6,6] | avc |
typedef struct
{
    int                       channel;  //encode channel number
    unsigned short            width;
    unsigned short            height;
    unsigned char             FrameRate;
    RateCtrlMode              rcMode;
    unsigned int              BitRate;
    unsigned int              MaxBitRate;
    int                       SliceQP;  //auto: -1, or from 0 to 51
    int                       MinQP;//from 0 to SliceQP
    int                       MaxQP;//from SliceQP to 51
    AVC_Profile               profile;
    unsigned int              level;  //1 .. 51, 51 is 5.1
    AVC_AspectRatio           AspectRatio;
    int                       FreqIDR; //default value  : -1,IDR:number of frames between two IDR pictures;GDR:refresh period
    unsigned int              gopLen;  
    bool                      bEnableGDR;//gdr
    GDRCtrlMode               gdrMode;
    bool                      bEnableLTR;//Long Term reference

    ROICtrlMode               roiCtrlMode;
    EncSliceSplitCfg          sliceSplitCfg;
    EncEntropyMode            entropyMode;//valid only when profile is AVC_MAIN or AVC_HIGH
    EncDblkCfg                encDblkCfg;
}EncSettings;
typedef enum
{
    CONST_QP,
    CBR,
    VBR
} RateCtrlMode;
typedef enum
{
    AVC_C_BASELINE,
    AVC_MAIN,
    AVC_HIGH,
    JPEG
} AVC_Profile;
typedef enum
{
    ASPECT_RATIO_AUTO, 
    ASPECT_RATIO_1_1,
    ASPECT_RATIO_4_3, 
    ASPECT_RATIO_16_9, 
    ASPECT_RATIO_NONE,
    ASPECT_RATIO_MAX,
} AVC_AspectRatio;
typedef struct
{
    unsigned int          s32X;
    unsigned int          s32Y;
    unsigned int          u32Width;
    unsigned int          u32Height;
} RECT_S;
typedef struct
{
    unsigned int          uIndex;//index[0-7]
    bool                  bEnable;
    int                   uQpValue;
    RECT_S                stRect;
} EncROICfg;
typedef enum
{
    ROI_QP_TABLE_NONE,
    ROI_QP_TABLE_RELATIVE,//[-32,31],6 LSBs effective
    ROI_QP_TABLE_ABSOLUTE,//[0,51],6 LSBs effective
} ROICtrlMode;
typedef enum
{
    GDR_VERTICAL = 0,
    GDR_HORIZONTAL,
    GDR_CTRLMAX,
} GDRCtrlMode;
typedef struct
{
    bool bSplitEnable;
    unsigned int u32SplitMode; // 0: split by bytes; 1: split by macroblock rows
    unsigned int u32SliceSize;
}EncSliceSplitCfg;

typedef enum
{
    ENTROPY_MODE_CAVLC = 0,
    ENTROPY_MODE_CABAC,
    ENTROPY_MODE_MAX,
}EncEntropyMode;

typedef struct
{
    unsigned int  disable_deblocking_filter_idc;//[0,2]
    int  slice_alpha_c0_offset_div2;//[-6,6]
    int  slice_beta_offset_div2;//[-6,6]
}EncDblkCfg;

【Return value】

typedef void* EncoderHandle
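
【Example】

The following is a minimal sketch, not taken from the SDK: the field values (1080p30, CBR at 4 Mbit/s, GOP 25, and so on) and the NULL check on the return value are assumptions chosen for illustration. enc_interface.h is the header from section 1.1.

#include <stdio.h>
#include <string.h>
#include "enc_interface.h"

/* Sketch: configure channel 0 as a 1920x1080 @ 30 fps H.264 CBR encoder. */
static EncoderHandle *create_1080p_encoder(void)
{
    EncSettings cfg;
    memset(&cfg, 0, sizeof(cfg));          /* GDR/LTR/ROI/slice split disabled by default */

    cfg.channel     = 0;                   /* encode channel number, [0,7]       */
    cfg.width       = 1920;                /* avc: multiple of 8                 */
    cfg.height      = 1080;
    cfg.FrameRate   = 30;
    cfg.rcMode      = CBR;                 /* bitrate-controlled mode            */
    cfg.BitRate     = 4000000;             /* target bitrate in CBR mode         */
    cfg.MaxBitRate  = 4000000;
    cfg.SliceQP     = -1;                  /* -1: let the encoder choose the QP  */
    cfg.MinQP       = 0;
    cfg.MaxQP       = 51;
    cfg.profile     = AVC_HIGH;
    cfg.level       = 42;                  /* level_idc, 42 = level 4.2          */
    cfg.AspectRatio = ASPECT_RATIO_AUTO;
    cfg.FreqIDR     = 25;                  /* IDR interval in frames             */
    cfg.gopLen      = 25;                  /* GOP length in frames               */
    cfg.entropyMode = ENTROPY_MODE_CABAC;  /* valid for AVC_MAIN / AVC_HIGH      */

    EncoderHandle *hEnc = VideoEncoder_Create(&cfg);
    if (hEnc == NULL)
        printf("VideoEncoder_Create failed\n");
    return hEnc;
}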

1.2.2 VideoEncoder_SetRoiCfg

【Description】

Sets an ROI (region of interest). Up to 8 rectangular regions are supported; the system manages ROI regions by index numbers 0 to 7, and uIndex is the index number assigned by the user to the ROI. ROI regions may overlap; when they do, priority increases from index 0 to index 7.

This function can be called after the encoder is created and before it is destroyed. ROI regions can be adjusted dynamically during encoding.

【Syntax】

EncStatus VideoEncoder_SetRoiCfg(EncoderHandle *hEnc, const EncROICfg *pEncRoiCfg);

【Parameters】

hEnc: The handle returned at creation time

pEncRoiCfg: ROI region configuration

typedef struct
{
    unsigned int          s32X;
    unsigned int          s32Y;
    unsigned int          u32Width;
    unsigned int          u32Height;
}RECT_S;

typedef struct 
{
    unsigned int          uIndex;//index[0-7]
    bool                  bEnable;
    int                   uQpValue;
    RECT_S                stRect;
}EncROICfg;

Parameter description

uIndex     - Index number of this ROI region, range 0-7; up to 8 regions are supported
bEnable    - Whether this region is enabled; only enabled regions take effect
uQpValue   - QP value, either relative or absolute; the QP mode is determined by the roiCtrlMode field in EncSettings. Absolute QP range: [0,51], relative QP range: [-31,31]
stRect     - ROI rectangle: s32X is the x coordinate of the top-left corner, s32Y is the y coordinate of the top-left corner, u32Width is the rectangle width, u32Height is the rectangle height

【Return value】

typedef enum
{
    Enc_SUCCESS = 0, 
    Enc_ERR = 1,
}EncStatus;
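
【Example】

A minimal sketch of enabling one relative-QP ROI region; the region coordinates and the QP delta are placeholder values, and the sketch assumes the encoder was created with roiCtrlMode set to ROI_QP_TABLE_RELATIVE.

#include <stdbool.h>
#include <stdio.h>
#include "enc_interface.h"

/* Sketch: mark a 512x512 region in the top-left corner as higher quality
 * (relative QP -10). */
static int set_top_left_roi(EncoderHandle *hEnc)
{
    EncROICfg roi;
    roi.uIndex   = 0;          /* ROI slot 0 (lowest priority when regions overlap) */
    roi.bEnable  = true;
    roi.uQpValue = -10;        /* relative QP, valid range [-31,31]                 */
    roi.stRect.s32X      = 0;
    roi.stRect.s32Y      = 0;
    roi.stRect.u32Width  = 512;
    roi.stRect.u32Height = 512;

    if (VideoEncoder_SetRoiCfg(hEnc, &roi) != Enc_SUCCESS) {
        printf("VideoEncoder_SetRoiCfg failed\n");
        return -1;
    }
    return 0;
}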

1.2.3 VideoEncoder_SetLongTerm

【Description】

Marks the next encoded frame as a long-term reference frame. This function can be called after the encoder is created and before it is destroyed. The bEnableLTR attribute in EncSettings determines whether the feature is enabled.

【Syntax】

EncStatus VideoEncoder_SetLongTerm(EncoderHandle *hEnc);

【Parameters】

hEnc: The handle returned at creation time

【Return value】

typedef enum
{
    Enc_SUCCESS = 0, 
    Enc_ERR = 1,
}EncStatus;

1.2.4 VideoEncoder_UseLongTerm

【Description】

Makes the next encoded frame use the long-term reference frame. This function can be called after the encoder is created and before it is destroyed. The bEnableLTR attribute in EncSettings determines whether the feature is enabled.

【Syntax】

EncStatus VideoEncoder_UseLongTerm(EncoderHandle *hEnc);

【Parameters】

hEnc: The handle returned at creation time

【Return value】

typedef enum
{
    Enc_SUCCESS = 0,
    Enc_ERR = 1,
}EncStatus;
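
【Example】

A hedged sketch of how VideoEncoder_SetLongTerm and VideoEncoder_UseLongTerm might be combined into a periodic refresh; the period value and the frame-index bookkeeping are assumptions, and bEnableLTR must have been set to true in EncSettings.

#include "enc_interface.h"

/* Sketch: every `period` frames, mark the next encoded frame as the long-term
 * reference, and make the frame after it reference that long-term frame. */
static void refresh_long_term_reference(EncoderHandle *hEnc,
                                        unsigned int frameIndex,
                                        unsigned int period)
{
    if (period == 0)
        return;
    if (frameIndex % period == 0)
        VideoEncoder_SetLongTerm(hEnc);   /* next frame becomes the LTR frame    */
    else if (frameIndex % period == 1)
        VideoEncoder_UseLongTerm(hEnc);   /* next frame references the LTR frame */
}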

1.2.5 VideoEncoder_InsertUserData

【Description】

Insert user data.

It can be used after the encoder is created and before it is destroyed, and the user data content can be modified in real time during the encoding process. The user data will be inserted into the SEI data area of the IDR frame.

【Syntax】

EncStatus VideoEncoder_InsertUserData(EncoderHandle *hEnc, char *pUserData, unsigned int nlen);

【Parameters】

hEnc: The handle returned at creation time

pUserData: A pointer to user data

nlen: user data length, in the range (0, 1024)

【Return value】

typedef enum
{
    Enc_SUCCESS = 0,
    Enc_ERR = 1,
}EncStatus;
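
【Example】

A minimal sketch of inserting user data; the payload string is arbitrary, and the length must stay below 1024 bytes.

#include <string.h>
#include "enc_interface.h"

/* Sketch: insert a short user-data payload; it is carried in the SEI area of
 * the following IDR frames. */
static EncStatus tag_stream(EncoderHandle *hEnc)
{
    char userData[] = "camera-01 2022-03-09";
    return VideoEncoder_InsertUserData(hEnc, userData, (unsigned int)strlen(userData));
}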

1.2.6 VideoEncoder_Destroy

【Description】

Destroy the video encoder

【Syntax】

EncStatus VideoEncoder_Destroy(EncoderHandle *hEnc)

【Parameters】

hEnc: The handle returned at creation time

【Return value】

typedef enum
{
    Enc_SUCCESS = 0, 
    Enc_ERR = 1,
}EncStatus;

1.2.7 VideoEncoder_EncodeOneFrame

【Description】

Encode a video frame

【Syntax】

EncStatus VideoEncoder_EncodeOneFrame(EncoderHandle *hEnc, EncInputFrame *input)

【Parameters】

hEnc: The handle returned at creation time

input: input YUV video frame

typedef struct
{
    unsigned short width;
    unsigned short height;
    unsigned short stride;
    unsigned char *data;
}EncInputFrame;

【Return value】

Enc_SUCCESS = 0,
Enc_ERR = 1

1.2.8 VideoEncoder_GetStream

【Description】

Gets the encoded video stream buffer. Note: this buffer space is allocated internally by the encoder.

【Syntax】

EncStatus VideoEncoder_GetStream(EncoderHandle *hEnc, EncOutputStream *output)

【Parameters】

hEnc: The handle returned at creation time

output: encoded stream output buffer; there is output data when bufSize is greater than 0

typedef struct
{
    unsigned char *bufAddr;
    unsigned int bufSize; 
}EncOutputStream;

【Return value】

Enc_SUCCESS = 0,
Enc_ERR = 1

1.2.9 VideoEncoder_GetStream_ByExtBuf

【Description】

Gets the encoded video stream buffer. Note: the buffer space must be allocated by the caller before calling this function.

【Syntax】

EncStatus VideoEncoder_GetStream_ByExtBuf(EncoderHandle *hEnc, EncOutputStream *output)

【Parameters】

hEnc: The handle returned at creation time

output: encoded stream output buffer; there is output data when bufSize is greater than 0

typedef struct
{
    unsigned char *bufAddr;
    unsigned int bufSize; 
}EncOutputStream;

【Return value】

Enc_SUCCESS = 0,
Enc_ERR = 1
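
【Example】

A sketch under stated assumptions: the exact in/out semantics of bufSize are not specified here, so the sketch assumes the caller passes the buffer capacity in bufSize and reads the filled size back from the same field.

#include "enc_interface.h"

/* Sketch: fetch one encoded frame into a caller-allocated buffer.
 * Returns the number of bytes written, or 0 if no stream data was available. */
static unsigned int fetch_stream_ext(EncoderHandle *hEnc,
                                     unsigned char *buf, unsigned int bufCap)
{
    EncOutputStream out;
    out.bufAddr = buf;       /* caller-provided storage          */
    out.bufSize = bufCap;    /* capacity of the provided storage */

    if (VideoEncoder_GetStream_ByExtBuf(hEnc, &out) != Enc_SUCCESS)
        return 0;
    return out.bufSize;      /* > 0 means encoded data was returned */
}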

1.2.10 VideoEncoder_ReleaseStream

【Description】

Release the buffer of the video encoding stream

【Syntax】

EncStatus VideoEncoder_ReleaseStream(EncoderHandle *hEnc, EncOutputStream *output)

【Parameters】

  • hEnc: The handle returned at creation time
  • output: the buffer returned by VideoEncoder_GetStream

【Return value】

Enc_SUCCESS = 0,
Enc_ERR = 1
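
【Example】

Putting the calls together, the following is a hedged end-to-end sketch of an encode loop for NV12 input; the read_nv12_frame helper, the stride assumption, and the output handling are placeholders and not part of the SDK.

#include <string.h>
#include "enc_interface.h"

/* Hypothetical helper: fills `data` with one NV12 frame and returns 0 when no
 * more frames are available. Not part of the encoder API. */
extern int read_nv12_frame(unsigned char *data, unsigned int size);

static void encode_loop(EncoderHandle *hEnc, unsigned short width,
                        unsigned short height, unsigned char *frameBuf)
{
    EncInputFrame   input;
    EncOutputStream output;

    input.width  = width;
    input.height = height;
    input.stride = width;              /* assumption: no row padding in the NV12 buffer */
    input.data   = frameBuf;

    /* NV12: luma plane + interleaved chroma plane = width * height * 3 / 2 bytes */
    while (read_nv12_frame(frameBuf, (unsigned int)width * height * 3 / 2)) {
        if (VideoEncoder_EncodeOneFrame(hEnc, &input) != Enc_SUCCESS)
            break;

        memset(&output, 0, sizeof(output));
        if (VideoEncoder_GetStream(hEnc, &output) == Enc_SUCCESS && output.bufSize > 0) {
            /* consume output.bufAddr / output.bufSize here (write to file, push over rtsp, ...) */
            VideoEncoder_ReleaseStream(hEnc, &output);  /* return the encoder-owned buffer */
        }
    }

    VideoEncoder_Destroy(hEnc);
}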

2 Hardware structure diagram and software architecture

2.1 Hardware Structure Diagram

The hardware block diagram of the K510 is as follows: hardware_block_diagram

The data received from the video sensor is processed by the MIPI DPHY, CSI, VI, and ISP to obtain YUV source data, which is stored in DDR. The H.264 encoder module reads the data from DDR, performs the encoding operation, and stores the result back in DDR.

2.2 Software Architecture

The software architecture of the multimedia development platform is as follows:

multimedia_block_diagram.png

Where:

  • libvenc: encoder library for driving the H.264 encoder core
  • libmediactl: ISP library for controlling sensors
  • libaudio3a: audio 3A library for performing 3A operations on audio
  • alsa-lib: audio library for controlling the audio interface

3 Demo app

3.1 Encode Application

The program is placed in the /app/encode_app directory:

  • encode_app: encoding application program
  • The YUV files used for testing are large and are not included in the SDK package

Run encode_app

| Parameter name | Description | Default value | Value range | Applicable encoding module |
| --- | --- | --- | --- | --- |
| help | Help information | | | |
| split | Number of channels | NULL | [1,4] | jpeg, avc |
| ch | Channel number (0-based) | NULL | [0,3] | jpeg, avc |
| i | Input source: a YUV file (only NV12 format is supported) or v4l2 | NULL | v4l2, xxx.yuv | jpeg, avc |
| dev | v4l2 device name | NULL | sensor0: /dev/video3, /dev/video4; sensor1: /dev/video7, /dev/video8 | jpeg, avc |
| o | Output | NULL | rtsp, xxx.264, xxx.mjpeg, xxx.jpg | jpeg, avc |
| w | Output image width | 1920 | avc: [128,2048], multiple of 8; jpeg: up to 8192, multiple of 16 | jpeg, avc |
| h | Output image height | 1080 | avc: [64,2048], multiple of 8; jpeg: up to 8192, multiple of 2 | jpeg, avc |
| fps | Camera capture frame rate (currently only 30 fps is supported) | 30 | (30, 60, 75), according to the v4l2 config file | avc |
| r | Encoded output frame rate | 30 | A number that divides, or is divisible by, fps | avc |
| inframes | Number of input YUV frames | 0 | [0,50] | jpeg, avc |
| outframes | Number of output frames; if larger than -inframes, frames are encoded repeatedly | 0 | [0,32767] | jpeg, avc |
| gop | Group of pictures: interval between two I frames | 25 | [1,1000] | avc |
| rcmode | Bitrate control mode. 0: CONST_QP, 1: CBR, 2: VBR | CBR | [0,2] | avc |
| bitrate | Target bitrate in CBR mode, or lowest bitrate in VBR mode, in Kb | 4000 | [1,20000] | avc |
| maxbitrate | Highest bitrate in VBR mode, in Kb | 4000 | [1,20000] | avc |
| profile | profile_idc parameter in the SPS. 0: baseline, 1: main, 2: high, 3: jpeg | AVC_HIGH | [0,3] | jpeg, avc |
| level | level_idc parameter in the SPS | 42 | [10,42] | avc |
| sliceqp | Initial QP value; -1 means automatic | 25 | avc: -1 or [0,51]; jpeg: [1,100] | jpeg, avc |
| minqp | Minimum QP value | 0 | [0,sliceqp] | avc |
| maxqp | Maximum QP value | 51 | [sliceqp,51] | avc |
| enableGDR | Enable intra refresh and specify the intra refresh period. 0: intra refresh disabled; positive: intra refresh period | 0 | [0,65535] | avc |
| GDRMode | Intra refresh mode | 0 (GDR_VERTICAL) | 0: GDR_VERTICAL, 1: GDR_HORIZONTAL | avc |
| enableLTR | Enable long-term reference frames; the value specifies the refresh period. 0: disabled; positive: periodically marks the reference frame, and the following frame is set to use the long-term reference frame | 0 | [0,65535] | avc |
| roi | ROI configuration file, which may specify multiple ROI regions | NULL | xxx.conf | avc |
| disableAE | Disable AE. The AE switch is tied to the sensor, so disabling AE on one dev also disables AE on the other devs belonging to the same sensor | 0 | 0: enable AE, 1: disable AE | avc |
| conf | v4l2 configuration file; the v4l2 configuration parameters are modified based on the specified file and the command-line arguments | NULL | xxx.conf | avc |
| alsa | Enable alsa | 0 (disable) | 0: disable, 1: enable | audio |
| ac | Number of audio channels | 2 | 2 | audio |
| ar | Audio sample rate | 44100 | up to 48000 | audio |
| af | Audio sample format | 2 (SND_PCM_FORMAT_S16_LE) | 2: SND_PCM_FORMAT_S16_LE, 3: SND_PCM_FORMAT_S16_BE, 4: SND_PCM_FORMAT_U16_LE, 5: SND_PCM_FORMAT_U16_BE | audio |
| ad | Audio device | hw:0 | hw:0 | audio |

3.1.1 YUV file input, file output

./encode_app -split 1 -ch 0 -i your_file.yuv -o out.264 -w 1920 -h 1080 -inframes 10 -outframes 30
./encode_app -split 1 -ch 0 -i your_file.yuv -o out.mjpeg -w 1920 -h 1080 -inframes 10 -outframes 30

3.1.2 Input v4l2, output rtsp push stream

3.1.2.1 Single Channel

./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1920 -h 1080 -conf video_sample.conf

Example of an ffplay pull command:

 ffplay -rtsp_transport tcp rtsp://192.168.137.11:8554/testStream
  • rtsp://192.168.137.11:8554/testStream is the RTSP stream URL; -rtsp_transport tcp means TCP is used to transmit the audio and video data (UDP is used by default), and the -fflags nobuffer option can be added to avoid increased latency due to player buffering.

3.1.2.2 Single camera dual channel

./encode_app -split 2 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1920 -h 1080 -ch 1 -i v4l2 -dev /dev/video4 -o rtsp -w 1280 -h 720 -conf video_sample.conf

The ffplay pull stream command is the same as above.

3.1.2.3 Dual Cameras

./encode_app -split 2 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1920 -h 1080 -ch 1 -i v4l2 -dev /dev/video7 -o rtsp -w 1920 -h 1080 -conf video_sample.conf

The ffplay pull stream command is the same as above.

3.1.2.4 ROI test

./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1920 -h 1080 -sliceqp -1 -bitrate 2048 -roi roi_1920x1080.conf -conf video_sample.conf

roi file format

{
  "roiCtrMode": 1,
  "roiRegion": [
    {
      "qpValue": -15,
      "qpRegion": {
        "left": 0,
        "top": 0,
        "width": 500,
        "heigth": 500
      }
    }
  ]
}

Parameter description:

roiCtrMode - 1: relative QP  2: absolute QP
roiRegion  - ROI regions; an array of regions, up to 8 regions are supported
qpValue    - QP value used for this region; relative QP range: [-31,31], absolute QP range: [0,51]
qpRegion   - ROI rectangle
left       - X coordinate of the top-left corner of the rectangle
top        - Y coordinate of the top-left corner of the rectangle
width      - width of the rectangle
heigth     - height of the rectangle

The ffplay pull stream command is the same as above.

3.1.3 Frame rate conversion

./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -r 60 -o rtsp -w 1920 -h 1080 -conf video_sample.conf

The ffplay pull stream command is the same as above.

3.1.4 Multiple input frame rates

VGA@75fps and 720p60 are currently supported

./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 640 -h 480 -fps 75 -r 75 -conf video_sample_vga480p75.conf
./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1280 -h 720 -fps 60 -r 60 -conf video_sample_720p60.conf

The ffplay pull stream command is the same as above.

3.1.5 rtsp push audio and video streams

./encode_app -split 1 -ch 0 -i v4l2 -dev /dev/video3 -o rtsp -w 1920 -h 1080 -alsa 1 -ac 2 -ar 44100 -af 2 -ad hw:0 -conf video_sample.conf

The ffplay pull stream command is the same as above.

3.1.6 Precautions

  • Operating environment: Core board sensor: IMX219_SENSOR

  • rtsp stream address format: rtsp://ip address: port number/testStream, where ip address and port number are variable and the rest are fixed.

    For example: rtsp://192.168.137.11:8554/testStream, where the IP address is 192.168.137.11 and the port number is 8554.

    IP address: the IP address of the development board; run ifconfig on the board to obtain it.

    Port number: 8554 + <channel number> * 2; channel numbers generally start from 0 (-ch 0, -ch 1, ...). For example, channel 1 (-ch 1) uses port 8554 + 1*2 = 8556.

  • RTSP stream playback: the corresponding RTSP stream can be played with vlc or ffplay, and the data stream can be transmitted over UDP or TCP.

    1) RTP over UDP playback: ffplay -rtsp_transport udp rtsp://192.168.137.11:8554/testStream

    2) RTP over TCP playback: ffplay -rtsp_transport tcp rtsp://192.168.137.11:8554/testStream

    It is recommended to play over TCP (RTP over TCP) to avoid screen corruption caused by UDP packet loss.

3.2 ffmpeg

ffmpeg is placed in the /usr/local/bin directory.

  • ffmpeg: ffmpeg app.

Run ffmpeg

(1) Encoder libk510_h264 parameters

| Parameter name | Description | Default value | Value range |
| --- | --- | --- | --- |
| g | GOP size | 25 | 1~1000 |
| b | Bitrate | 4000000 | 1000~20000000 |
| r | Frame rate; since the ISP currently only supports 30 fps, the encoder should also be set to 30 | 30 | 30 |
| idr_freq | IDR frequency | -1 (no IDR) | -1~256 |
| qp | QP value used when encoding with constant QP | -1 (auto) | -1~100 |
| maxrate | Maximum bitrate | 0 | 20000000 |
| profile | Supported profiles | 2 (high) | 0: baseline, 1: main, 2: high |
| level | Encoding level | 42 | 10~42 |
| aratio | Display aspect ratio | 0 (auto) | 0: auto, 1: 1:1, 2: 4:3, 3: 16:9, 4: none |
| ch | Channel number | 0 | 0~7 |

(2) Encoder libk510_jpeg parameters

| Parameter name | Description | Default value | Value range |
| --- | --- | --- | --- |
| qp | QP value used when encoding with constant QP | 25 | -1~100 |
| r | Frame rate | 30 | 30 |
| ch | Encode channel | 0 | 0~7 |
| maxrate | Maximum bitrate (0 = ignore) | 4000000 | 0~20000000 |
| aratio | Aspect ratio | 0 (auto) | 0: auto, 1: 4:3, 2: 16:9, 3: none |

(3) alsa device parameters

| Parameter name | Description | Default value | Value range |
| --- | --- | --- | --- |
| ac | Number of audio channels | 2 | 2 |
| ar | Audio sample rate | 48000 | up to 48000 |
| i | Audio device | hw:0 | hw:0 |

(4) audio3a parameters

| Parameter name | Description | Default value | Value range |
| --- | --- | --- | --- |
| sample_rate | Audio sample rate | 16000 | 1~65535 |
| agc | Audio gain mode | 3 (AgcModeFixedDigital) | 0: AgcModeUnchanged, 1: AgcModeAdaptiveAnalog, 2: AgcModeAdaptiveDigital, 3: AgcModeFixedDigital |
| ns | Noise suppression level | 3 (VeryHigh) | 0: Low, 1: Moderate, 2: High, 3: VeryHigh |
| dsp_task | Where audio3a runs | 1 (dsp) | 0: cpu, 1: dsp |

Configurable parameters can be viewed via the help command:

ffmpeg -h encoder=libk510_h264 # show the k510 encoder parameters
ffmpeg -h demuxer=v4l2 # show the demuxer configuration parameters
ffmpeg -h filter=audio3a # show the audio3a configuration parameters

The logical block diagram of ffmpeg is as follows:

ffmpeg_block_diagram

audio3a is used to perform 3a operations on the received audio and output it, and its logical block diagram is as follows:

ffmpeg_canaan_audio3a

3.2.1 Program Operation Instructions

3.2.1.1 rtp stream push

3.2.1.1.1. rtp push video stream

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -an -f rtp rtp://10.102.231.29:1234

Here 10.102.231.29 is the receiving address; change it according to your actual setup. Press "q" while the program is running to stop it.

ffplay receive command:

ffplay.exe -protocol_whitelist "file,udp,rtp" -i test.sdp -fflags nobuffer -analyzeduration 1000000 -flags low_delay

test.sdp is configured as follows:

SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 10.102.231.29
t=0 0
a=tool:libavformat 58.76.100
m=video 1234 RTP/AVP 96
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1

Description of the .sdp parameter:

  • c=: media connection information; IN: network type; IP4: address type; followed by the IP address (note that this is the IP address of the receiver, not the sender)
  • m=: start of a media-level session; video: media type; 1234: port number; RTP/AVP: transport protocol; 96: payload format in the RTP header. Modify the receiver IP address and port number according to the actual situation, and note that the RTP port number must be even.
3.2.1.1.2. rtp push audio stream

Example of an ffmpeg run command:

ffmpeg -f alsa -ac 2 -ar 32000 -i hw:0 -acodec aac -f rtp rtp://10.100.232.11:1234

Here 10.100.232.11 is the receiving address; change it according to your actual setup.

  • ac: Sets the number of audio channels
  • ar: Sets the audio sample rate

The ffplay receive command is the same as for receiving a video stream; see the following example for the SDP file.

SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 10.100.232.11
t=0 0
a=tool:libavformat 58.76.100
m=audio 1234 RTP/AVP 97
b=AS:128
a=rtpmap:97 MPEG4-GENERIC/32000/2
a=fmtp:97 profile-level-id=1;mode=AAC-hbr;sizelength=13;indexlength=3;indexdeltalength=3; config=129056E500
3.2.1.1.3 rtp push audio and video streams

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -an -f rtp rtp://10.100.232.11:1234 -f alsa -ac 2 -ar 32000 -i hw:0 -acodec aac -vn -f rtp rtp://10.100.232.11:1236

The ffplay receive command is the same as for receiving an audio stream; see the following example for the SDP file.

SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
t=0 0
a=tool:libavformat 58.76.100
m=video 1234 RTP/AVP 96
c=IN IP4 10.100.232.11
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1
m=audio 1236 RTP/AVP 97
c=IN IP4 10.100.232.11
b=AS:128
a=rtpmap:97 MPEG4-GENERIC/32000/2
a=fmtp:97 profile-level-id=1;mode=AAC-hbr;sizelength=13;indexlength=3;indexdeltalength=3; config=129056E500

3.2.1.2 rtsp push stream

Before pushing an RTSP stream, you need to deploy an RTSP server so the data stream can be pushed to it.

3.2.1.2.1 rtsp push video streams

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -acodec copy -f rtsp rtsp://10.100.232.11:5544/live/test110
  • idr_freq is the IDR frame interval and must be an integer multiple of the GOP; RTSP streams must generate IDR frames for the stream to be pulled.
  • rtsp://10.100.232.11:5544/live/test110 is the push/pull stream URL of the RTSP server.

Example of an ffplay pull command:

ffplay.exe -protocol_whitelist "file,udp,rtp,tcp" -i rtsp://10.100.232.11:5544/live/test110
3.2.1.2.2 rtsp push audio stream

Example of an ffmpeg run command:

ffmpeg -f alsa -ac 2 -ar 32000 -i hw:0 -acodec aac -f rtsp rtsp://10.100.232.11:5544/live/test110

The ffplay pull stream command is the same as the rtsp pull video stream command.

3.2.1.2.3 rtsp push audio video stream

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -f alsa -ac 2 -ar 32000 -i hw:0 -idr_freq 25 -vcodec libk510_h264 -acodec aac -f rtsp rtsp://10.100.232.11:5544/live/test110

The ffplay pull stream command is the same as the rtsp pull video stream command.

3.2.1.3 rtmp push stream

Before pushing an RTMP stream, you need to deploy an RTMP server so the data stream can be pushed to it. Servers that support the RTMP protocol include fms, nginx, srs, etc.

3.2.1.3.1 rtmp push video stream

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -f flv rtmp://10.100.232.11/live/1
  • rtmp://10.100.232.11/live/1 is the URL for pushing the stream to the rtmp server

Example of an ffplay pull command:

ffplay -fflags nobuffer rtmp://10.100.232.11/live/1
  • rtmp://10.100.232.11/live/1 is the URL for pulling the stream from the rtmp server (the push and pull URLs are the same); the -fflags nobuffer option avoids increased latency due to player buffering.
3.2.1.3.2 rtmp push audio stream

Example of an ffmpeg run command:

ffmpeg -f alsa -ac 2 -ar 32000 -i hw:0 -acodec aac -f flv rtmp://10.100.232.11/live/1
  • rtmp://10.100.232.11/live/1 is the URL for pushing the stream to the rtmp server

The ffplay pull stream command is the same as the rtmp pull video stream command.

3.2.1.3.3 rtmp push audio and video streams

Example of an ffmpeg run command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -f alsa -ac 2 -ar 32000 -i hw:0 -idr_freq 25 -vcodec libk510_h264 -acodec aac -f flv rtmp://10.100.232.11/live/1
  • rtmp://10.100.232.11/live/1 is the URL for pushing the stream to the rtmp server

The ffplay pull stream command is the same as the rtmp pull video stream command.

3.2.1.4 audio3a

3.2.1.4.1 Run audio3a separately

(1) Run audio3a on the CPU

Example of an ffmpeg run command:

ffmpeg -f alsa -ac 2 -ar 16000 -i hw:0 -af audio3a=sample_rate=16000:dsp_task=0 -f rtp rtp://10.100.232.11:1234

(2) Run audio3a on the DSP

Open two telnet windows and run the DSP task scheduler and ffmpeg in them (run the DSP task scheduler first). Example DSP task scheduler command:

cd /app/dsp_app_new/
./dsp_app /app/dsp_scheduler/scheduler.bin

Example of an ffmpeg run command:

ffmpeg -f alsa -ac 2 -ar 16000 -i hw:0 -af audio3a=sample_rate=16000 -f rtp rtp://10.100.232.11:1234
3.2.1.4.2 Run audio3a and video at the same time

(1) Run audio3a on the CPU

Open two telnet windows and run audio3a and the video command in them. Example of the video command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -an -f rtp rtp://10.100.232.11:1234

Example of the audio3a command:

ffmpeg -f alsa -ac 2 -ar 16000 -i hw:0 -af audio3a=sample_rate=16000:dsp_task=0 -acodec aac -vn -f rtp rtp://10.100.232.11:1236

Running audio3a and video on the CPU at the same time causes overflow, so it is recommended to run audio3a on the DSP.

(2) Run audio3a on the DSP

Open three telnet windows and run audio3a, the video command, and the DSP task scheduler in them. The DSP task scheduler command is the same as when running audio3a alone.

Example of the audio3a command:

ffmpeg -f alsa -ac 2 -ar 16000 -i hw:0 -af audio3a=sample_rate=16000 -f rtp rtp://10.100.232.11:1236

Example of the video command:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -an -f rtp rtp://10.100.232.11:1234
  • 10.100.232.11 is the IP address of the rtp receiver.
  • The contents of the SDP file of the receiving terminal ffplay can be obtained from the printed log after running the above ffmpeg command.

3.2.1.5 v4l2

Configurable parameters can be viewed via the help command:

ffmpeg -h demuxer=v4l2 # show the v4l2 configuration parameters

| Parameter name | Description | Default value | Value range |
| --- | --- | --- | --- |
| s | Image resolution, such as 1920x1080 | NULL | |
| r | Frame rate; currently only 30 fps is supported | 30 | 30 |
| isp | Enable the K510 ISP hardware | 0 | 0-1 |
| buf_type | v4l2 buffer type. 1: V4L2_MEMORY_MMAP (for -vcodec copy); 2: V4L2_MEMORY_USERPTR (for -vcodec libk510_h264) | 1 | 1~2 |
| conf | v4l2 config file | NULL | |

Example ffmpeg run commands (10.100.232.11 is the receiving address; change it according to your actual setup):

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_h264 -an -f rtp rtp://10.100.232.11:1234 -f alsa -ac 2 -ar 16000 -i hw:0 -acodec aac -vn -f rtp rtp://10.100.232.11:1236
ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -i /dev/video3 -vcodec copy -y out.yuv

Notes:

  1. At runtime, the configuration files video_sample.conf, imx219_0.conf and imx219_1.conf must be present in the run directory; the three files are located under the /encode_app/ directory.
  2. The real-time video from the camera is written to a YUV file; because YUV files are very large, the local DDR or NFS write speed may not keep up, which can cause dropped frames.

3.2.1.6 JPEG encoding

File Output:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_jpeg -y test.mjpeg

Description: at runtime, the configuration files video_sample.conf, imx219_0.conf and imx219_1.conf must be present in the run directory; the three files are located under the /encode_app/ directory.

The output file test.mjpeg can be played on the PC side with ffplay

ffplay -i test.mjpeg

Push Stream:

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -vcodec libk510_jpeg -an -f rtp rtp://10.100.232.11:1234

The stream can be pulled with ffplay (same as the rtp pull commands above).

3.2.1.7 Multi-channel encoding

Up to 8 channels can be encoded simultaneously. The sum over all channels of frame size multiplied by frame rate must not exceed the data volume of 1080p60. -vcodec can be either h264 or jpeg.

ffmpeg -f v4l2 -s 1920x1080 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -filter_complex 'split=2[out1][out2]' -map '[out1]' -vcodec libk510_h264 -ch 0 -an -f rtp rtp://10.20.1.101:1234 -map '[out2]' -vcodec libk510_h264 -ch 1 -an -f rtp rtp://10.20.1.101:2236
ffmpeg -f v4l2 -s 480x360 -conf "video_sample.conf" -isp 1 -buf_type 2 -r 30 -i /dev/video3 -filter_complex 'split=8[out1][out2][out3][out4][out5][out6][out7][out8]' -map '[out1]' -vcodec libk510_h264 -b:v 300000 -ch 0 -an -f rtp rtp://10.20.1.101:1234 -map '[out2]' -vcodec libk510_h264 -b:v 300000 -ch 1 -an -f rtp rtp://10.20.1.101:2322 -map '[out3]' -vcodec libk510_h264 -b:v 300000 -ch 2 -an -f rtp rtp://10.20.1.101:3086 -map '[out4]' -vcodec libk510_h264 -b:v 300000 -ch 3 -an -f rtp rtp://10.20.1.101:4234 -map '[out5]' -vcodec libk510_h264 -b:v 300000 -ch 4 -an -f rtp rtp://10.20.1.101:5216 -map '[out6]' -vcodec libk510_h264 -b:v 300000 -ch 5 -an -f rtp rtp://10.20.1.101:6788 -map '[out7]' -vcodec libk510_h264 -b:v 300000 -ch 6 -an -f rtp rtp://10.20.1.101:7230 -map '[out8]' -vcodec libk510_h264 -b:v 300000 -ch 7 -an -f rtp rtp://10.20.1.101:8976

When using ffplay to pull streams, note that only one video stream is pulled at a time; switch to another channel by changing the port number in the SDP file, or start multiple ffplay instances.

3.2.2 Program Porting Instructions

ffmpeg is ported from the open-source ffmpeg version 4.4, with xxx.patch added as the patch set:

  • ff_libk510_h264_encoder: controls the H.264 hardware encoder; references libvenc.so
  • ff_libk510_jpeg_encoder: controls the JPEG hardware encoder; references libvenc.so
  • v4l2: k510 hardware-related code was added in v4l2.c, along with the v4l2 buffer type V4L2_MEMORY_USERPTR; references libmediactl.so

3.2.2.1 patch generation command

(1)

quilt new -p ab xxx.patch # generate xxx.patch in the patches directory
quilt add <filename> # add the file to be modified (before modification)
### modify the code ###
quilt refresh # the modifications are added to xxx.patch

(2) Copy xxx.patch to the package/ffmpeg_canaan directory and modify the file path in the patch file according to the current path.

mv ../../patches/xxx.patch ../../package/ffmpeg_canaan
rm ../../patches/series
sed -i "s/\/dl\/ffmpeg_canaan\/ffmpeg-4.4//g" ../../package/ffmpeg_canaan/xxx.patch

3.2.2.2 ffmpeg configuration

In the package/ffmpeg_canaan/ffmpeg.mk file you can modify the CPU core and the compilation toolchain, and enable ff_k510_video_demuxer, ff_libk510_jpeg_encoder and ff_libk510_h264_encoder through the configure options:

./configure \
    --cross-prefix=riscv64-linux- \
    --enable-cross-compile \
    --target-os=linux \
    --cc=riscv64-linux-gcc \
    --arch=riscv64 \
    --extra-ldflags="-L./" \
    --extra-ldflags="-ldl" \
    --extra-ldflags="-Wl,-rpath ." \
    --enable-static \
    --enable-libk510_video \
    --enable-libk510_h264 \
    --enable-libk510_jpeg \
    --enable-alsa \
    --disable-autodetect \
    --disable-ffplay \
    --disable-ffprobe \
    --disable-doc \
    --enable-audio3a \
    --enable-indev=v4l2 \

Translation Disclaimer
For the convenience of customers, Canaan uses an AI translator to translate text into multiple languages, which may contain errors. We do not guarantee the accuracy, reliability or timeliness of the translations provided. Canaan shall not be liable for any loss or damage caused by reliance on the accuracy or reliability of the translated information. If there is a content difference between the translations in different languages, the Chinese Simplified version shall prevail.

If you would like to report a translation error or inaccuracy, please feel free to contact us by mail.