Video Codec

Note

The commands and test results in this chapter are based on AIoT Yocto v22.0 and the MT8365 P1V3 (i350-EVK) board.

Video Processing Overview

On AIoT Yocto, video encoder, decoder, and format conversion hardware provide the V4L2 interface to userspace programs. GStreamer is integrated to provide wrapper plugins over the V4L2 interface and to assist in setting up video processing pipelines.

Example: Video Playback Using GStreamer

The following example uses the GStreamer v4l2h264dec plug-in for hardware-accelerated video decoding. The v4l2convert plug-in is mandatory in this pipeline because the decoder outputs a proprietary buffer format, as explained in the sections below.

gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! v4l2h264dec ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! waylandsink

Note

The V4L2 video decoder assumes that each bitstream buffer contains one complete frame. The default input bitstream buffer size in GStreamer is 2 MB. If a frame does not fit in 2 MB, which can happen with high-bitrate video (e.g. 4K at 60 Mbps), playback issues may occur. Applications should handle the buffer allocation themselves.
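As a rough worked example of the buffer-size concern, the sketch below estimates the average compressed frame size from the bitrate and frame rate, then checks an assumed peak frame against the buffer size. The function names, the 30 FPS frame rate, and the peak-to-average factor of 10 are illustrative assumptions, not values from the driver or from GStreamer; 2 MB is taken as 2 × 1024 × 1024 bytes.

```shell
#!/bin/sh
# Average compressed frame size in bytes for a bitrate (bit/s) and frame rate.
avg_frame_bytes() {
    bitrate_bps=$1
    fps=$2
    echo $(( bitrate_bps / 8 / fps ))
}

# Report whether an assumed peak frame fits in a bitstream buffer.
# peak_ratio is a hypothetical "largest frame / average frame" factor.
frame_fits() {
    bitrate_bps=$1
    fps=$2
    peak_ratio=$3
    buf_bytes=$4
    peak=$(( $(avg_frame_bytes "$bitrate_bps" "$fps") * peak_ratio ))
    if [ "$peak" -le "$buf_bytes" ]; then
        echo "fits: peak=$peak buffer=$buf_bytes"
    else
        echo "too small: peak=$peak buffer=$buf_bytes"
    fi
}

avg_frame_bytes 60000000 30          # 250000
frame_fits 60000000 30 10 2097152    # too small: peak=2500000 buffer=2097152
```

The average frame is far below 2 MB, so the risk comes from large key frames; how large they get depends on the encoder settings of the stream.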

The Colorimetry Issue of v4l2convert

Sometimes the GStreamer decoding pipeline fails because the “colorimetry” of the stream is not supported.

gst-launch-1.0 -v filesrc location=/mnt/out-320x240-nv12.avi ! parsebin ! v4l2h264dec ! v4l2convert ! waylandsink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstTypeFindElement:typefind.GstPad:src: caps = video/x-msvideo
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstTypeFindElement:typefind.GstPad:src: caps = NULL
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstH264Parse:h264parse0.GstPad:sink: caps = video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstH264Parse:h264parse0.GstPad:src: caps = video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240, chroma-format=(string)4:2:0, bit-depth-luma=(uint)8, bit-depth-chroma=(uint)8, colorimetry=(string)2:4:16:3, parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)baseline, level=(string)1.3
ERROR: from element /GstPipeline:pipeline0/GstParseBin:parsebin0/GstAviDemux:avidemux0: Internal data stream error.
Additional debug info:
../gst-plugins-good-1.20.3/gst/avi/gstavidemux.c(5798): gst_avi_demux_loop (): /GstPipeline:pipeline0/GstParseBin:parsebin0/GstAviDemux:avidemux0:
streaming stopped, reason not-negotiated (-4)
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
Freeing pipeline ...

This is a known issue with the GStreamer v4l2convert element regarding colorimetry. The element reduces the colorimetry it accepts to a set of “well-known” color spaces, so when a stream uses a value outside that set, such as 2:4:16:3 here (reduced range, BT601, BT601, BT470BG), caps negotiation fails.
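To make the numeric colorimetry string easier to read, the following sketch splits it into its four fields (range, matrix, transfer, primaries) and names a few values. The function name is hypothetical, and the number-to-name mapping reflects the GstVideo enums as understood for GStreamer 1.20; treat it as an assumption and verify it against the gst-plugins-base headers for your version.

```shell
#!/bin/sh
# Decode a numeric GStreamer colorimetry string "range:matrix:transfer:primaries".
# Only a few enum values are named here; anything else prints as "other(N)".
decode_colorimetry() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the argument on ':' into $1..$4
    IFS=$old_ifs

    case $1 in
        1) range="0-255" ;;
        2) range="16-235" ;;
        *) range="other($1)" ;;
    esac
    case $2 in
        3) matrix="bt709" ;;
        4) matrix="bt601" ;;
        *) matrix="other($2)" ;;
    esac
    case $3 in
        5)  transfer="bt709" ;;
        16) transfer="bt601" ;;
        *)  transfer="other($3)" ;;
    esac
    case $4 in
        1) primaries="bt709" ;;
        3) primaries="bt470bg" ;;
        4) primaries="smpte170m" ;;
        *) primaries="other($4)" ;;
    esac
    echo "range=$range matrix=$matrix transfer=$transfer primaries=$primaries"
}

decode_colorimetry 2:4:16:3
# range=16-235 matrix=bt601 transfer=bt601 primaries=bt470bg
```

Under this mapping, the failing stream differs from the named bt601 preset (understood to be 2:4:16:4) only in its primaries field, which is consistent with the negotiation failure described above.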

As a workaround, use the capssetter element to replace the caps and override the colorimetry with a supported value such as bt601:

gst-launch-1.0 -v filesrc location=/mnt/out-320x240-nv12.avi ! parsebin ! capssetter replace=true caps="video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240, chroma-format=(string)4:2:0, bit-depth-luma=(uint)8, bit-depth-chroma=(uint)8, colorimetry=(string)bt601, parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)constrained-baseline, level=(string)1.3" ! v4l2h264dec ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! waylandsink

Example: Video Encoding Using GStreamer

The following example uses the GStreamer v4l2h264enc plug-in for hardware-accelerated video encoding.

gst-launch-1.0 -v videotestsrc num-buffers=300 ! queue ! video/x-raw,framerate=30/1,width=1920,height=1080,format=NV12 ! v4l2h264enc ! h264parse ! mp4mux ! filesink location=out_1920x1080.mp4

Note

The V4L2 video encoder assumes that the output bitstream buffer is large enough to hold one complete encoded frame. The encoder returns an error when the buffer is full. Applications should handle the buffer allocation themselves.

The GStreamer framework provides both software-based and V4L2 hardware-accelerated video processing. To list the V4L2 video codecs available in GStreamer, use the following command:

gst-inspect-1.0 | grep 'v4l2'
video4linux2:  v4l2src: Video (video4linux2) Source
video4linux2:  v4l2sink: Video (video4linux2) Sink
video4linux2:  v4l2radio: Radio (video4linux2) Tuner
video4linux2:  v4l2deviceprovider (GstDeviceProviderFactory)
video4linux2:  v4l2convert: V4L2 Video Converter
video4linux2:  v4l2mpeg4dec: V4L2 MPEG4 Decoder
video4linux2:  v4l2video0mpeg4dec: V4L2 MPEG4 Decoder
video4linux2:  v4l2h264dec: V4L2 H264 Decoder
video4linux2:  v4l2h265dec: V4L2 H265 Decoder
video4linux2:  v4l2vp8dec: V4L2 VP8 Decoder
video4linux2:  v4l2vp9dec: V4L2 VP9 Decoder
video4linux2:  v4l2h264enc: V4L2 H.264 Encoder
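When only the codec elements are of interest, the listing can be narrowed with a small filter. The sketch below reads lines in the format shown above and prints just the element names that end in dec or enc; the function name is hypothetical.

```shell
#!/bin/sh
# Read "gst-inspect-1.0" listing lines on stdin and print the names of the
# V4L2 decoder and encoder elements, one per line.
list_v4l2_codecs() {
    awk -F': *' '$1 == "video4linux2" && $2 ~ /^v4l2.*(dec|enc)$/ { print $2 }'
}

# Usage on the board:
#   gst-inspect-1.0 | list_v4l2_codecs
```

Entries such as v4l2src, v4l2convert, and the device provider are skipped because their names do not end in dec or enc.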

Video Codec Devices and V4L2 Interface

The hardware video decoder and encoder support the V4L2 API in AIoT Yocto. To list the V4L2 devices from the console, run the following command:

ls -l /sys/class/video4linux/
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video0 -> ../../devices/platform/soc/16000000.codec/video4linux/video0
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video1 -> ../../devices/platform/soc/17020000.codec/video4linux/video1
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video2 -> ../../devices/platform/soc/14004000.mdp_rdma0/video4linux/video2

Another utility for enumerating the V4L2 devices is v4l2-sysfs-path:

v4l2-sysfs-path
Video device: video2
Video device: video0
Video device: video1
Alsa playback device(s): hw:0,0 hw:0,1

You can also use v4l2-dbg -D -d <device#> to query information about each V4L2 video device, for example:

v4l2-dbg -D -d 0
Driver info:
        Driver name   : mtk-vcodec-dec
        Card type     : platform:mt8167
        Bus info      : platform:mt8167
        Driver version: 5.10.73
        Capabilities  : 0x84204000
                Video Memory-to-Memory Multiplanar
                Streaming
                Extended Pix Format
                Device Capabilities
v4l2-dbg -D -d 1
Driver info:
        Driver name   : mtk-vcodec-enc
        Card type     : platform:mt8167
        Bus info      : platform:mt8167
        Driver version: 5.10.73
        Capabilities  : 0x84204000
                Video Memory-to-Memory Multiplanar
                Streaming
                Extended Pix Format
                Device Capabilities
v4l2-dbg -D -d 2
Driver info:
        Driver name   : mtk-mdp
        Card type     : 14004000.mdp_rdma0
        Bus info      : platform:mt8173
        Driver version: 5.10.73
        Capabilities  : 0x84204000
                Video Memory-to-Memory Multiplanar
                Streaming
                Extended Pix Format
                Device Capabilities

As shown in the example above, three device nodes are related to the video codec:

  1. Video Decoder (/dev/video0 and /sys/devices/platform/soc/16000000.codec/video4linux/video0)

  2. Video Encoder (/dev/video1 and /sys/devices/platform/soc/17020000.codec/video4linux/video1)

  3. MDP (/dev/video2 and /sys/devices/platform/soc/14004000.mdp_rdma0/video4linux/video2)

All three devices are M2M (memory-to-memory) devices.
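Device numbering can change between boots or BSP versions, so a script may prefer to look up nodes by name rather than hard-code /dev/video0. The sketch below prints each node together with the contents of its sysfs name attribute. The function name is hypothetical, and the exact strings reported by the name attribute on this board are not shown in this chapter, so check them on your target before matching on them.

```shell
#!/bin/sh
# Print "<node> <name>" for every entry under a video4linux sysfs class directory.
# The directory defaults to /sys/class/video4linux; pass another path for testing.
list_video_nodes() {
    class_dir=${1:-/sys/class/video4linux}
    for dev in "$class_dir"/video*; do
        [ -r "$dev/name" ] || continue
        printf '/dev/%s %s\n' "$(basename "$dev")" "$(cat "$dev/name")"
    done
}

# Usage on the board:
#   list_video_nodes
```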

The userspace clients should access these devices through the V4L2 userspace API. AIoT Yocto integrates the GStreamer framework, which provides V4L2 plugins for evaluation and application development.

Note

The video decoder device cannot decode into YUYV or NV12 formats directly. It can only decode the bitstream into a proprietary format. Please refer to the sections below to convert the proprietary format to the buffer format you require.

Output Format of Video Decoder

Note that the output buffer format of the video decoder device is proprietary. This can be observed with the following command:

v4l2-ctl --list-formats -d 0
ioctl: VIDIOC_ENUM_FMT
    Type: Video Capture Multiplanar

    [0]: 'MT21' (Mediatek Compressed Format, compressed)
    [1]: 'MM21' (Mediatek block Format, compressed)

To see other information, such as the accepted bitstream format, add the --all option:

v4l2-ctl --all -d 0
Driver Info:
        Driver name      : mtk-vcodec-dec
        Card type        : platform:mt8167
        Bus info         : platform:mt8167
        Driver version   : 5.10.73
        Capabilities     : 0x84204000
                Video Memory-to-Memory Multiplanar
                Streaming
                Extended Pix Format
                Device Capabilities
        Device Caps      : 0x04204000
                Video Memory-to-Memory Multiplanar
                Streaming
                Extended Pix Format
Priority: 2
Format Video Capture Multiplanar:
        Width/Height      : 64/64
        Pixel Format      : 'MT21' (Mediatek Compressed Format)
        Field             : None
        Number of planes  : 2
        Flags             :
        Colorspace        : Rec. 709
        Transfer Function : Default
        YCbCr/HSV Encoding: Default
        Quantization      : Default
        Plane 0           :
        Bytes per Line : 64
        Size Image     : 4096
        Plane 1           :
        Bytes per Line : 64
        Size Image     : 2048
Format Video Output Multiplanar:
        Width/Height      : 64/64
        Pixel Format      : 'H264' (H.264)
        Field             : None
        Number of planes  : 1
        Flags             :
        Colorspace        : Rec. 709
        Transfer Function : Default
        YCbCr/HSV Encoding: Default
        Quantization      : Default
        Plane 0           :
        Bytes per Line : 0
        Size Image     : 1048576
Selection Video Capture: compose, Left 0, Top 0, Width 64, Height 64, Flags:
Selection Video Capture: compose_default, Left 0, Top 0, Width 64, Height 64, Flags:
Selection Video Capture: compose_bounds, Left 0, Top 0, Width 64, Height 64, Flags:

User Controls

min_number_of_capture_buffers 0x00980927 (int)    : min=0 max=32 step=1 default=1 value=0 flags=read-only, volatile

Note

Note that the term Format Video Capture refers to the format of a capture device, which produces buffers. Conversely, the term Format Video Output refers to the format of a video output device, which takes buffers as input.

Therefore, for an M2M device like the decoder,

  • the Video Output format is the input buffer format of the decoder device.

  • the Video Capture format is the output buffer format of the decoder device.
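The naming can be made explicit by relabelling the two queues when reading the v4l2-ctl --all output. The following sketch, with a hypothetical function name, extracts the pixel format of each queue and prints it with its role for an M2M device.

```shell
#!/bin/sh
# Read "v4l2-ctl --all" output on stdin and print the pixel format of each
# queue, labelled with its role for a memory-to-memory device.
m2m_queue_formats() {
    awk '
        /^Format Video Capture/ { role = "CAPTURE queue (device output)" }
        /^Format Video Output/  { role = "OUTPUT queue (device input)" }
        /Pixel Format/ && role != "" {
            split($0, q, "\047")    # q[2] is the FourCC between single quotes
            print role ": " q[2]
            role = ""
        }
    '
}

# Usage on the board:
#   v4l2-ctl --all -d 0 | m2m_queue_formats
```

For the decoder output shown above, this would report MT21 for the CAPTURE queue and H264 for the OUTPUT queue.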

MDP and Format Conversion

The proprietary MT21 and MM21 formats cannot be handled by software converters; the buffers must be passed to the MDP device. Therefore, a video playback pipeline always consists of the video decoder hardware and the MDP hardware.

The MDP device can also resize video frames and convert buffer pixel formats. The supported formats can be listed with the v4l2-ctl command:

v4l2-ctl --list-formats -d 2
ioctl: VIDIOC_ENUM_FMT
    Type: Video Capture Multiplanar

    [0]: 'NM12' (Y/CbCr 4:2:0 (N-C))
    [1]: 'NV12' (Y/CbCr 4:2:0)
    [2]: 'NM21' (Y/CrCb 4:2:0 (N-C))
    [3]: 'NV21' (Y/CrCb 4:2:0)
    [4]: 'YM21' (Planar YVU 4:2:0 (N-C))
    [5]: 'YM12' (Planar YUV 4:2:0 (N-C))
    [6]: 'YV12' (Planar YVU 4:2:0)
    [7]: 'YU12' (Planar YUV 4:2:0)
    [8]: '422P' (Planar YUV 4:2:2)
    [9]: 'NV16' (Y/CbCr 4:2:2)
    [10]: 'NM16' (Y/CbCr 4:2:2 (N-C))
    [11]: 'YUYV' (YUYV 4:2:2)
    [12]: 'UYVY' (UYVY 4:2:2)
    [13]: 'YVYU' (YVYU 4:2:2)
    [14]: 'VYUY' (VYUY 4:2:2)
    [15]: 'BA24' (32-bit ARGB 8-8-8-8)
    [16]: 'AR24' (32-bit BGRA 8-8-8-8)
    [17]: 'BX24' (32-bit XRGB 8-8-8-8)
    [18]: 'XR24' (32-bit BGRX 8-8-8-8)
    [19]: 'RGBP' (16-bit RGB 5-6-5)
    [20]: 'RGB3' (24-bit RGB 8-8-8)
    [21]: 'BGR3' (24-bit BGR 8-8-8)

The v4l2convert plug-in in the GStreamer framework conveniently wraps the format conversion and resizing capabilities of the MDP.

VPUD Daemon

Although the video devices are accessible through V4L2 interfaces, the kernel drivers of the video processing hardware delegate most of the hardware configuration logic to a userspace daemon:

  • vpud serves the video encoder and decoder drivers.

The daemon does not provide interfaces to other userspace clients; it works only with the kernel driver. All video processing functionality should be accessed through the V4L2 interface on AIoT Yocto.

Therefore, the video processing drivers stop working if the vpud process has not been initialized or has been stopped.

On AIoT Yocto, the daemon is launched during the system boot process.
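When a V4L2 pipeline fails unexpectedly, checking that the daemon is alive is a reasonable first step. The sketch below scans the process list for a given command name. The function name is hypothetical, and it assumes the daemon's command name is exactly vpud.

```shell
#!/bin/sh
# Return 0 if a process whose command name equals $1 exists.
# The proc directory defaults to /proc; the second argument is for testing only.
process_running() {
    wanted=$1
    proc_dir=${2:-/proc}
    for comm in "$proc_dir"/[0-9]*/comm; do
        [ -r "$comm" ] || continue
        if [ "$(cat "$comm" 2>/dev/null)" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

if process_running vpud; then
    echo "vpud is running"
else
    echo "vpud is not running; the V4L2 codec devices will not work"
fi
```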

Software vs. Hardware Decoder and Converter

AIoT Yocto provides VCODEC and MDP, hardware components that accelerate the video pipeline. You can still process video with software components, but performance may be poor because it is limited by the CPU. This section provides a software sample and a hardware sample so that you can compare their frame rates. The scenario is to decode a 720p, 30 FPS video, scale it to 1080p, and display it on the screen. fpsdisplaysink is used to measure the frame rate.

To use the software method:

gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! avdec_h264 ! video/x-raw,width=1280,height=720 ! \
videoscale ! video/x-raw,width=1920,height=1080 ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstTextOverlay:fps-display-text-overlay: text = rendered: 181, dropped: 0, current: 18.74, average: 18.92
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 181, dropped: 0, current: 18.74, average: 18.92

To use the hardware method:

gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! v4l2h264dec ! video/x-raw,width=1280,height=720 ! \
v4l2convert output-io-mode=5 ! video/x-raw,width=1920,height=1080 ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstTextOverlay:fps-display-text-overlay: text = rendered: 268, dropped: 1, current: 27.95, average: 27.80
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 268, dropped: 1, current: 27.95, average: 27.80

The average frame rate of the software method is about 18.92 FPS, while that of the hardware method is about 27.80 FPS. The heavier the workload, the larger the difference.
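To compare runs without reading the log by eye, the average can be pulled out of the last-message lines and the two results divided. This is a minimal sketch with hypothetical function names; it relies only on the log format shown above.

```shell
#!/bin/sh
# Read "gst-launch-1.0 -v" output on stdin and print the last average FPS
# reported by fpsdisplaysink.
last_average_fps() {
    awk '/last-message = rendered:/ {
             n = split($0, f, "average: ")
             avg = f[n] + 0
             found = 1
         }
         END { if (found) printf "%.2f\n", avg }'
}

# Print the hardware/software frame-rate ratio with two decimals.
fps_ratio() {
    awk -v sw="$1" -v hw="$2" 'BEGIN { printf "%.2f\n", hw / sw }'
}

fps_ratio 18.92 27.80    # 1.47
```

With the figures measured above, the hardware pipeline delivers roughly 1.47 times the frame rate of the software pipeline.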

Note

When using fpsdisplaysink to check performance, add ‘text-overlay=false’ to prevent the FPS information from being drawn on the display. Drawing the overlay can consume considerable CPU time.

Video Encoder Extra-Controls

As a V4L2 video encoder, mtk-vcodec-enc also provides extra-controls for configuring the encoder.

extra-controls of mtk-vcodec-enc

CID                                  Command (String)      Value       Default Value  Note
-----------------------------------  --------------------  ----------  -------------  -----------------------------------------------
V4L2_CID_MPEG_VIDEO_BITRATE          video_bitrate         1~20000000  20000000
V4L2_CID_MPEG_VIDEO_GOP_SIZE         video_gop_size        0~65535     0              Size 0 means I-VOP only
V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAME  force_key_frame       0~0         0              Forces an I-VOP on the next output frame
V4L2_CID_MPEG_VIDEO_HEADER_MODE      sequence_header_mode  0~1         1              0: separate mode, 1: joined-with-1st-frame mode
V4L2_CID_MPEG_VIDEO_H264_PROFILE     h264_profile          0, 2, 4     4              0: BASELINE, 2: MAIN, 4: HIGH
V4L2_CID_MPEG_VIDEO_H264_LEVEL       h264_level            0, 2~13     11             Supports LEVEL_1_0~LEVEL_4_2, excluding LEVEL_1B
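The numeric h264_level values follow the order of the V4L2 H.264 level enumeration, in which value 1 is LEVEL_1B and is therefore skipped. The sketch below spells out that mapping for the supported range; the function name is hypothetical, and the mapping is given as understood from the V4L2 enum order, so verify it against your kernel headers.

```shell
#!/bin/sh
# Map an H.264 level string to the numeric h264_level control value.
# Levels outside 1.0~4.2, and level 1b, are rejected.
h264_level_value() {
    case $1 in
        1.0) echo 0 ;;
        1.1) echo 2 ;;
        1.2) echo 3 ;;
        1.3) echo 4 ;;
        2.0) echo 5 ;;
        2.1) echo 6 ;;
        2.2) echo 7 ;;
        3.0) echo 8 ;;
        3.1) echo 9 ;;
        3.2) echo 10 ;;
        4.0) echo 11 ;;
        4.1) echo 12 ;;
        4.2) echo 13 ;;
        *)   echo "unsupported level: $1" >&2; return 1 ;;
    esac
}

h264_level_value 4.0    # 11
```

Under this mapping, the default value 11 corresponds to level 4.0.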

Note

GStreamer does not fully support the video header mode V4L2_MPEG_VIDEO_HEADER_MODE_SEPARATE.

For example, to encode an H.264 bitstream with main profile, level 4.1, and a 512 kbps bitrate:

gst-launch-1.0 -v videotestsrc num-buffers=300 ! "video/x-raw,format=NV12, width=720, height=480, framerate=30/1"  ! v4l2h264enc extra-controls="cid,video_gop_size=30,video_bitrate=512000,sequence_header_mode=1" ! "video/x-h264,level=(string)4.1,profile=main" ! h264parse ! mp4mux ! filesink location=/tmp/test-h264.mp4
...
Execution ended after 0:00:01.554987154
Setting pipeline to NULL ...
Freeing pipeline ...

Note

To change the profile and level, set them through the GStreamer caps. If they are set directly through extra-controls, they are overridden during caps negotiation.
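A script that generates encoder pipelines can validate values against the documented ranges before building the extra-controls string. The sketch below covers only the bitrate and GOP size, and leaves profile and level to the caps filter as recommended above. The function name is hypothetical, and the ranges are copied from the extra-controls table in this chapter.

```shell
#!/bin/sh
# Build an extra-controls value for v4l2h264enc from a bitrate (bit/s) and a
# GOP size, rejecting values outside the ranges listed in the table.
build_extra_controls() {
    bitrate=$1
    gop=$2
    if [ "$bitrate" -lt 1 ] || [ "$bitrate" -gt 20000000 ]; then
        echo "video_bitrate out of range: $bitrate" >&2
        return 1
    fi
    if [ "$gop" -lt 0 ] || [ "$gop" -gt 65535 ]; then
        echo "video_gop_size out of range: $gop" >&2
        return 1
    fi
    echo "cid,video_bitrate=$bitrate,video_gop_size=$gop"
}

build_extra_controls 512000 30
# cid,video_bitrate=512000,video_gop_size=30
```

The resulting string can be passed as extra-controls="..." to v4l2h264enc, with the profile and level still selected by a caps filter after the encoder.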