Video Codec
Note
Command operations and test results presented in this chapter are based on the IoT Yocto v22.0 and Genio 350-EVK.
Video Processing Overview
On IoT Yocto, video encoder, decoder, and format conversion hardware provide the V4L2 interface to userspace programs. GStreamer is integrated to provide wrapper plugins over the V4L2 interface and to assist in setting up video processing pipelines.
Example: Video Playback Using GStreamer
The following examples use the GStreamer v4l2h264dec plug-in for hardware-accelerated video decoding.
The v4l2convert plug-in is mandatory; the reason is explained in the sections below.
gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! v4l2h264dec ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! waylandsink
Note
The V4L2 video decoder assumes that each bitstream buffer contains the data of one complete frame. The default GStreamer input bitstream buffer size is 2 MB, so playback issues may occur with high-bitrate video (e.g. 4K 60 Mbps) if 2 MB cannot hold a whole frame. Applications should handle the buffer allocation themselves.
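As a rough illustration of such application-side sizing (a hypothetical helper, not part of IoT Yocto or GStreamer), an input-buffer size can be estimated from the stream bitrate and framerate, with a generous margin for large key frames:

```python
def estimate_input_buffer_size(bitrate_bps, framerate, iframe_factor=8):
    """Estimate a per-frame bitstream buffer size in bytes.

    The average frame occupies bitrate / framerate bits; key frames can be
    many times larger, so a multiplier (8x here, an assumption) is applied.
    """
    avg_frame_bytes = bitrate_bps / framerate / 8
    return int(avg_frame_bytes * iframe_factor)

# 4K 60 Mbps at 60 fps: the average frame is 125 KB, so an 8x key frame
# (1 MB) still fits in the default 2 MB buffer; at 30 fps the same bitrate
# doubles the frame size and the margin shrinks to nothing.
print(estimate_input_buffer_size(60_000_000, 60))  # 1000000
print(estimate_input_buffer_size(60_000_000, 30))  # 2000000
```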
Example: Multi-Display Video Playback Using GStreamer
In dual- and triple-display systems, the displays act as one contiguous video plane. You can use glvideomixer to play multiple videos on different sections of this "video plane".
For example, with HDMI + DP, we can play two videos, one on each display, as follows:
Check supported resolutions:
For HDMI:
cat /sys/class/drm/card0-HDMI-A-1/modes
For DP:
cat /sys/class/drm/card0-DP-1/modes
If two 4K monitors are connected, the first value in the supported modes will be 3840x2160.
For this case, we can use glvideomixer as follows:
gst-launch-1.0 -v \
glvideomixer name=mix background=0 \
sink_1::xpos=0 sink_1::ypos=0 sink_1::width=3840 sink_1::height=2160 \
sink_2::xpos=3840 sink_2::ypos=0 sink_2::width=3840 sink_2::height=2160 \
! queue ! fpsdisplaysink "video-sink=glimagesink rotate-method=0 render-rectangle=<0,0,7680,2160>" text-overlay=false \
filesrc location=4k30_1.mp4 \
! queue ! parsebin ! queue ! v4l2h264dec capture-io-mode=dmabuf ! queue ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! video/x-raw,width=3840,height=2160,format=BGRA,pixel-aspect-ratio=1 \
! queue ! mix.sink_1 \
filesrc location=4k30_2.mp4 \
! queue ! parsebin ! queue ! v4l2h264dec capture-io-mode=dmabuf ! queue ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! video/x-raw,width=3840,height=2160,format=BGRA,pixel-aspect-ratio=1 \
! queue ! mix.sink_2
Here, sink_1::xpos and sink_1::ypos are the starting coordinates of the first video. These are (0,0) unless you want the first video displayed at an offset.
sink_1::width and sink_1::height are the window size of the first video. To display a 4K video on the first 4K monitor, set these to 3840x2160.
sink_2::xpos and sink_2::ypos are the starting coordinates of the second video. We set an xpos offset of 3840 so that the second video stacks horizontally next to the first.
sink_2::width and sink_2::height are the window size of the second video.
glimagesink renders video frames to a drawable on a local or remote display using OpenGL, and supports the memory::GLMemory memory type.
rotate-method and render-rectangle are the rotation and resizing options of glimagesink.
The size of the entire window, which is the input to glimagesink, is 7680x2160 for two 4K videos.
We then provide two 4K MP4 video files as sources to the sinks.
A third sink can be stacked in the same way for a triple-display configuration.
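The per-sink placement properties follow a simple pattern; a small hypothetical helper (not part of GStreamer) can generate them for any number of horizontally stacked displays:

```python
def mixer_sink_props(num_displays, width=3840, height=2160):
    """Generate glvideomixer sink placement properties for displays
    stacked left to right on one contiguous video plane."""
    props = []
    for i in range(1, num_displays + 1):
        # Each sink is offset horizontally by one full display width.
        props.append(
            f"sink_{i}::xpos={(i - 1) * width} sink_{i}::ypos=0 "
            f"sink_{i}::width={width} sink_{i}::height={height}"
        )
    return props

# Two 4K displays: the second sink starts at x=3840, and the full
# render rectangle spans 7680x2160.
for line in mixer_sink_props(2):
    print(line)
```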
Note
The concept of displaying multiple videos with glvideomixer is not limited to multi-display systems.
On a single 1280x720 display, for example, two 640x720 videos can be shown side by side.
Note
For improved performance, the input frames of glvideomixer should be converted to an RGB-based format to optimize GPU utilization.
Additionally, do not change the memory type from memory::GLMemory between glvideomixer and glimagesink; otherwise, extra video texture download and upload operations are involved.
Example: Video Encoding Using GStreamer
The following examples use the GStreamer v4l2h264enc plug-in for hardware-accelerated video encoding.
gst-launch-1.0 -v videotestsrc num-buffers=300 ! queue ! video/x-raw,framerate=30/1,width=1920,height=1080,format=NV12 ! v4l2h264enc ! h264parse ! mp4mux ! filesink location=out_1920x1080.mp4
Note
The V4L2 video encoder assumes that the output bitstream buffer is large enough to hold a complete frame of data; the encoder returns an error when the buffer is full. Applications should handle the buffer allocation themselves.
The GStreamer framework provides software-based or V4L2 hardware-accelerated video processing. To see the list of V4L2 video codecs available on GStreamer, use the following command:
gst-inspect-1.0 | grep v4l2.*
video4linux2: v4l2src: Video (video4linux2) Source
video4linux2: v4l2sink: Video (video4linux2) Sink
video4linux2: v4l2radio: Radio (video4linux2) Tuner
video4linux2: v4l2deviceprovider (GstDeviceProviderFactory)
video4linux2: v4l2convert: V4L2 Video Converter
video4linux2: v4l2mpeg4dec: V4L2 MPEG4 Decoder
video4linux2: v4l2video0mpeg4dec: V4L2 MPEG4 Decoder
video4linux2: v4l2h264dec: V4L2 H264 Decoder
video4linux2: v4l2h265dec: V4L2 H265 Decoder
video4linux2: v4l2vp8dec: V4L2 VP8 Decoder
video4linux2: v4l2vp9dec: V4L2 VP9 Decoder
video4linux2: v4l2h264enc: V4L2 H.264 Encoder
The Colorimetry Issue of v4l2convert
Sometimes, the GStreamer decoding pipeline fails to negotiate because a stream's "colorimetry" is not supported.
gst-launch-1.0 -v filesrc location=/mnt/out-320x240-nv12.avi ! parsebin ! v4l2h264dec ! v4l2convert ! waylandsink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstTypeFindElement:typefind.GstPad:src: caps = video/x-msvideo
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstTypeFindElement:typefind.GstPad:src: caps = NULL
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstH264Parse:h264parse0.GstPad:sink: caps = video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240
/GstPipeline:pipeline0/GstParseBin:parsebin0/GstH264Parse:h264parse0.GstPad:src: caps = video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240, chroma-format=(string)4:2:0, bit-depth-luma=(uint)8, bit-depth-chroma=(uint)8, colorimetry=(string)2:4:16:3, parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)baseline, level=(string)1.3
ERROR: from element /GstPipeline:pipeline0/GstParseBin:parsebin0/GstAviDemux:avidemux0: Internal data stream error.
Additional debug info:
../gst-plugins-good-1.20.3/gst/avi/gstavidemux.c(5798): gst_avi_demux_loop (): /GstPipeline:pipeline0/GstParseBin:parsebin0/GstAviDemux:avidemux0:
streaming stopped, reason not-negotiated (-4)
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
Freeing pipeline ...
This is a known issue with the GStreamer v4l2convert element regarding colorimetry. v4l2convert supports only a "well-known" subset of color spaces, so whenever a stream uses something outside that subset, such as 2:4:16:3 (reduced range, BT601, BT601, BT470BG) in this case, negotiation fails.
You can use capssetter as a workaround:
gst-launch-1.0 -v filesrc location=/mnt/out-320x240-nv12.avi ! parsebin ! capssetter replace=true caps="video/x-h264, variant=(string)itu, framerate=(fraction)30/1, width=(int)320, height=(int)240, chroma-format=(string)4:2:0, bit-depth-luma=(uint)8, bit-depth-chroma=(uint)8, colorimetry=(string)bt601, parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)constrained-baseline, level=(string)1.3" ! v4l2h264dec ! v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! waylandsink
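The replacement caps string is simply the parser's output caps (as printed by the -v log) with the colorimetry field rewritten. A hypothetical helper illustrating that substitution on the simple comma-separated caps form (real caps can contain nested structures this sketch does not handle):

```python
def replace_caps_field(caps, field, value):
    """Rewrite one field of a flat, comma-separated GStreamer caps string."""
    parts = [p.strip() for p in caps.split(",")]
    out = []
    for p in parts:
        if p.startswith(field + "="):
            out.append(f"{field}={value}")  # substitute the offending field
        else:
            out.append(p)
    return ", ".join(out)

caps = "video/x-h264, colorimetry=(string)2:4:16:3, parsed=(boolean)true"
print(replace_caps_field(caps, "colorimetry", "(string)bt601"))
# video/x-h264, colorimetry=(string)bt601, parsed=(boolean)true
```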
Sometimes, the GStreamer pipeline fails to negotiate because the upstream and downstream elements of v4l2convert require different colorimetry.
In the camera example below, v4l2h264enc requires bt709 colorimetry, but v4l2src outputs video frames only with bt601 colorimetry.
The connection fails because the GStreamer v4l2convert plug-in cannot perform colorimetry conversion.
gst-launch-1.0 -v v4l2src device="/dev/video5" ! video/x-raw,width=3840,height=2160,format=UYVY ! v4l2convert output-io-mode=dmabuf-import ! \
v4l2h264enc ! queue ! video/x-h264 ! h264parse ! v4l2h264dec ! autovideosink
...
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Got context from element 'autovideosink0': gst.gl.GLDisplay=context, gst.gl.GLDisplay=(GstGLDisplay)"\(GstGLDisplayWayland\)\ gldisplaywayland0";
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
ERROR: from element /GstPipeline:pipeline0/v4l2convert:v4l2convert0: Device '/dev/video2' does not support bt709 colorimetry
Additional debug info:
../gst-plugins-good-1.20.5/sys/v4l2/gstv4l2object.c(4234): gst_v4l2_object_set_format_full (): /GstPipeline:pipeline0/v4l2convert:v4l2convert0:
Device wants 2:4:5:4 colorimetry
Execution ended after 0:00:05.204833693
Setting pipeline to NULL ...
ERROR: from element /GstPipeline:pipeline0/v4l2convert:v4l2convert0: Device '/dev/video2' does not support bt709 colorimetry
Additional debug info:
../gst-plugins-good-1.20.5/sys/v4l2/gstv4l2object.c(4234): gst_v4l2_object_set_format_full (): /GstPipeline:pipeline0/v4l2convert:v4l2convert0:
Device wants 2:4:5:4 colorimetry
Freeing pipeline ...
Here is a capssetter workaround for this issue:
gst-launch-1.0 -v v4l2src device="/dev/video5" ! video/x-raw,width=3840,height=2160,format=UYVY ! v4l2convert output-io-mode=dmabuf-import ! \
capssetter caps="video/x-raw,colorimetry=bt601" ! v4l2h264enc ! queue ! video/x-h264 ! h264parse ! v4l2h264dec ! autovideosink
Video Codec Devices and V4L2 Interface
The hardware video decoder and encoder support V4L2 API in IoT Yocto. To check V4L2 devices in the console, run the following commands:
ls -l /sys/class/video4linux/
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video0 -> ../../devices/platform/soc/16000000.codec/video4linux/video0
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video1 -> ../../devices/platform/soc/17020000.codec/video4linux/video1
lrwxrwxrwx 1 root root 0 Sep 20 10:43 video2 -> ../../devices/platform/soc/14004000.mdp_rdma0/video4linux/video2
Another utility to enumerate the V4L2 devices is v4l2-sysfs-path:
v4l2-sysfs-path
Video device: video2
Video device: video0
Video device: video1
Alsa playback device(s): hw:0,0 hw:0,1
You can also use v4l2-dbg -D -d <device#> to query information about each V4L2 video device, for example:
v4l2-dbg -D -d 0
Driver info:
Driver name : mtk-vcodec-dec
Card type : platform:mt8167
Bus info : platform:mt8167
Driver version: 5.10.73
Capabilities : 0x84204000
Video Memory-to-Memory Multiplanar
Streaming
Extended Pix Format
Device Capabilities
v4l2-dbg -D -d 1
Driver info:
Driver name : mtk-vcodec-enc
Card type : platform:mt8167
Bus info : platform:mt8167
Driver version: 5.10.73
Capabilities : 0x84204000
Video Memory-to-Memory Multiplanar
Streaming
Extended Pix Format
Device Capabilities
v4l2-dbg -D -d 2
Driver info:
Driver name : mtk-mdp
Card type : 14004000.mdp_rdma0
Bus info : platform:mt8173
Driver version: 5.10.73
Capabilities : 0x84204000
Video Memory-to-Memory Multiplanar
Streaming
Extended Pix Format
Device Capabilities
As shown in the example above, there are 3 device nodes related to the video codec:
Video Decoder (/dev/video0 and /sys/devices/platform/soc/16000000.codec/video4linux/video0)
Video Encoder (/dev/video1 and /sys/devices/platform/soc/17020000.codec/video4linux/video1)
MDP (/dev/video2 and /sys/devices/platform/soc/14004000.mdp_rdma0/video4linux/video2)
All three devices are M2M (memory-to-memory) devices.
Userspace clients should access these devices through the V4L2 userspace API. IoT Yocto integrates the GStreamer framework, which provides V4L2 plugins for evaluation and application development.
Note
The video decoder device cannot decode into YUYV or NV12 formats directly; it can only decode the bitstream into a proprietary format. Please refer to the sections below to convert the proprietary format into the buffer format you require.
Output Format of Video Decoder
One thing worth noting is that the output buffer format of the video decoder device is a proprietary format. This can be observed with the following command:
v4l2-ctl --list-formats -d 0
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture Multiplanar
[0]: 'MT21' (Mediatek Compressed Format, compressed)
[1]: 'MM21' (Mediatek block Format, compressed)
To see other information, such as the accepted bitstream formats, add the --all parameter:
v4l2-ctl --all -d 0
Driver Info:
Driver name : mtk-vcodec-dec
Card type : platform:mt8167
Bus info : platform:mt8167
Driver version : 5.10.73
Capabilities : 0x84204000
Video Memory-to-Memory Multiplanar
Streaming
Extended Pix Format
Device Capabilities
Device Caps : 0x04204000
Video Memory-to-Memory Multiplanar
Streaming
Extended Pix Format
Priority: 2
Format Video Capture Multiplanar:
Width/Height : 64/64
Pixel Format : 'MT21' (Mediatek Compressed Format)
Field : None
Number of planes : 2
Flags :
Colorspace : Rec. 709
Transfer Function : Default
YCbCr/HSV Encoding: Default
Quantization : Default
Plane 0 :
Bytes per Line : 64
Size Image : 4096
Plane 1 :
Bytes per Line : 64
Size Image : 2048
Format Video Output Multiplanar:
Width/Height : 64/64
Pixel Format : 'H264' (H.264)
Field : None
Number of planes : 1
Flags :
Colorspace : Rec. 709
Transfer Function : Default
YCbCr/HSV Encoding: Default
Quantization : Default
Plane 0 :
Bytes per Line : 0
Size Image : 1048576
Selection Video Capture: compose, Left 0, Top 0, Width 64, Height 64, Flags:
Selection Video Capture: compose_default, Left 0, Top 0, Width 64, Height 64, Flags:
Selection Video Capture: compose_bounds, Left 0, Top 0, Width 64, Height 64, Flags:
User Controls
min_number_of_capture_buffers 0x00980927 (int) : min=0 max=32 step=1 default=1 value=0 flags=read-only, volatile
Note
Please note that the term Format Video Capture means the format of a capture device, which produces buffers.
On the contrary, the term Format Video Output means the format of a video output device, which takes buffers as input.
Therefore, for an M2M device like the decoder:
the Video Output format is the input buffer format of the decoder device.
the Video Capture format is the output buffer format of the decoder device.
Interlaced Content Support
The video decoder mtk-vcodec-dec outputs de-interlaced (progressive) frames for interlaced content. This happens automatically without any extra control.
The video encoder mtk-vcodec-enc does not support interlaced video encoding; it only operates on progressive input frames (no field mode).
MDP and Format Conversion
The proprietary MT21 or MM21 format cannot be processed by software converters and must be passed to the MDP device.
Therefore, a video playback pipeline always consists of the video decoder hardware and the MDP hardware.
The MDP device is also capable of resizing video frames and converting buffer pixel formats; the supported formats can be listed with the v4l2-ctl command:
v4l2-ctl --list-formats -d 2
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture Multiplanar
[0]: 'NM12' (Y/CbCr 4:2:0 (N-C))
[1]: 'NV12' (Y/CbCr 4:2:0)
[2]: 'NM21' (Y/CrCb 4:2:0 (N-C))
[3]: 'NV21' (Y/CrCb 4:2:0)
[4]: 'YM21' (Planar YVU 4:2:0 (N-C))
[5]: 'YM12' (Planar YUV 4:2:0 (N-C))
[6]: 'YV12' (Planar YVU 4:2:0)
[7]: 'YU12' (Planar YUV 4:2:0)
[8]: '422P' (Planar YUV 4:2:2)
[9]: 'NV16' (Y/CbCr 4:2:2)
[10]: 'NM16' (Y/CbCr 4:2:2 (N-C))
[11]: 'YUYV' (YUYV 4:2:2)
[12]: 'UYVY' (UYVY 4:2:2)
[13]: 'YVYU' (YVYU 4:2:2)
[14]: 'VYUY' (VYUY 4:2:2)
[15]: 'BA24' (32-bit ARGB 8-8-8-8)
[16]: 'AR24' (32-bit BGRA 8-8-8-8)
[17]: 'BX24' (32-bit XRGB 8-8-8-8)
[18]: 'XR24' (32-bit BGRX 8-8-8-8)
[19]: 'RGBP' (16-bit RGB 5-6-5)
[20]: 'RGB3' (24-bit RGB 8-8-8)
[21]: 'BGR3' (24-bit BGR 8-8-8)
The v4l2convert plug-in in the GStreamer framework conveniently wraps the format conversion and resizing capabilities of the MDP.
VPUD Daemon
Although the video devices are accessible through V4L2 interfaces, the hardware video processing kernel driver delegates most of the hardware configuration logic to a userspace daemon: vpud, which serves the video encoder and decoder drivers.
The daemon does not provide interfaces to other userspace clients; it only works with the kernel driver. On IoT Yocto, all video processing functionality should be accessed through the V4L2 interface.
Therefore, the video processing drivers stop working if the vpud process has not been started or has stopped.
On IoT Yocto, the daemon is launched during system boot.
Video Encoder Extra-Controls
As a V4L2 video encoder, mtk-vcodec-enc
also provides extra-controls to set encoder capabilities.
CID | Command (String) | Value | Default Value | Note
--- | --- | --- | --- | ---
V4L2_CID_MPEG_VIDEO_BITRATE | video_bitrate | 1~20000000 | 20000000 |
V4L2_CID_MPEG_VIDEO_GOP_SIZE | video_gop_size | 0~65535 | 0 | size 0 means I-VOP only
V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAME | force_key_frame | 0~0 | 0 | forces an I-VOP on the next output frame
V4L2_CID_MPEG_VIDEO_HEADER_MODE | sequence_header_mode | 0~1 | 1 | 0: separate mode, 1: joined-with-1st-frame mode
V4L2_CID_MPEG_VIDEO_H264_PROFILE | h264_profile | 0, 2, 4 | 4 | 0: BASELINE, 2: MAIN, 4: HIGH
V4L2_CID_MPEG_VIDEO_H264_LEVEL | h264_level | 0, 2~13 | 11 | supports LEVEL_1_0~LEVEL_4_2, excluding LEVEL_1B
Note
GStreamer does not fully support the video header mode V4L2_MPEG_VIDEO_HEADER_MODE_SEPARATE.
For example, to compress an H.264 main-profile, level-4.1 video bitstream at a 512 kbps bitrate:
gst-launch-1.0 -v videotestsrc num-buffers=300 ! "video/x-raw,format=NV12, width=720, height=480, framerate=30/1" ! v4l2h264enc extra-controls="cid,video_gop_size=30,video_bitrate=512000,sequence_header_mode=1" ! "video/x-h264,level=(string)4.1,profile=main" ! h264parse ! mp4mux ! filesink location=/tmp/test-h264.mp4
...
Execution ended after 0:00:01.554987154
Setting pipeline to NULL ...
Freeing pipeline ...
Note
To modify the profile and level, set them via GStreamer caps. If they are set directly via extra-controls, the profile and level will be overridden during caps negotiation.
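The extra-controls string shown in the example above can be assembled programmatically; the following hypothetical helper builds the "cid,name=value,..." form from a dictionary (the control names themselves come from the table above):

```python
def build_extra_controls(controls, structure_name="cid"):
    """Build a GStreamer extra-controls property string, e.g.
    "cid,video_gop_size=30,video_bitrate=512000", from a dict of
    V4L2 control names and values."""
    fields = ",".join(f"{name}={value}" for name, value in controls.items())
    return f"{structure_name},{fields}"

print(build_extra_controls({
    "video_gop_size": 30,
    "video_bitrate": 512000,
    "sequence_header_mode": 1,
}))
# cid,video_gop_size=30,video_bitrate=512000,sequence_header_mode=1
```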
Performance Measurement
Software vs. Hardware Decoder and Converter
IoT Yocto provides VCODEC and MDP, hardware components that accelerate the video pipeline.
You can still process video with software components, but performance may suffer badly because all the work falls on the CPU.
This section provides software and hardware examples so you can compare the resulting framerates.
The scenario is to decode a 720p/30fps video, convert it to 1080p/30fps, and show it on the screen.
fpsdisplaysink is used to calculate the framerate.
To use the software method:
gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! avdec_h264 ! \
videoscale ! video/x-raw,width=1920,height=1080 ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstTextOverlay:fps-display-text-overlay: text = rendered: 181, dropped: 0, current: 18.74, average: 18.92
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 181, dropped: 0, current: 18.74, average: 18.92
To use the hardware method:
gst-launch-1.0 -v filesrc location=<your-video-path> ! parsebin ! v4l2h264dec ! \
v4l2convert output-io-mode=5 ! video/x-raw,width=1920,height=1080 ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstTextOverlay:fps-display-text-overlay: text = rendered: 268, dropped: 1, current: 27.95, average: 27.80
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 268, dropped: 1, current: 27.95, average: 27.80
The average framerate of the software method is about 18.92 FPS, while that of the hardware method is 27.80 FPS. The heavier the load, the larger the difference.
Note
When using fpsdisplaysink to check performance, add text-overlay=false to prevent drawing FPS information on the display overlay, which can cost a lot of CPU computing power.
GStreamer Pipeline for Performance Test
Note
The following test results are based on Genio 1200-EVK.
For the decoder performance test, fpsdisplaysink is used to show FPS information, and waylandsink is assigned as the video-sink to verify the on-screen quality.
An example of the performance test on a 4K60fps H264 video playback:
gst-launch-1.0 -v filesrc location=H264_3840x2160_60fps.mp4 ! parsebin ! queue ! v4l2h264dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! queue ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstWaylandSink:waylandsink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 32, dropped: 0, current: 63.65, average: 63.65
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 62, dropped: 0, current: 58.81, average: 61.21
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 93, dropped: 0, current: 61.20, average: 61.21
Note
The GStreamer element queue is added to the pipeline to remove buffer dependencies between elements.
An example of the performance test on a 4K60fps H265 video playback:
gst-launch-1.0 -v filesrc location=H265_3840x2160_60fps.mp4 ! parsebin ! queue ! v4l2h265dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! queue ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstWaylandSink:waylandsink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 32, dropped: 0, current: 63.70, average: 63.70
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 62, dropped: 0, current: 59.99, average: 61.85
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 93, dropped: 0, current: 60.00, average: 61.22
An example of the performance test on a FHD60fps MPEG4 video playback:
gst-launch-1.0 -v filesrc location=MPEG4_1920x1080_60fps.mp4 ! parsebin ! queue ! v4l2mpeg4dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! queue ! fpsdisplaysink video-sink=waylandsink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstWaylandSink:waylandsink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 32, dropped: 0, current: 63.75, average: 63.75
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 63, dropped: 0, current: 60.00, average: 61.85
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 94, dropped: 0, current: 59.99, average: 61.23
For the encoder performance test, fpsdisplaysink is used to show FPS information, and fakesink is assigned as the video-sink to remove the overhead of the file writer.
In the following tests, the decoded video frames simply serve as the encoder input source.
Note
There is a hardware limitation: the input frame buffer MUST be 16x16-aligned (the buffer width and height must both be multiples of 16).
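Meeting this requirement just means rounding each buffer dimension up to the next multiple of 16, which is why the FHD encoding example later in this section uses 1920x1088 caps. A minimal sketch:

```python
def align16(x):
    """Round a buffer dimension up to the next multiple of 16,
    as required by the encoder hardware."""
    return (x + 15) & ~15

# 1920 is already 16-aligned; 1080 is not, so an FHD frame
# is padded to 1920x1088 before encoding.
print(align16(1920), align16(1080))  # 1920 1088
```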
An example of the performance test on a 4K60fps H264 video encoding:
gst-launch-1.0 -v filesrc location=H264_3840x2160_60fps.mp4 ! parsebin ! queue ! v4l2h264dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! queue ! v4l2h264enc output-io-mode=dmabuf-import ! queue ! \
fpsdisplaysink video-sink=fakesink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstFakeSink:fakesink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 32, dropped: 0, current: 63.93, average: 63.93
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 63, dropped: 0, current: 60.00, average: 61.93
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 93, dropped: 0, current: 60.00, average: 61.30
Note
The buffer type output-io-mode=dmabuf-import is assigned to v4l2h264enc to prevent copying the input source buffers.
An example of the performance test on a 4K60fps H265 video encoding:
gst-launch-1.0 -v filesrc location=H264_3840x2160_60fps.mp4 ! parsebin ! queue ! v4l2h264dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! queue ! v4l2h265enc output-io-mode=dmabuf-import ! queue ! \
fpsdisplaysink video-sink=fakesink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstFakeSink:fakesink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 32, dropped: 0, current: 63.88, average: 63.88
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 63, dropped: 0, current: 60.04, average: 61.93
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 93, dropped: 0, current: 60.00, average: 61.29
An example of the performance test on a FHD120fps H264 video encoding:
gst-launch-1.0 -v filesrc location=H264_1920x1080_120fps.mp4 ! parsebin ! queue ! v4l2h264dec ! queue ! \
v4l2convert output-io-mode=dmabuf-import capture-io-mode=dmabuf ! video/x-raw,width=1920,height=1088 ! queue ! \
v4l2h264enc output-io-mode=dmabuf-import ! queue ! fpsdisplaysink video-sink=fakesink text-overlay=false
...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstFakeSink:fakesink0: sync = true
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 62, dropped: 0, current: 123.86, average: 123.86
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 123, dropped: 0, current: 120.06, average: 121.94
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 183, dropped: 0, current: 119.99, average: 121.30
Note
In this test, the FHD video frames output by v4l2h264dec are converted to 1920x1088 for v4l2h264enc because of the encoder hardware limitation (16x16 alignment).
The conversion can be removed if the frame buffer size is already 16x16-aligned.
FAQ
Is there any low-level library that can be used to control the video HW encoder & decoder?
Why do the profile and level settings set on the H.264 encoder not exactly match the output file?
Why does the bitrate setting set on the encoder not exactly match the output file?
Is there any low-level library that can be used to control the video HW encoder & decoder?
We only support video encode/decode via the V4L2 framework.
You can include the GStreamer library in your application to drive the V4L2 framework.
Please refer to the GStreamer hello-world example, or simply use gst_parse_launch to parse gst-launch commands.
Why do the profile and level settings set on the H.264 encoder not exactly match the output file?
For the profile, the H.264 encoder decides how many features (such as CABAC and the 4x4 transform) to use on the current input video, and the profile setting changes according to the features actually applied.
For the level, the H.264 encoder outputs a bitstream with the correct level for the resolution and level defined in ISO/IEC 14496-10 (MPEG-4 Part 10, Advanced Video Coding), e.g. level 5.1 to support 4K video.
Please refer to Advanced Video Coding for the profile and level definitions.
Why does the bitrate setting set on the encoder not exactly match the output file?
Content with higher motion or more detail requires a higher bitrate to achieve the same perceived quality. For example, a sporting event or concert with high motion and many moving cameras will typically require a significantly higher bitrate at the same resolution to achieve the same perceived quality.
Higher resolutions likewise require a higher bitrate for the same perceived quality.
It is important to adjust your bitrate appropriately for the resolution you are using. Setting the bitrate too high or too low can lead to poor image quality and can miss the bitrate target.
E.g. for a 4K video, the bitrate suggested by YouTube is 10~40 Mbps.
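As a rough illustration of scaling bitrate with resolution and framerate (a common bits-per-pixel heuristic, not an official recommendation from IoT Yocto or YouTube):

```python
def suggest_bitrate(width, height, fps, bits_per_pixel=0.1):
    """Rule-of-thumb target bitrate in bps: pixels per second times a
    quality factor. A bits_per_pixel around 0.05~0.15 is a common
    heuristic range; high-motion content sits at the top of it.
    """
    return int(width * height * fps * bits_per_pixel)

# 4K at 30 fps with 0.1 bpp lands near 25 Mbps, inside the
# 10~40 Mbps range suggested above for 4K content.
print(suggest_bitrate(3840, 2160, 30))  # 24883200
```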