AI Demo App

Overview

GstInference is an open-source project from Ridgerun Engineering that provides a framework for integrating deep learning inference into GStreamer. Either use one of the included elements to do out-of-the-box inference using the most popular deep learning architectures or leverage the base classes and utilities to support your custom architecture.

This repo uses R²Inference, an abstraction layer in C/C++ for a variety of machine learning frameworks. With R²Inference, a single C/C++ application can work with models from different frameworks. This is useful for executing inference while taking advantage of different hardware resources such as the CPU, GPU, or AI-optimized accelerators.

On the i350-EVK, we provide TensorFlow Lite with different hardware resources to develop a variety of machine learning applications.

[Figure: GstInference Software Stack on i350-EVK]

On the i1200-DEMO, users can run model inference through either the online-compiled path (TensorFlow Lite) or the offline-compiled path (Neuron).

[Figure: GstInference Software Stack on i1200-DEMO]

For more details on each platform, please refer to Machine Learning Developer Guide.

The following sections describe how to get GstInference running on your platform and show the performance statistics for different combinations of source, GStreamer pipeline, and hardware resource.

Building AI demo library

Note

Before downloading the Yocto layer, make sure you already have a populated Yocto building environment. Please refer to Build from Source Code for more information.

Please run the following command to download the Yocto layer:

git clone https://gitlab.com/mediatek/aiot/rity/meta-mediatek-demo.git $PROJ_ROOT/src/meta-mediatek-demo

Set up the Yocto build environment as described in Build from Source Code:

cd $PROJ_ROOT
export TEMPLATECONF=${PWD}/src/meta-rity/meta/conf/
source src/poky/oe-init-build-env
export BUILD_DIR=`pwd`

Run the following command to add the layer we just downloaded to the environment:

bitbake-layers add-layer ../src/meta-mediatek-demo

Enable R2Inference and GstInference in your $PROJ_ROOT/build/conf/local.conf.

IMAGE_INSTALL:append = "r2inference gstinference"

Rebuild the image:

DISTRO=rity-demo MACHINE=i1200-demo bitbake rity-demo-image

Please refer to Flash Image to Boards to flash the target board with the built image.

Run the demo with built-in library

You can find the built-in examples in the /usr/share/gstinference_example directory. Here is the folder structure of gstinference_example:

.
├── image_classification
│   ├── labels.txt
│   ├── mobilenet_v1_1.0_224_quant.tflite
│   └── image_classification.sh
└── object_detection
    ├── coco_labels.txt
    ├── object_detection.sh
    └── ssd_mobilenet_v2_coco_quantized.tflite

To run the built-in applications, connect the camera to the EVK first. Please refer to the Camera section for camera configuration.

Then simply run the shell script in the corresponding directory. The Run the Demo section has the details on parameter settings.

# under image_classification
./image_classification.sh

# under object_detection
./object_detection.sh

(Optional) Install Library with Application SDK

Instead of inferencing with the default libraries, users can build the libraries with customized configurations in the Application SDK. Follow these steps to get GstInference running on your platform:

Install Application SDK

All package installation should be done inside the Application SDK. After installing the SDK and completing the required environment setup, you can start building GstInference with TensorFlow Lite backend support.
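
For reference, a typical workflow looks roughly like the following; the installer file name and install path below are placeholders, not the actual artifact names:

# Install the SDK (installer name and prefix are placeholders -- use your own)
./<your-application-sdk-installer>.sh -d /opt/rity-sdk

# Source the cross-compile environment in every new shell before building;
# this exports $SDKTARGETSYSROOT, CC/CXX and the pkg-config search paths
. /opt/rity-sdk/environment-setup-*-poky-linux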

Build Tensorflow-Lite with ExternalDelegate

The R²Inference TensorFlow Lite backend depends on the TensorFlow Lite C/C++ API. The installation process consists of downloading the source code, then building and installing it.

  1. Download the TensorFlow source code and the dependencies:

git clone -b v2.6.1 https://github.com/tensorflow/tensorflow
cd /PATH/TENSORFLOW/SRC/tensorflow/lite/tools/make     # Change /PATH/TENSORFLOW/SRC/ to your tensorflow path

# Download dependencies:
./download_dependencies.sh

2. Enable the external delegate in TensorFlow Lite v2.6.1. The default configuration in CMakeLists.txt does not include the ExternalDelegate sources; add the following code snippet to enable it.

# Enter the tflite directory
cd /PATH/TENSORFLOW/SRC/tensorflow/lite/

# Add the following snippet to CMakeLists.txt to collect the external delegate
# sources, then append ${TFLITE_DELEGATES_EXTERNAL_SRCS} to the source list of
# the existing add_library(tensorflow-lite ...) call
populate_tflite_source_vars("delegates/external"
  TFLITE_DELEGATES_EXTERNAL_SRCS
  FILTER ".*(_test|_tester)\\.(cc|h)"
)

3. Configure the dependency packages. Some dependencies required by TensorFlow Lite v2.6.1 are missing from CMakeLists.txt; add them as follows.

# Add the following source files to include the required sources
# (append them to the add_library(tensorflow-lite ...) source list)
${TFLITE_SOURCE_DIR}/tools/make/downloads/flatbuffers/src/util.cpp
${TFLITE_SOURCE_DIR}/tools/make/downloads/fft2d/fftsg.c
${TFLITE_SOURCE_DIR}/tools/make/downloads/fft2d/fftsg2d.c
${TFLITE_SOURCE_DIR}/tools/make/downloads/farmhash/src/farmhash.cc

set(TFLITE_GOOGLETEST_DIR "${TFLITE_SOURCE_DIR}/build/googletest/googletest/include")
set(TFLITE_GMOCK_DIR "${TFLITE_SOURCE_DIR}/build/googletest/googlemock/include/")

# Modify the following code snippet to include the path variables defined above
set(TFLITE_INCLUDE_DIRS
  "${TENSORFLOW_SOURCE_DIR}"
  "${TFLITE_FLATBUFFERS_SCHEMA_DIR}"
  "${TFLITE_GOOGLETEST_DIR}"                                                   # include googletest
  "${TFLITE_GMOCK_DIR}"                                                        # include gmock
)

# Modify the following code snippet to correct the source paths
populate_tflite_source_vars("experimental/ruy"                                 # Change to "build/ruy"
  TFLITE_EXPERIMENTAL_RUY_SRCS
  FILTER
  ".*(test(_fast|_slow|_special_specs|_overflow_dst_zero_point))\\.(cc|h)$"    # bypass some test scripts
  ".*(benchmark|tune_tool|example)\\.(cc|h)$"
)
populate_tflite_source_vars("experimental/ruy/profiler"                        # Change to "build/ruy/profiler"
  TFLITE_EXPERIMENTAL_RUY_PROFILER_SRCS
  FILTER ".*(test|test_instrumented_library)\\.(cc|h)$"
)

4. Cross-compile with CMake. Please refer to the Application SDK: Cross-compiling with CMake section to create a toolchain file. After creating the toolchain file, run the following commands to build TensorFlow Lite. A static library libtensorflow-lite.a will be generated in the build directory.
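
For illustration only, such a toolchain file could be sketched as follows, assuming the SDK environment has been sourced so that $SDKTARGETSYSROOT and the aarch64-poky-linux-* cross tools are available; names and paths are assumptions, and the Application SDK section remains the authoritative reference:

# Sketch: create the toolchain file referenced below (adjust names/paths to your SDK)
cat > /PATH/TENSORFLOW/SRC/tensorflow/lite/cmake/toolchain-yocto-mtk.cmake << 'EOF'
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
# Use the sysroot exported by the SDK environment script
set(CMAKE_SYSROOT $ENV{SDKTARGETSYSROOT})
set(CMAKE_C_COMPILER aarch64-poky-linux-gcc)
set(CMAKE_CXX_COMPILER aarch64-poky-linux-g++)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
EOF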

cd /PATH/TENSORFLOW/SRC/tensorflow/lite/
mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchain-yocto-mtk.cmake -DCMAKE_BUILD_TYPE=Debug -DTFLITE_ENABLE_XNNPACK=OFF ..
cp libtensorflow-lite.a $SDKTARGETSYSROOT/lib64    # for compiling R²Inference

Build R²Inference

  1. Download R²Inference source code

git clone https://gitlab.com/mediatek/aiot/rity/r2inference.git
  2. Modify the r2i-tflite build configuration to enable the TensorFlow Lite backend

vim r2inference/r2i/tflite/meson.build
# Revise the following code snippet
include_directory : [configinc]

# to
include_directory : [configinc, '/PATH/TENSORFLOW/SRC', '/PATH/TENSORFLOW/SRC/tensorflow/lite/build/flatbuffers/include']
  3. Compile R²Inference

# Under r2inference/ directory
meson build -Denable-tflite=true -Denable-tests=disabled -Denable-docs=disabled
mkdir r2library
ninja -C build # Compile the project
DESTDIR=r2library ninja -C build install # Install the library

Note

R²Inference can also be compiled against the TensorFlow Lite shared library. Modify the configuration in r2inference/meson.build to use libtensorflowlite.so: in the line tensorflow_lite = cpp.find_library('tensorflow-lite', required: true), change 'tensorflow-lite' to your shared library name (e.g. 'tensorflowlite' for libtensorflowlite.so). After the modification, remember to rerun the commands in step 3 to build and install against the correct library.

  4. Move the installed libraries and headers to the system library folders so that GstInference can build smoothly

# Under the r2inference/ directory
cp r2library/usr/local/lib/libr2inference*  $SDKTARGETSYSROOT/usr/lib64
cp r2library/usr/local/lib/pkgconfig/r2inference-0.0.pc $SDKTARGETSYSROOT/usr/share/pkgconfig
mkdir -p $SDKTARGETSYSROOT/usr/local/include
cp -r r2library/usr/local/include/r2inference-0.0  $SDKTARGETSYSROOT/usr/local/include
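
As an optional sanity check, verify that the SDK toolchain can now discover R²Inference through pkg-config before moving on:

# Should print the R²Inference version (e.g. 0.11.0) if the .pc file was installed correctly
pkg-config --modversion r2inference-0.0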

Build GstInference

  1. Download GstInference source code:

git clone https://gitlab.com/mediatek/aiot/rity/gst-inference.git
  2. Disable the gtkdoc documentation, which is not supported in the Application SDK

vim gst-inference/docs/plugins/meson.build
# Revise the following code snippet

inference-plugin-1.0',
  main_sgml : '@0@/gst-inference-plugin-docs.sgml'.format(meson.current_build_dir()),
  src_dir : ['@0@/ext/'.format(meson.source_root()), meson.current_build_dir()],
  gobject_typesfile : 'gst-inference-plugin.types',
  #content_files : [version_entities],
  dependencies : [plugin_deps],
  install : true)   #Change to false
  3. Compile GstInference

# Under the gst-inference/ directory
meson build
mkdir gstlibrary
ninja -C build # Compile the project
DESTDIR=gstlibrary ninja -C build install # Install the libraries in gstlibrary
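
Optionally, confirm that the plugin libraries were produced before copying them to the board:

# The gstreamer-1.0/ subdirectory should contain libgstinference.so,
# libgstinferenceoverlayplugin.so and libgstinferenceutils.so
ls gstlibrary/usr/local/lib/gstreamer-1.0/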

Install on EVK

  1. Copy the necessary libraries generated when compiling R²Inference and GstInference to the library directory on the target board

# In this case, we use a USB flash drive to transfer files between the host and the target (i.e. i350-EVK, i1200-DEMO)
# Run the following commands on the host to package the required libraries
shopt -s extglob                                                         # enable the !( ) glob used below
mkdir -p necessary_lib/
cp -r r2inference/r2library/usr/local/lib/!(pkgconfig) necessary_lib/    # pkgconfig files are unnecessary on the target machine
cp -r gst-inference/gstlibrary/usr/local/lib/!(pkgconfig) necessary_lib/
tar -czpvf necessary_lib.tgz necessary_lib/

Here is the folder structure of necessary_lib:

necessary_lib
├── gstreamer-1.0
│   ├── libgstinference.so
│   ├── libgstinferenceoverlayplugin.so
│   └── libgstinferenceutils.so
├── libgstinference-1.0.so
├── libgstinference-1.0.so.0
├── libgstinferencebaseoverlay-1.0.so.0
├── libr2inference-0.0.a
├── libr2inference-0.0.so
├── libr2inference-0.0.so.0
└── libr2inference-0.0.so.0.11.0

2. (Optional) Copy files from the Docker container to the local machine. If you installed the SDK in a Docker container, the following steps give you a hint on how to transfer files from the container to the local machine.

# On the local machine
docker ps -a # to get the docker container ID, in my case 870615e417fe
docker cp 870615e417fe:/PATH_TO_NECESSARY_LIB/necessary_lib.tgz .

# plug in the USB flash drive
lsblk # to get the device name, in my case sda
mount /dev/sda1 /mnt
cp necessary_lib.tgz /mnt
# umount before you unplug the USB drive
umount /dev/sda1
  3. Install the libraries on the target machine (i.e. i350-EVK, i1200-DEMO)

# enter the lib directory
cd /usr/lib64

# plug in the USB flash drive
mount /dev/sda1 /mnt
cp /mnt/necessary_lib.tgz .
tar -zxvf necessary_lib.tgz
mv necessary_lib/lib* .
mv necessary_lib/gstreamer-1.0/* ./gstreamer-1.0
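
To confirm the plugins are visible to GStreamer on the target, inspect one of the installed elements; if a stale registry cache gets in the way, removing it is harmless because it will be rebuilt on the next run:

# Rebuild the GStreamer registry if needed and check that the elements are found
rm -rf ~/.cache/gstreamer-1.0
gst-inspect-1.0 inferenceoverlay
gst-inspect-1.0 mobilenetv2ssd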

Run the Demo

The following describes the inference pipeline; you can change the configuration based on your needs.

  • Pipeline property

Property         Value                                          Description
---------------  ---------------------------------------------  ------------------------------------------------------
CAMERA           /dev/videoX                                    Camera device node (X may vary on different devices)
API              i350-EVK: tflite                               Available backends
                 i1200-DEMO: tflite, neuron
DELEGATE         i350-EVK: cpu, gpu, nnapi, armnn               Available hardware resources
                 i1200-DEMO: cpu, gpu, armnn
DELEGATE_OPTION  backends:CpuAcc,GpuAcc,CpuRef                  Only takes effect when DELEGATE='armnn'; changes the
                                                                priority of the ArmNN backends
MODEL_LOCATION   e.g. ./ssd_mobilenet_v2_coco_quantized.tflite  Path to the model
LABELS           e.g. ./labels_coco.txt                         Path to the inference labels

  • The device node which points to seninf is the camera. In this example, the camera is /dev/video3.

ls -l /sys/class/video4linux/
total 0
...
lrwxrwxrwx 1 root root 0 Sep 20  2020 video3 -> ../../devices/platform/soc/15040000.seninf/video4linux/video3
...
  • List supported formats

v4l2-ctl -d /dev/video3 --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture

[0]: 'YUYV' (YUYV 4:2:2)
        Size: Discrete 640x480
                Interval: Discrete 0.033s (30.000 fps)
                Interval: Discrete 0.042s (24.000 fps)
                Interval: Discrete 0.050s (20.000 fps)
                Interval: Discrete 0.067s (15.000 fps)
                Interval: Discrete 0.100s (10.000 fps)
                Interval: Discrete 0.133s (7.500 fps)
                Interval: Discrete 0.200s (5.000 fps)
        Size: Discrete 160x90
                Interval: Discrete 0.033s (30.000 fps)
                Interval: Discrete 0.042s (24.000 fps)
                Interval: Discrete 0.050s (20.000 fps)
                Interval: Discrete 0.067s (15.000 fps)
                Interval: Discrete 0.100s (10.000 fps)
                Interval: Discrete 0.133s (7.500 fps)
                Interval: Discrete 0.200s (5.000 fps)
...
[1]: 'MJPG' (Motion-JPEG, compressed)
        Size: Discrete 640x480
                Interval: Discrete 0.033s (30.000 fps)
                Interval: Discrete 0.042s (24.000 fps)
                Interval: Discrete 0.050s (20.000 fps)
                Interval: Discrete 0.067s (15.000 fps)
                Interval: Discrete 0.100s (10.000 fps)
                Interval: Discrete 0.133s (7.500 fps)
                Interval: Discrete 0.200s (5.000 fps)
        Size: Discrete 160x90
                Interval: Discrete 0.033s (30.000 fps)
                Interval: Discrete 0.042s (24.000 fps)
                Interval: Discrete 0.050s (20.000 fps)
                Interval: Discrete 0.067s (15.000 fps)
                Interval: Discrete 0.100s (10.000 fps)
                Interval: Discrete 0.133s (7.500 fps)
                Interval: Discrete 0.200s (5.000 fps)
...

Important

Remember to set the GPU frequency and CPU frequency to their maximum values so that the pipeline can achieve the best performance. Please refer to GPU Performance Mode and CPU Frequency Scaling for details.
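
For example, on most Linux systems the CPU side can be pinned to its highest frequencies through the standard cpufreq sysfs interface; the GPU procedure is platform specific, so follow GPU Performance Mode for that part:

# Switch every CPU core to the performance governor (standard Linux cpufreq sysfs)
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > $gov
done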

Image Classification

  • You will need a V4L2-compatible camera (e.g. the Logitech C922 PRO Stream used in this case)

  • Pipeline

CAMERA='/dev/video3'
API='tflite'
DELEGATE='armnn'
DELEGATE_OPTION='backends:CpuAcc,GpuAcc'
MODEL_LOCATION='mobilenet_v1_1.0_224_quant.tflite'
LABELS='labels.txt'
gst-launch-1.0 \
v4l2src device=$CAMERA ! "image/jpeg, width=1280, height=720, format=MJPG" ! jpegdec ! videoconvert ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
mobilenetv2 name=net delegate=$DELEGATE delegate-option=$DELEGATE_OPTION model-location=$MODEL_LOCATION api=$API labels="$(cat $LABELS)" \
net.src_bypass ! inferenceoverlay style=0 font-scale=1 thickness=2 ! waylandsink sync=false
  • Output

[Figure: Image classification output]

Object Detection

  • You will need a V4L2-compatible camera (e.g. the Logitech C922 PRO Stream used in this case)

  • Pipeline

CAMERA='/dev/video3'
API='tflite'
DELEGATE='cpu'
MODEL_LOCATION='mobilenet_ssd_pascal_quant.tflite'
LABELS='labels_pascal.txt'
gst-launch-1.0 -v \
v4l2src device=$CAMERA ! "image/jpeg, width=1280, height=720, format=MJPG" ! jpegdec ! videoconvert ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
mobilenetv2ssd name=net delegate=$DELEGATE model-location=$MODEL_LOCATION api=$API labels="$(cat $LABELS)" \
net.src_bypass ! inferenceoverlay style=0 font-scale=1 thickness=2 ! waylandsink sync=false
  • Output

[Figure: Object detection output]

Performance Evaluation

In this section, we provide the performance statistics for different combinations of delegate and buffer-conversion method for the Object Detection demo introduced above.

[Figure: Designed Pipeline for Object Detection]
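
As a rough illustration only (an assumption based on the demo pipeline above, not the exact benchmark pipeline), a "conversion method" such as v4l2convert + v4l2convert replaces the software colorspace/scaling elements with their hardware-accelerated V4L2 counterparts:

# Sketch: swap videoconvert/videoscale for v4l2convert on the two tee branches
... ! tee name=t \
t. ! v4l2convert ! queue ! net.sink_model \
t. ! v4l2convert ! queue ! net.sink_bypass \
...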

Note

The statistics are rough experimental measurements. To obtain exact numbers, run the application on the platform yourself. Performance may vary between different versions of the board image.

i350-EVK

  • Apply different conversion methods for the pipeline. This table only shows the statistics for the NNAPI delegate, because it has the best performance of all delegates based on the experimental results.

                            USB Camera (resolution 640*480)    YUV Camera (resolution 2316*1746)
Conversion Method           FPS    Inference Time(ms)          FPS    Inference Time(ms)
--------------------------  -----  ------------------          -----  ------------------
v4l2convert + v4l2convert   14     34                          14     30
videoconvert + v4l2convert  14     34                          2      30
v4l2convert + videoscale    5      34                          5      31
videoconvert + videoscale   25     32                          2      31

  • Apply different delegates for the pipeline with the best conversion method: videoconvert + videoscale for the USB camera, and v4l2convert + v4l2convert for the YUV camera.

               USB Camera (resolution 640*480)    YUV Camera (resolution 2316*1746)
Delegate       FPS    Inference Time(ms)          FPS    Inference Time(ms)
-------------  -----  ------------------          -----  ------------------
CPU            5      205                         5      200
GPU            7      145                         6      180
ArmNN(GpuAcc)  10     100                         7      100
ArmNN(CpuAcc)  12     76                          4      250
NNAPI(VP6)     25     32                          14     30

i1200-DEMO

API: TensorFlow Lite

  • Apply different conversion methods for the pipeline. This table only shows the statistics for the ArmNN(CpuAcc) delegate, because it has the best performance of all delegates based on the experimental results.

                            USB Camera (resolution 640*480)    YUV Camera (resolution 2316*1746)
Conversion Method           FPS    Inference Time(ms)
--------------------------  -----  ------------------          ---------------------------------
v4l2convert + v4l2convert   Unsupported yet                    Unsupported yet
videoconvert + v4l2convert  Unsupported yet                    Unsupported yet
v4l2convert + videoscale    Unsupported yet                    Unsupported yet
videoconvert + videoscale   35     20                          Unsupported yet

  • Apply different delegates for the pipeline with the best conversion method (videoconvert + videoscale for the USB camera).

               USB Camera (resolution 640*480)    YUV Camera (resolution 2316*1746)
Delegate       FPS    Inference Time(ms)
-------------  -----  ------------------          ---------------------------------
CPU            8      125                         Unsupported yet
GPU            30     31                          Unsupported yet
ArmNN(GpuAcc)  31     25                          Unsupported yet
ArmNN(CpuAcc)  35     20                          Unsupported yet

API: Neuron

  • Apply different conversion methods for the pipeline. The model used in this case was compiled with backend:mdla.

                            USB Camera (resolution 640*480)    YUV Camera (resolution 2316*1746)
Conversion Method           FPS    Inference Time(ms)
--------------------------  -----  ------------------          ---------------------------------
v4l2convert + v4l2convert   Unsupported yet                    Unsupported yet
videoconvert + v4l2convert  Unsupported yet                    Unsupported yet
v4l2convert + videoscale    Unsupported yet                    Unsupported yet
videoconvert + videoscale   5      30                          Unsupported yet

Note

For how to convert a tflite model to a dla model, please refer to the Neuron Compiler section.
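
To run the detection pipeline through the Neuron path, set the API to neuron and point the model to the offline-compiled file; the .dla file name below is illustrative and must be produced with the Neuron Compiler first:

# Sketch: reuse the Object Detection pipeline on i1200-DEMO with the offline-compiled model
API='neuron'
MODEL_LOCATION='ssd_mobilenet_v2_coco_quantized.dla'
# ...then launch the same gst-launch-1.0 command as in the Object Detection section,
# adjusting the remaining properties as needed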

Troubleshooting

Q: What are the values that can be set in the inferenceoverlay properties?

The inferenceoverlay element exposes the following properties to control the boxes' thickness, color, and so on. These properties are documented in the following table:

Property    Value                                       Description
----------  ------------------------------------------  ---------------------------------------------------------------
font-scale  Double [0,100]                              Scale of the font used on the overlay. 0 turns off the overlay
style       enum: (0) classic, (1) dotted, (2) dashed   Line style of the rectangle
thickness   Double [1,100]                              Thickness in pixels used for the lines
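
For example, the overlay in the demo pipelines can be tuned directly on the command line with these properties:

# Dotted boxes, larger labels, thicker lines (values taken from the ranges above)
... ! inferenceoverlay style=1 font-scale=2 thickness=4 ! waylandsink sync=false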

Q: How do I avoid "warning: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Wcpp]" when compiling r2inference and gst-inference?

Edit r2inference/build/build.ninja or gst-inference/build/build.ninja, depending on which library you are building. Add the -O flag after -D_FORTIFY_SOURCE=2 in the c_COMPILER rule (only r²inference has this rule) and in the cpp_COMPILER rule.
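
One way to apply this edit in bulk (a sketch; keep a backup of build.ninja first):

# Append -O right after every -D_FORTIFY_SOURCE=2 occurrence in the generated build files
sed -i 's/-D_FORTIFY_SOURCE=2/-D_FORTIFY_SOURCE=2 -O/g' r2inference/build/build.ninja
sed -i 's/-D_FORTIFY_SOURCE=2/-D_FORTIFY_SOURCE=2 -O/g' gst-inference/build/build.ninja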