Performance Evaluation

This section reports performance statistics for the Object Detection pipeline introduced above, measured with different combinations of delegates and buffer conversion methods.

[Figure: Designed Pipeline for Object Detection]

Note

The statistics below are rough experimental measurements. To obtain exact figures, run the application on your own platform. Performance may vary between versions of the board image.
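To reproduce this kind of measurement on your own board, the two reported quantities can be collected separately: FPS over the whole frame loop, and inference time around the inference call only. The following is a minimal sketch, not the demo application's actual measurement code. The names `measure`, `next_frame`, and `run_inference` are hypothetical placeholders for whatever delivers a converted frame and runs the model in your setup.

```python
import time
from statistics import mean


def measure(next_frame, run_inference, num_frames=100, warmup=10):
    """Return (fps, mean_inference_ms) for a capture-and-infer loop.

    next_frame:    hypothetical callable returning one converted frame
    run_inference: hypothetical callable running the model on that frame
    """
    # Warm-up iterations are excluded from the statistics, because the
    # first runs of a delegate often include one-time setup cost.
    for _ in range(warmup):
        run_inference(next_frame())

    inference_ms = []
    loop_start = time.perf_counter()
    for _ in range(num_frames):
        frame = next_frame()               # capture + buffer conversion
        t0 = time.perf_counter()
        run_inference(frame)               # timed: inference only
        inference_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - loop_start

    # FPS covers the whole loop, so it also reflects conversion cost.
    return num_frames / elapsed, mean(inference_ms)
```

Because FPS includes capture and conversion while the inference time does not, the two numbers can move independently, which is the effect the tables in this section show.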

Genio 350

The data below was measured in performance mode. For how to enable performance mode on Genio-350, refer to Genio-350 Performance Mode.

  • Apply different conversion methods to the pipeline. This table shows statistics for the NNAPI delegate only, because it performed best in the experiments.

                             USB camera (1920*1080)      YUV camera (1920*1080)
Conversion Method            FPS   Inference Time (ms)   FPS   Inference Time (ms)
--------------------------   ---   -------------------   ---   -------------------
v4l2convert + v4l2convert    14    34                    19    30
videoconvert + v4l2convert   14    34                    4     33
v4l2convert + videoscale     5     34                    5     31
videoconvert + videoscale    16    32                    3     31
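As a reading aid for the Genio 350 conversion-method figures: an inference time of t ms allows at most 1000 / t frames per second, so a measured FPS far below that ceiling points to time spent outside inference, for example in buffer conversion. The sketch below does this arithmetic for the USB camera column. It assumes one inference per frame, executed serially with the rest of the pipeline, which the source does not state. The helper names are illustrative only.

```python
# (FPS, inference time in ms) for the USB camera with the NNAPI delegate,
# copied from the Genio 350 conversion-method table.
USB_CAMERA = {
    "v4l2convert + v4l2convert": (14, 34),
    "videoconvert + v4l2convert": (14, 34),
    "v4l2convert + videoscale": (5, 34),
    "videoconvert + videoscale": (16, 32),
}


def fps_ceiling(inference_ms: float) -> float:
    """Upper bound on FPS if every frame needs one serial inference."""
    return 1000.0 / inference_ms


def inference_share(fps: float, inference_ms: float) -> float:
    """Fraction of each frame period that is spent in inference."""
    return fps * inference_ms / 1000.0


for method, (fps, ms) in USB_CAMERA.items():
    print(
        f"{method}: measured {fps} FPS, "
        f"ceiling {fps_ceiling(ms):.1f} FPS, "
        f"inference share {inference_share(fps, ms):.0%}"
    )
```

Under these assumptions the inference time is nearly constant across the four rows, so the differences in FPS come mainly from the conversion elements rather than from the delegate.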

  • Apply different delegates to the pipeline with the best conversion methods: videoconvert + videoscale for the USB camera, and v4l2convert + v4l2convert for the YUV camera.

Average inference time(ms)

Object Detection

                USB camera (640*480)        YUV camera (640*480)
Delegate        FPS   Inference Time (ms)   FPS   Inference Time (ms)
-------------   ---   -------------------   ---   -------------------
CPU             4     254                   4     255
GPU             6     174                   5     172
ArmNN(GpuAcc)   12    82                    9     75
ArmNN(CpuAcc)   11    90                    9     75
NNAPI(VPU)      24    34                    14    33

Image Classification

                USB camera (640*480)        YUV camera (640*480)
Delegate        FPS   Inference Time (ms)   FPS   Inference Time (ms)
-------------   ---   -------------------   ---   -------------------
CPU             8     137                   7     131
GPU             12    84                    10    82
ArmNN(GpuAcc)   21    43                    15    42
ArmNN(CpuAcc)   18    50                    16    41
NNAPI(VPU)      31    19                    23    18
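To compare delegates at a glance, the inference times can be expressed as a speedup over the CPU baseline. The sketch below uses the Genio 350 object detection figures for the USB camera. The function name `speedups` is illustrative, and the Arm NN backends are written with their usual names, GpuAcc and CpuAcc.

```python
# Average inference time in ms, Genio 350, object detection, USB camera
# (640*480), copied from the delegate table for that platform.
INFERENCE_MS = {
    "CPU": 254,
    "GPU": 174,
    "ArmNN(GpuAcc)": 82,
    "ArmNN(CpuAcc)": 90,
    "NNAPI(VPU)": 34,
}


def speedups(times_ms, baseline="CPU"):
    """Map each delegate to baseline_time / delegate_time."""
    base = times_ms[baseline]
    return {name: base / ms for name, ms in times_ms.items()}


ratios = speedups(INFERENCE_MS)
fastest = min(INFERENCE_MS, key=INFERENCE_MS.get)

# Print delegates from largest to smallest speedup.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x faster than CPU")
print("Fastest delegate:", fastest)
```

The same helper can be pointed at any other column by swapping in that column's values.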

Genio 1200

The data below was measured in performance mode. For how to enable performance mode on Genio-1200, refer to Genio-1200 Performance Mode.

  • Apply different conversion methods to the pipeline. This table shows statistics for ArmNN(CpuAcc) only, because it performed best in the experiments.

                             USB camera (640*480)
Conversion Method            FPS   Inference Time (ms)
--------------------------   ---   -------------------
v4l2convert + v4l2convert    Not yet supported
videoconvert + v4l2convert   Not yet supported
v4l2convert + videoscale     Not yet supported
videoconvert + videoscale    30    30

  • Apply different delegates to the pipeline with the best conversion method. For the USB camera, this is videoconvert + videoscale.

Average inference time(ms)

Object Detection

                USB camera (640*480)
Delegate        FPS   Inference Time (ms)
-------------   ---   -------------------
CPU             31    29
GPU             26    33
ArmNN(GpuAcc)   18    18
ArmNN(CpuAcc)   30    30
Neuron(MDLA)    31    8

Image Classification

                USB camera (640*480)
Delegate        FPS   Inference Time (ms)
-------------   ---   -------------------
CPU             31    16
GPU             31    15
ArmNN(GpuAcc)   31    12
ArmNN(CpuAcc)   31    17
Neuron(MDLA)    31    3
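In the Genio 1200 figures, FPS stays near 31 even when inference time drops to a few milliseconds. One possible reading is that the frame rate is capped by the camera or the rest of the pipeline rather than by inference. The sketch below tests that reading against the object detection table with a deliberately simple model, min(source cap, 1000 / inference time). The cap of 31 FPS is an assumption taken from the highest value in the tables, not a documented camera setting, and `estimated_fps` is an illustrative name.

```python
# Assumption: the source delivers at most 31 FPS (highest FPS in the tables).
ASSUMED_SOURCE_FPS = 31.0

# (measured FPS, inference time in ms), Genio 1200, object detection,
# USB camera (640*480), copied from the delegate table for that platform.
GENIO_1200_OBJECT_DETECTION = {
    "CPU": (31, 29),
    "GPU": (26, 33),
    "ArmNN(GpuAcc)": (18, 18),
    "ArmNN(CpuAcc)": (30, 30),
    "Neuron(MDLA)": (31, 8),
}


def estimated_fps(inference_ms, source_fps=ASSUMED_SOURCE_FPS):
    """FPS if the only limits are the source cap and serial inference."""
    return min(source_fps, 1000.0 / inference_ms)


for name, (measured, ms) in GENIO_1200_OBJECT_DETECTION.items():
    estimate = estimated_fps(ms)
    gap = estimate - measured
    print(f"{name}: measured {measured} FPS, estimate {estimate:.1f} FPS, gap {gap:.1f}")
```

Rows with a large gap are not explained by this model, which suggests overhead outside the measured inference time for that delegate. Confirming the cause would require profiling on the board.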

Note

  • For how to convert a TFLite model to a DLA model, refer to the Neuron Compiler section.

  • YUV camera: not yet supported.

Genio 700

The data below was measured in performance mode. For how to enable performance mode on Genio-700, refer to Genio-700 Performance Mode.

  • Apply different conversion methods to the pipeline. This table shows statistics for ArmNN(CpuAcc) only, because it performed best in the experiments.

                             USB camera (640*480)
Conversion Method            FPS   Inference Time (ms)
--------------------------   ---   -------------------
v4l2convert + v4l2convert    Not yet supported
videoconvert + v4l2convert   Not yet supported
v4l2convert + videoscale     Not yet supported
videoconvert + videoscale    31    20

  • Apply different delegates to the pipeline with the best conversion method. For the USB camera, this is videoconvert + videoscale.

Average inference time(ms)

Object Detection

                USB camera (640*480)
Delegate        FPS   Inference Time (ms)
-------------   ---   -------------------
CPU             30    31
GPU             19    47
ArmNN(GpuAcc)   28    29
ArmNN(CpuAcc)   31    20
Neuron(MDLA)    31    9

Image Classification

                USB camera (640*480)
Delegate        FPS   Inference Time (ms)
-------------   ---   -------------------
CPU             31    17
GPU             31    23
ArmNN(GpuAcc)   31    13
ArmNN(CpuAcc)   31    13
Neuron(MDLA)    31    4

Note

  • For how to convert a TFLite model to a DLA model, refer to the Neuron Compiler section.

  • YUV camera: not yet supported.