Model Hub

This page lists commonly used TensorFlow Lite models and their converted DLA files. Each model archive contains a tflite file together with mdla2 and mdla3 files for inference. A DLA file runs only on a platform that supports its MDLA version. The lookup table below shows which MDLA version each platform supports.

Platform      MDLA Version
----------    ------------
Genio 350     N/A
Genio 510     MDLA 3.0
Genio 700     MDLA 3.0
Genio 1200    MDLA 2.0
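The lookup above can also be written as a small Python mapping, for example to decide which file from a model archive to deploy on a given board. This is only a sketch: the helper name `dla_variant_for` is hypothetical and not part of any SDK, and the returned labels `mdla2` and `mdla3` simply mirror the file types named on this page, so they should be adapted to the actual file names in the archive.

```python
from typing import Optional

# Platform -> MDLA version, copied from the lookup table above.
MDLA_VERSION = {
    "Genio 350": None,    # listed as N/A in the table
    "Genio 510": "3.0",
    "Genio 700": "3.0",
    "Genio 1200": "2.0",
}


def dla_variant_for(platform: str) -> Optional[str]:
    """Return 'mdla2' or 'mdla3' for a platform, or None when the table lists N/A.

    Hypothetical helper for illustration; raises KeyError for unknown platforms.
    """
    version = MDLA_VERSION[platform]
    if version is None:
        return None
    major = version.split(".")[0]  # "3.0" -> "3"
    return f"mdla{major}"


if __name__ == "__main__":
    for name in MDLA_VERSION:
        print(name, "->", dla_variant_for(name) or "no DLA variant listed")
```

Raising `KeyError` for a platform that is not in the table is a deliberate choice here: silently falling back to some default variant would hide a mismatch between the DLA file and the board.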

Note

The inference times shown here were measured with performance mode enabled on the NPUs of three Genio products (G510, G700, and G1200), using various models and data types. The times were obtained using Neuron SDK, and more detailed information can be found on each model’s detail page, linked in the table. Performance may vary with the platform and the specific hardware used, and may also differ between versions of the board image and EVK. For the most accurate figures, run the application directly on the target platform.
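When re-measuring on a target board, a simple timing loop with warm-up runs gives more stable figures than a single call. The sketch below is a generic Python harness and not the procedure used to produce the numbers on this page. The function `measure_latency_ms` and the placeholder workload `fake_inference` are hypothetical, and the callable passed in is assumed to wrap one complete inference call of whatever runtime is in use.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(run_once: Callable[[], object],
                       warmup: int = 5,
                       runs: int = 50) -> Dict[str, float]:
    """Time `run_once` repeatedly and return summary statistics in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up runs are executed but not timed

    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "runs": runs,
    }


if __name__ == "__main__":
    # Placeholder workload standing in for one inference call (assumption).
    def fake_inference() -> None:
        sum(i * i for i in range(10_000))

    print(measure_latency_ms(fake_inference))
```

Reporting the median alongside the mean makes occasional outliers, such as a run delayed by another process, easier to spot.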

Supported Models on Genio Products

The G510, G700, and G1200 APU columns give the inference time in milliseconds (ms).

Task              Model Name    Source Model Type  Data Type  Input Size  G510 APU (ms)  G700 APU (ms)  G1200 APU (ms)  Detail
----------------  ------------  -----------------  ---------  ----------  -------------  -------------  --------------  -------------
Object Detection  YOLOv5s       .pt                Quant8     640x640     17.47          10.04          19.05           Link
Object Detection  YOLOv5s       .pt                Float32    640x640     46.41          32.04          36.66           Link
Object Detection  YOLOv8s       .pt                Quant8     640x640     25.51          17.01          28.04           Link
Object Detection  YOLOv8s       .pt                Float32    640x640     70.95          50.04          55.84           Link
Object Detection  YOLOXs        .pt                Quant8     640x640     22.31          15.04          25.05           Link
Object Detection  YOLOXs        .pt                Float32    640x640     62.12          44.04          48.05           Link
Classification    ConvNeXt      .pt                Quant8     224x224     55.03          38.22          N/A             Link
Classification    ConvNeXt      .pt                Float32    224x224     153.38         112.03         N/A             Link
Classification    DenseNet      .pt                Quant8     224x224     7.03           5.03           6.04            Link
Classification    DenseNet      .pt                Float32    224x224     16.51          11.04          12.04           Link
Classification    EfficientNet  .pt                Quant8     224x224     4.05           3.00           3.05            Link
Classification    EfficientNet  .pt                Float32    224x224     9.03           6.04           6.05            Link
Classification    GoogLeNet     .pt                Quant8     224x224     3.04           2.04           2.39            Link
Classification    GoogLeNet     .pt                Float32    224x224     8.4            6.04           6.05            Link
Classification    InceptionV3   .pt                Quant8     224x224     5.59           3.04           5.04            Link
Classification    InceptionV3   .pt                Float32    224x224     17.68          12.04          11.04           Link
Classification    MobileNetV1   .tflite            Quant8     224x224     1.28           1.04           1.05            Not Supported
Classification    MobileNetV2   .pt                Quant8     224x224     1.37           1.04           1.04            Link
Classification    MobileNetV2   .pt                Float32    224x224     3.57           2.04           2.58            Link
Classification    MobileNetV3   .pt                Quant8     224x224     1.04           0.04           N/A             Link
Classification    MobileNetV3   .pt                Float32    224x224     2.72           1.05           2.05            Link
Classification    ResNet        .pt                Quant8     224x224     2.79           2.03           2.05            Link
Classification    ResNet        .pt                Float32    224x224     9.21           6.04           8.04            Link
Classification    SqueezeNet    .pt                Quant8     224x224     1.52           1.04           1.05            Link
Classification    SqueezeNet    .pt                Float32    224x224     5.01           3.04           3.05            Link
Recognition       ShuffleNetV2  .pt                Quant8     224x224     N/A            N/A            N/A             Link
Recognition       ShuffleNetV2  .pt                Float32    224x224     N/A            N/A            N/A             Link
Classification    VGG           .pt                Quant8     224x224     24.85          17.04          24.04           Link
Classification    VGG           .pt                Float32    224x224     80.3           56.04          49.05           Link
Recognition       VGGFace       .pt                Quant8     224x224     25.04          17.04          24.04           Link
Recognition       VGGFace       .pt                Float32    224x224     81.7           56.04          49.05           Link
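The inference times in the table can be turned into two figures that are often easier to compare: the throughput implied by a single latency value, and the speedup of the Quant8 variant over the Float32 variant of the same model. The Python sketch below does this for a few G700 values copied from the table. The helper names `parse_ms`, `throughput_fps`, and `quant_speedup` are hypothetical, and the throughput covers inference time only, so it ignores pre-processing, post-processing, and data transfer.

```python
from typing import Optional


def parse_ms(cell: str) -> Optional[float]:
    """Convert a table cell such as '10.04 ms' or '10.04' to a float; 'N/A' becomes None."""
    text = cell.strip()
    if text.upper() == "N/A":
        return None
    return float(text.replace("ms", "").strip())


def throughput_fps(latency_ms: Optional[float]) -> Optional[float]:
    """Inferences per second implied by one latency value (inference time only)."""
    if latency_ms is None or latency_ms <= 0:
        return None
    return 1000.0 / latency_ms


def quant_speedup(float32_ms: Optional[float],
                  quant8_ms: Optional[float]) -> Optional[float]:
    """Ratio of Float32 latency to Quant8 latency; None if either value is missing."""
    if float32_ms is None or quant8_ms is None or quant8_ms <= 0:
        return None
    return float32_ms / quant8_ms


# (model, G700 Quant8 cell, G700 Float32 cell), copied from the table above.
G700_ROWS = [
    ("YOLOv5s", "10.04 ms", "32.04 ms"),
    ("MobileNetV2", "1.04 ms", "2.04 ms"),
    ("ShuffleNetV2", "N/A", "N/A"),
]

if __name__ == "__main__":
    for model, q8_cell, f32_cell in G700_ROWS:
        q8, f32 = parse_ms(q8_cell), parse_ms(f32_cell)
        fps, ratio = throughput_fps(q8), quant_speedup(f32, q8)
        fps_text = f"{fps:.1f} inferences/s" if fps is not None else "N/A"
        ratio_text = f"{ratio:.2f}x" if ratio is not None else "N/A"
        print(f"{model}: Quant8 {fps_text}, speedup over Float32 {ratio_text}")
```

Rows marked N/A are carried through as `None` rather than zero, so a missing measurement is never mistaken for a very fast or very slow result.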