Model Hub
This page lists commonly used TensorFlow Lite models and their converted DLA files for deployment. Each model archive contains a tflite file plus mdla2 and mdla3 files for inference. A DLA file runs only on a platform that supports its MDLA version. The lookup table below maps each platform to its MDLA version.
Platform | MDLA Version |
---|---|
Genio 350 | N/A |
Genio 510 | MDLA 3.0 |
Genio 700 | MDLA 3.0 |
Genio 1200 | MDLA 2.0 |
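The lookup above can be expressed as a small helper that picks the matching DLA file from a model archive. This is a minimal sketch, not part of any MediaTek tooling: the function names, the `.mdla2`/`.mdla3` file suffixes, and the example file names are assumptions made for illustration; only the platform-to-version mapping comes from the table.

```python
from typing import Iterable, Optional

# Platform -> MDLA major version, copied from the lookup table above.
# None means the table lists no MDLA version (N/A) for that platform.
MDLA_MAJOR_VERSION = {
    "Genio 350": None,
    "Genio 510": 3,
    "Genio 700": 3,
    "Genio 1200": 2,
}


def dla_suffix(platform: str) -> Optional[str]:
    """Return the assumed DLA file suffix for a platform, or None if N/A."""
    if platform not in MDLA_MAJOR_VERSION:
        raise KeyError(f"unknown platform: {platform!r}")
    major = MDLA_MAJOR_VERSION[platform]
    # The '.mdlaN' suffix is an assumption based on the 'mdla2'/'mdla3' naming.
    return None if major is None else f".mdla{major}"


def pick_dla_file(platform: str, archive_files: Iterable[str]) -> Optional[str]:
    """Return the first archive file matching the platform's MDLA version."""
    suffix = dla_suffix(platform)
    if suffix is None:
        return None  # no MDLA on this platform; the caller decides what to do
    for name in archive_files:
        if name.endswith(suffix):
            return name
    return None


# Hypothetical archive contents, for illustration only.
files = ["model.tflite", "model.mdla2", "model.mdla3"]
print(pick_dla_file("Genio 700", files))
print(pick_dla_file("Genio 1200", files))
```

Returning `None` for Genio 350 keeps the helper from guessing a fallback; whether to run the tflite file there instead is left to the caller.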
Note
The inference times shown here were measured with performance mode enabled on the NPUs of the Genio G510, G700, and G1200, across various models and data types. They were obtained with the Neuron SDK; each model's detail page, linked in the table, gives more information. Performance may vary with the platform, the specific hardware, and the version of the board image and EVK. For the most accurate figures, run the application directly on the target platform.
Supported Models on Genio Products
Task | Model Name | Source Model Type | Data Type | Input Size | G510 APU Inference Time (ms) | G700 APU Inference Time (ms) | G1200 APU Inference Time (ms) | Detail |
---|---|---|---|---|---|---|---|---|
Object Detection | YOLOv5s | .pt | Quant8 | 640x640 | 17.47 | 10.04 | 19.05 | |
Object Detection | YOLOv5s | .pt | Float32 | 640x640 | 46.41 | 32.04 | 36.66 | |
Object Detection | YOLOv8s | .pt | Quant8 | 640x640 | 25.51 | 17.01 | 28.04 | |
Object Detection | YOLOv8s | .pt | Float32 | 640x640 | 70.95 | 50.04 | 55.84 | |
Object Detection | YOLOXs | .pt | Quant8 | 640x640 | 22.31 | 15.04 | 25.05 | |
Object Detection | YOLOXs | .pt | Float32 | 640x640 | 62.12 | 44.04 | 48.05 | |

Task | Model Name | Source Model Type | Data Type | Input Size | G510 APU Inference Time (ms) | G700 APU Inference Time (ms) | G1200 APU Inference Time (ms) | Detail |
---|---|---|---|---|---|---|---|---|
Classification | ConvNeXt | .pt | Quant8 | 224x224 | 55.03 | 38.22 | N/A | |
Classification | ConvNeXt | .pt | Float32 | 224x224 | 153.38 | 112.03 | N/A | |
Classification | DenseNet | .pt | Quant8 | 224x224 | 7.03 | 5.03 | 6.04 | |
Classification | DenseNet | .pt | Float32 | 224x224 | 16.51 | 11.04 | 12.04 | |
Classification | EfficientNet | .pt | Quant8 | 224x224 | 4.05 | 3.00 | 3.05 | |
Classification | EfficientNet | .pt | Float32 | 224x224 | 9.03 | 6.04 | 6.05 | |
Classification | GoogLeNet | .pt | Quant8 | 224x224 | 3.04 | 2.04 | 2.39 | |
Classification | GoogLeNet | .pt | Float32 | 224x224 | 8.4 | 6.04 | 6.05 | |
Classification | InceptionV3 | .pt | Quant8 | 224x224 | 5.59 | 3.04 | 5.04 | |
Classification | InceptionV3 | .pt | Float32 | 224x224 | 17.68 | 12.04 | 11.04 | |
Classification | MobileNetV1 | .tflite | Quant8 | 224x224 | 1.28 | 1.04 | 1.05 | Not Supported |
Classification | MobileNetV2 | .pt | Quant8 | 224x224 | 1.37 | 1.04 | 1.04 | |
Classification | MobileNetV2 | .pt | Float32 | 224x224 | 3.57 | 2.04 | 2.58 | |
Classification | MobileNetV3 | .pt | Quant8 | 224x224 | 1.04 | 0.04 | N/A | |
Classification | MobileNetV3 | .pt | Float32 | 224x224 | 2.72 | 1.05 | 2.05 | |
Classification | ResNet | .pt | Quant8 | 224x224 | 2.79 | 2.03 | 2.05 | |
Classification | ResNet | .pt | Float32 | 224x224 | 9.21 | 6.04 | 8.04 | |
Classification | SqueezeNet | .pt | Quant8 | 224x224 | 1.52 | 1.04 | 1.05 | |
Classification | SqueezeNet | .pt | Float32 | 224x224 | 5.01 | 3.04 | 3.05 | |
Classification | VGG | .pt | Quant8 | 224x224 | 24.85 | 17.04 | 24.04 | |
Classification | VGG | .pt | Float32 | 224x224 | 80.3 | 56.04 | 49.05 | |

Task | Model Name | Source Model Type | Data Type | Input Size | G510 APU Inference Time (ms) | G700 APU Inference Time (ms) | G1200 APU Inference Time (ms) | Detail |
---|---|---|---|---|---|---|---|---|
Recognition | ShuffleNetV2 | .pt | Quant8 | 224x224 | N/A | N/A | N/A | |
Recognition | ShuffleNetV2 | .pt | Float32 | 224x224 | N/A | N/A | N/A | |
Recognition | VGGFace | .pt | Quant8 | 224x224 | 25.04 | 17.04 | 24.04 | |
Recognition | VGGFace | .pt | Float32 | 224x224 | 81.7 | 56.04 | 49.05 | |