Model Hub

This page lists commonly used TensorFlow Lite models and their converted DLA files. Each model archive contains a tflite file plus mdla2 and mdla3 files for inference. A DLA file runs only on a platform whose MDLA version matches it. Use the lookup table below to find the MDLA version of your platform.

Platform	MDLA Version
Genio 350	N/A
Genio 510	MDLA 3.0
Genio 700	MDLA 3.0
Genio 1200	MDLA 2.0
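
The lookup above can be expressed as a small helper that picks which file type from a model archive matches a given platform. This is only a minimal sketch: the function name `dla_suffix` is hypothetical, and it returns just the `mdla2`/`mdla3` tag mentioned above, since the exact file names inside each archive are not specified on this page.

```python
from typing import Optional

# Platform-to-MDLA mapping copied from the lookup table on this page.
MDLA_VERSION = {
    "Genio 350": None,    # listed as N/A in the table
    "Genio 510": "3.0",
    "Genio 700": "3.0",
    "Genio 1200": "2.0",
}


def dla_suffix(platform: str) -> Optional[str]:
    """Return 'mdla2' or 'mdla3' for a platform, or None if no MDLA is listed.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if platform not in MDLA_VERSION:
        raise KeyError(f"unknown platform: {platform}")
    version = MDLA_VERSION[platform]
    if version is None:
        return None
    # "3.0" -> "mdla3", "2.0" -> "mdla2"
    return "mdla" + version.split(".")[0]


if __name__ == "__main__":
    for name in MDLA_VERSION:
        print(name, "->", dla_suffix(name))
```

For a platform listed as N/A, the helper returns `None` rather than guessing which file to use.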

Note

The inference times shown here were measured with performance mode enabled on the G510, G700, and G1200 NPUs, using various models and data types. They were obtained with Neuron SDK, and each model’s detail page, linked in the table, gives more information. Performance varies with the platform, the specific hardware, and the version of the board image and EVK. For the most accurate figures, run the application directly on the target platform.

Supported Models on Genio Products

Models for Detection
Task	Model Name	Source Model Type	Data Type	Input Size	G510 APU Inference Time (ms)	G700 APU Inference Time (ms)	G1200 APU Inference Time (ms)	Detail
Object Detection	YOLOv5s	.pt	Quant8	640x640	17.47 ms	10.04 ms	19.05 ms	Link
Object Detection	YOLOv5s	.pt	Float32	640x640	46.41 ms	32.04 ms	36.66 ms	Link
Object Detection	YOLOv8s	.pt	Quant8	640x640	25.51 ms	17.01 ms	28.04 ms	Link
Object Detection	YOLOv8s	.pt	Float32	640x640	70.95 ms	50.04 ms	55.84 ms	Link
Object Detection	YOLOXs	.pt	Quant8	640x640	22.31 ms	15.04 ms	25.05 ms	Link
Object Detection	YOLOXs	.pt	Float32	640x640	62.12 ms	44.04 ms	48.05 ms	Link
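
To compare data types, the latencies above can be turned into a Quant8-over-Float32 speedup and an approximate throughput. The sketch below uses only the G510 figures from the detection table. The helper names `speedup` and `fps` are illustrative, and the throughput is an upper bound that assumes inference is the only cost, ignoring any pre- and post-processing.

```python
# G510 latencies in milliseconds, copied from the detection table:
# model name -> (Quant8, Float32)
G510_LATENCY_MS = {
    "YOLOv5s": (17.47, 46.41),
    "YOLOv8s": (25.51, 70.95),
    "YOLOXs": (22.31, 62.12),
}


def speedup(quant_ms: float, float_ms: float) -> float:
    """How many times faster the Quant8 model is than the Float32 model."""
    return float_ms / quant_ms


def fps(latency_ms: float) -> float:
    """Inferences per second implied by a per-inference latency in ms."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for name, (q_ms, f_ms) in G510_LATENCY_MS.items():
        print(f"{name}: Quant8 speedup {speedup(q_ms, f_ms):.2f}x, "
              f"about {fps(q_ms):.1f} inferences/s")
```

The same calculation applies to the G700 and G1200 columns by substituting their values.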

Models for Classification
Task	Model Name	Source Model Type	Data Type	Input Size	G510 APU Inference Time (ms)	G700 APU Inference Time (ms)	G1200 APU Inference Time (ms)	Detail
Classification	ConvNeXt	.pt	Quant8	224x224	55.03 ms	38.22 ms	N/A	Link
Classification	ConvNeXt	.pt	Float32	224x224	153.38 ms	112.03 ms	N/A	Link
Classification	DenseNet	.pt	Quant8	224x224	7.03 ms	5.03 ms	6.04 ms	Link
Classification	DenseNet	.pt	Float32	224x224	16.51 ms	11.04 ms	12.04 ms	Link
Classification	EfficientNet	.pt	Quant8	224x224	4.05 ms	3.00 ms	3.05 ms	Link
Classification	EfficientNet	.pt	Float32	224x224	9.03 ms	6.04 ms	6.05 ms	Link
Classification	GoogLeNet	.pt	Quant8	224x224	3.04 ms	2.04 ms	2.39 ms	Link
Classification	GoogLeNet	.pt	Float32	224x224	8.4 ms	6.04 ms	6.05 ms	Link
Classification	InceptionV3	.pt	Quant8	224x224	5.59 ms	3.04 ms	5.04 ms	Link
Classification	InceptionV3	.pt	Float32	224x224	17.68 ms	12.04 ms	11.04 ms	Link
Classification	MobileNetV1	.tflite	Quant8	224x224	1.28 ms	1.04 ms	1.05 ms	Not Supported
Classification	MobileNetV2	.pt	Quant8	224x224	1.37 ms	1.04 ms	1.04 ms	Link
Classification	MobileNetV2	.pt	Float32	224x224	3.57 ms	2.04 ms	2.58 ms	Link
Classification	MobileNetV3	.pt	Quant8	224x224	1.04 ms	0.04 ms	N/A	Link
Classification	MobileNetV3	.pt	Float32	224x224	2.72 ms	1.05 ms	2.05 ms	Link
Classification	ResNet	.pt	Quant8	224x224	2.79 ms	2.03 ms	2.05 ms	Link
Classification	ResNet	.pt	Float32	224x224	9.21 ms	6.04 ms	8.04 ms	Link
Classification	SqueezeNet	.pt	Quant8	224x224	1.52 ms	1.04 ms	1.05 ms	Link
Classification	SqueezeNet	.pt	Float32	224x224	5.01 ms	3.04 ms	3.05 ms	Link
Classification	VGG	.pt	Quant8	224x224	24.85 ms	17.04 ms	24.04 ms	Link
Classification	VGG	.pt	Float32	224x224	80.3 ms	56.04 ms	49.05 ms	Link

Models for Recognition
Task	Model Name	Source Model Type	Data Type	Input Size	G510 APU Inference Time (ms)	G700 APU Inference Time (ms)	G1200 APU Inference Time (ms)	Detail
Recognition	ShuffleNetV2	.pt	Quant8	224x224	N/A	N/A	N/A	Link
Recognition	ShuffleNetV2	.pt	Float32	224x224	N/A	N/A	N/A	Link
Recognition	VGGFace	.pt	Quant8	224x224	25.04 ms	17.04 ms	24.04 ms	Link
Recognition	VGGFace	.pt	Float32	224x224	81.7 ms	56.04 ms	49.05 ms	Link