ONNX Runtime - Analytical AI
Important
ONNX Runtime support for Genio platforms is under active development. MediaTek will provide feature enhancements and expanded model coverage on a quarterly basis.
The resources in this section support the online inference path for analytical AI workloads on ONNX Runtime. This section provides the ONNX models needed to run these workloads with ONNX Runtime.
ONNX Runtime is available on MediaTek Genio platforms to accelerate ONNX models with hardware support.
High Performance: Utilize the Neuron Execution Provider (EP) on Genio 520 and 720 for NPU acceleration.
Broad Compatibility: Standard CPU EP execution is available across all Genio platforms.
FP16 Focus: FP16 models have the most complete hardware-acceleration support.
Quantized QDQ: QDQ models are only partially supported for NPU acceleration.
For best performance and compatibility, deploy FP16 ONNX models whenever possible.
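The provider choice described above (Neuron EP where available, CPU EP everywhere) can be sketched with the ONNX Runtime Python API. This is a minimal sketch, not the official Genio sample: the provider string `NeuronExecutionProvider` is an assumed placeholder that this page does not confirm, and `choose_providers` and `create_session` are hypothetical helper names. Check the list returned by `onnxruntime.get_available_providers()` on your board image for the real provider name.

```python
# ASSUMPTION: the Neuron EP's registered name is not given on this page.
# "NeuronExecutionProvider" is a placeholder; replace it with the name your
# board image actually reports.
NEURON_EP = "NeuronExecutionProvider"
CPU_EP = "CPUExecutionProvider"  # standard ONNX Runtime CPU provider name


def choose_providers(available, preferred=(NEURON_EP, CPU_EP)):
    """Return the preferred providers that are available, in priority order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError(
            f"None of the preferred providers {list(preferred)} are available; "
            f"found {list(available)}"
        )
    return chosen


def create_session(model_path):
    """Create an InferenceSession that prefers the NPU and falls back to CPU."""
    # Imported lazily so choose_providers() can be used without onnxruntime.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    # On a platform that only offers the CPU EP, the CPU provider is selected.
    print(choose_providers(["CPUExecutionProvider"]))
```

Listing the CPU EP after the NPU provider lets the same script run unchanged on Genio platforms that do not offer NPU acceleration.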
Performance Comparison
MediaTek provides a comprehensive matrix of performance data across different Genio platforms.
To view a quick summary of high-end platforms (G520/G720), see the tables below.
To compare performance across all platforms (G350, G510, G700, G1200), refer to the dedicated page:
Supported Models on Genio Products
The following tables list ONNX models that have been validated on Genio platforms. The current list contains 45 models, grouped into three model families:
TAO Related
Legacy Analytical
Robotic
If your model is not listed in the tables below, you are still welcome to try it. For questions or issues, post on the Genio Community forum.
Note
The performance statistics in these tables were measured using the NPU Execution Provider with performance mode enabled. Results differ across Genio products, models, and data types.
Legacy Analytical Models
Legacy analytical models are classic, widely used vision backbones and networks. They are listed here for benchmarking and reference only; model accuracy is not evaluated.
| Task | Model Name | Data Type | Input Size | Genio 520 Inference Time (ms) | Genio 720 Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| Object Detection | YOLOv5s | Quant8 | 640x640 | 83.512 | | |
| Object Detection | YOLOv5s | Float32 | 640x640 | 35.344 | | |
| Object Detection | YOLOv8s | Quant8 | 640x640 | 120.081 | | |
| Object Detection | YOLO11s | Quant8 | 640x640 | 110.876 | | |
| Task | Model Name | Data Type | Input Size | Genio 520 Inference Time (ms) | Genio 720 Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| Classification | ConvNeXt | Quant8 | 224x224 | 353.309 | | |
| Classification | ConvNeXt | Float32 | 224x224 | 591.345 | | |
| Classification | DenseNet | Quant8 | 224x224 | 54.058 | | |
| Classification | DenseNet | Float32 | 224x224 | 13.917 | | |
| Classification | EfficientNet | Quant8 | 224x224 | 13.456 | | |
| Classification | EfficientNet | Float32 | 224x224 | 2.977 | | |
| Classification | MobileNetV2 | Quant8 | 224x224 | 5.734 | | |
| Classification | MobileNetV2 | Float32 | 224x224 | 1.909 | | |
| Classification | MobileNetV3 | Quant8 | 224x224 | 4.176 | | |
| Classification | MobileNetV3 | Float32 | 224x224 | 9.393 | | |
| Classification | ResNet | Quant8 | 224x224 | 17.595 | | |
| Classification | ResNet | Float32 | 224x224 | 3.659 | | |
| Classification | SqueezeNet | Quant8 | 224x224 | 17.676 | | |
| Classification | SqueezeNet | Float32 | 224x224 | 15.313 | | |
| Classification | VGG | Quant8 | 224x224 | 146.614 | | |
| Classification | VGG | Float32 | 224x224 | 33.738 | | |
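As a reading aid for the classification figures above, the following sketch computes the Quant8-to-Float32 latency ratio for each model. The latency values are copied from the table. The helper names are illustrative only, and the ratios describe these specific measurements rather than a general rule.

```python
# Inference times in ms copied from the classification table: (Quant8, Float32).
LATENCY_MS = {
    "ConvNeXt": (353.309, 591.345),
    "DenseNet": (54.058, 13.917),
    "EfficientNet": (13.456, 2.977),
    "MobileNetV2": (5.734, 1.909),
    "MobileNetV3": (4.176, 9.393),
    "ResNet": (17.595, 3.659),
    "SqueezeNet": (17.676, 15.313),
    "VGG": (146.614, 33.738),
}


def quant8_to_float32_ratio(quant8_ms, float32_ms):
    """A ratio above 1 means the Float32 variant ran faster than Quant8."""
    return quant8_ms / float32_ms


def faster_variant(quant8_ms, float32_ms):
    """Name the data type with the lower measured inference time."""
    return "Float32" if float32_ms < quant8_ms else "Quant8"


if __name__ == "__main__":
    for model, (quant8_ms, float32_ms) in LATENCY_MS.items():
        ratio = quant8_to_float32_ratio(quant8_ms, float32_ms)
        winner = faster_variant(quant8_ms, float32_ms)
        print(f"{model:<13} Quant8/Float32 = {ratio:6.2f}  faster: {winner}")
```

In these measurements the Float32 variant is faster for six of the eight classification models; ConvNeXt and MobileNetV3 are the exceptions.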
| Task | Model Name | Data Type | Input Size | Genio 520 Inference Time (ms) | Genio 720 Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| Recognition | VGGFace | Quant8 | 224x224 | 146.057 | | |
| Recognition | VGGFace | Float32 | 224x224 | 33.762 | | |
Robotic Models
Robotic models target robotic perception and control workloads, such as grasping, navigation, or policy learning. These models demonstrate the performance of ONNX Runtime on Genio platforms for specialized robotic tasks.
| Task | Model Name | Data Type | Input Size | Genio 520 Inference Time (ms) | Genio 720 Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| Omni6DPose | scale_policy | Float32 | | 0.029 | | |
| Dino | dino_vitb8 | Float32 | | 59.66 | | |
| Diffusion Policy | model_diffusion_sampling | Float32 | | 15.16 | | |
| MobileSam | mobilesam_decoder | Float32 | | 12.2 | | |
| MobileSam | mobilesam_encoder | Float32 | | 57.81 | | |
| RegionNormalizedGrasp | anchornet | Float32 | | 25.46 | | |
| RegionNormalizedGrasp | localnet | Float32 | | 6.23 | | |
| YoloWorld | yoloworld_xl | Float32 | | 263.01 | | |
Performance Notes and Limitations
Note
The measurements were obtained using onnxruntime_perf_test, and each model’s performance can vary depending on:
The specific Genio platform and hardware configuration.
The version of the board image and evaluation kit (EVK).
The selected backend and model variant.
To obtain the most accurate performance numbers for your use case, run your application directly on the target platform.
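One way to collect latency numbers on the target platform from your own application is a small timing harness such as the sketch below. It is not `onnxruntime_perf_test` and will not reproduce the figures in the tables exactly. It only times a callable with `time.perf_counter` after a few warm-up runs. The helper name `measure_latency_ms` and its default counts are assumptions chosen for illustration.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=5, iterations=50):
    """Time a zero-argument callable and return latency statistics in ms."""
    # Warm-up runs are excluded so one-time setup cost does not skew results.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; replace it with a real inference call on the board.
    stats = measure_latency_ms(lambda: sum(range(10_000)))
    print({name: round(value, 3) for name, value in stats.items()})
```

With an ONNX Runtime session, the callable would typically wrap `session.run(None, {input_name: input_tensor})`, where `input_name` and `input_tensor` are placeholders for your model's input name and a correctly shaped array.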
Current Limitations
ONNX-GAI (generative AI) models are not officially supported at this time. Some proof-of-concept experiments exist internally, but they are not production-ready and are not part of the validated model list.
For the latest roadmap or early-access updates on ONNX-GAI support, please refer to the Genio Community forum.