ONNX Runtime - Analytical AI

Important

ONNX Runtime support for Genio platforms is under active development. MediaTek will provide feature enhancements and expanded model coverage on a quarterly basis.

The resources in this section support the Online inference path for analytical AI workloads and provide the ONNX models needed to run them with ONNX Runtime.

Analytical AI inference paths: ONNX Runtime

ONNX Runtime is available on MediaTek Genio platforms to accelerate ONNX models with hardware support.

  • High Performance: Utilize the Neuron Execution Provider (EP) on Genio 520 and 720 for NPU acceleration.

  • Broad Compatibility: Standard CPU EP execution is available across all Genio platforms.

  • FP16 Focus: FP16 models have the most complete hardware-acceleration support.

  • Quantized QDQ: Quantized (QDQ) models are partially supported with NPU acceleration.

For best performance and compatibility, deploy FP16 ONNX models whenever possible.
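
The sketch below shows one way an ONNX Runtime session could be set up to prefer the Neuron EP and fall back to the CPU EP. It is a minimal sketch, not MediaTek reference code: the provider string "NeuronExecutionProvider" and the model file name are assumptions for illustration, so confirm the exact registration name reported by ort.get_available_providers() on your board image.

```python
# Minimal sketch: prefer the Neuron EP if it is registered, otherwise fall back to the CPU EP.
# "NeuronExecutionProvider" and the model file name are assumptions; verify the provider
# string reported by ort.get_available_providers() on your Genio board image.
import numpy as np
import onnxruntime as ort

MODEL_PATH = "mobilenetv2_fp16.onnx"  # placeholder FP16 model

available = ort.get_available_providers()
providers = [p for p in ("NeuronExecutionProvider", "CPUExecutionProvider") if p in available]

session = ort.InferenceSession(MODEL_PATH, providers=providers)

# Build a random input matching the model's first input tensor.
inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # replace symbolic dims with 1
dtype = np.float16 if "float16" in inp.type else np.float32
outputs = session.run(None, {inp.name: np.random.rand(*shape).astype(dtype)})

print("Providers in use:", session.get_providers())
```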

Performance Comparison

MediaTek provides a comprehensive matrix of performance data across different Genio platforms.

  • To view a quick summary of high-end platforms (G520/G720), see the tables below.

  • To compare performance across all platforms (G350, G510, G700, G1200), refer to the dedicated page:

View Full Platform Performance Matrix

Supported Models on Genio Products

The following tables list ONNX models that have been validated on Genio platforms. The current list contains 45 models, grouped into three model families:

  • TAO Related

  • Legacy Analytical

  • Robotic

If your model is not listed in the tables below, you are still welcome to try it. For questions or issues, post on the Genio Community forum.
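
Because FP16 models currently have the most complete hardware-acceleration support, converting an unlisted Float32 model to FP16 before trying it is a reasonable first step. The sketch below is one common, general-purpose conversion path using the onnx and onnxconverter-common packages, not a MediaTek-specific tool; the file names are placeholders, and the converted model's accuracy should be re-checked before deployment.

```python
# One possible FP32 -> FP16 conversion path (not a MediaTek-specific tool).
# File names are placeholders; validate the converted model's accuracy afterwards.
import onnx
from onnxconverter_common import float16

model_fp32 = onnx.load("my_model_fp32.onnx")

# keep_io_types=True leaves graph inputs/outputs in float32, so existing pre/post-processing
# code keeps working while weights and internal ops are converted to float16.
model_fp16 = float16.convert_float_to_float16(model_fp32, keep_io_types=True)

onnx.save(model_fp16, "my_model_fp16.onnx")
```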

Note

The performance statistics shown in these tables were measured using the NPU Execution Provider with performance mode enabled, across different Genio products, models, and data types.

Legacy Analytical Models

Legacy analytical models are classic vision backbones and networks, provided here only for benchmarking and reference. Model accuracy is not evaluated.

Models for Detection

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- |
| Object Detection | YOLOv5s | Quant8 | 640x640 | 83.512 | Download |
| Object Detection | YOLOv5s | Float32 | 640x640 | 35.344 | Download |
| Object Detection | YOLOv8s | Quant8 | 640x640 | 120.081 | Download |
| Object Detection | YOLO11s | Quant8 | 640x640 | 110.876 | Download |

Models for Classification

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- |
| Classification | ConvNeXt | Quant8 | 224x224 | 353.309 | Download |
| Classification | ConvNeXt | Float32 | 224x224 | 591.345 | Download |
| Classification | DenseNet | Quant8 | 224x224 | 54.058 | Download |
| Classification | DenseNet | Float32 | 224x224 | 13.917 | Download |
| Classification | EfficientNet | Quant8 | 224x224 | 13.456 | Download |
| Classification | EfficientNet | Float32 | 224x224 | 2.977 | Download |
| Classification | MobileNetV2 | Quant8 | 224x224 | 5.734 | Download |
| Classification | MobileNetV2 | Float32 | 224x224 | 1.909 | Download |
| Classification | MobileNetV3 | Quant8 | 224x224 | 4.176 | Download |
| Classification | MobileNetV3 | Float32 | 224x224 | 9.393 | Download |
| Classification | ResNet | Quant8 | 224x224 | 17.595 | Download |
| Classification | ResNet | Float32 | 224x224 | 3.659 | Download |
| Classification | SqueezeNet | Quant8 | 224x224 | 17.676 | Download |
| Classification | SqueezeNet | Float32 | 224x224 | 15.313 | Download |
| Classification | VGG | Quant8 | 224x224 | 146.614 | Download |
| Classification | VGG | Float32 | 224x224 | 33.738 | Download |

Models for Recognition

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- |
| Recognition | VGGFace | Quant8 | 224x224 | 146.057 | Download |
| Recognition | VGGFace | Float32 | 224x224 | 33.762 | Download |

Robotic Models

Robotic models target robotic perception and control workloads, such as grasping, navigation, or policy learning. These models demonstrate the performance of ONNX Runtime on Genio platforms for specialized robotic tasks.

Models for Robotics

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- |
| Omni6DPose | scale_policy | Float32 | | 0.029 | Download |
| Dino | dino_vitb8 | Float32 | | 59.66 | Download |
| Diffusion Policy | model_diffusion_sampling | Float32 | | 15.16 | Download |
| MobileSam | mobilesam_decoder | Float32 | | 12.2 | Download |
| MobileSam | mobilesam_encoder | Float32 | | 57.81 | Download |
| RegionNormalizedGrasp | anchornet | Float32 | | 25.46 | Download |
| RegionNormalizedGrasp | localnet | Float32 | | 6.23 | Download |
| YoloWorld | yoloworld_xl | Float32 | | 263.01 | Download |

Performance Notes and Limitations

Note

The measurements were obtained using onnxruntime_perf_test, and each model’s performance can vary depending on:

  • The specific Genio platform and hardware configuration.

  • The version of the board image and evaluation kit (EVK).

  • The selected backend and model variant.

To obtain accurate performance numbers for your use case, run your application directly on the target platform.
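
When onnxruntime_perf_test is not included in your image, a rough latency figure can also be taken from the Python API by timing repeated session.run calls on the target, as in the sketch below. This is only a sketch: the model path and provider list are placeholders, and warm-up iterations are included because the first runs typically absorb graph preparation overhead.

```python
# Rough on-device latency measurement (placeholder model path and provider list).
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["CPUExecutionProvider"],  # swap in the NPU/Neuron EP string available on your board
)

inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # replace symbolic dims with 1
dtype = np.float16 if "float16" in inp.type else np.float32
feed = {inp.name: np.random.rand(*shape).astype(dtype)}

for _ in range(10):          # warm-up: lets the execution provider finish graph preparation
    session.run(None, feed)

runs = 100
start = time.perf_counter()
for _ in range(runs):
    session.run(None, feed)
avg_ms = (time.perf_counter() - start) / runs * 1000
print(f"Average inference time over {runs} runs: {avg_ms:.3f} ms")
```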

Current Limitations

ONNX-GAI (generative AI) models are not officially supported at this time. Some proof-of-concept experiments exist internally, but they are not production-ready and are not part of the validated model list.

For the latest roadmap or early-access updates on ONNX-GAI support, please refer to the Genio Community forum.