Performance Benchmarks

This page is a centralized reference for evaluating AI inference performance across MediaTek Genio platforms. It aggregates benchmarking results for analytical AI and generative AI workloads across multiple inference frameworks, including TFLite (LiteRT) and ONNX Runtime.

Important

For platform-specific details and comprehensive performance data, refer to the Model Zoo.

AI Supporting Scope

The following table summarizes the AI capabilities and framework support across different MediaTek Genio platforms.

AI Supporting Scope Across Platforms (X = not supported)

| Platform | OS | TFLite - Analytical AI (Online) | TFLite - Analytical AI (Offline) | TFLite - Generative AI | ONNX Runtime - Analytical AI |
|----------|----|---------------------------------|----------------------------------|------------------------|------------------------------|
| Genio 520/720 | Android | CPU + GPU + NPU | NPU | NPU | CPU + NPU |
| Genio 520/720 | Yocto | CPU + GPU + NPU | NPU | X (ETA: 2026/Q2) | CPU + NPU |
| Genio 510/700 | Android | CPU + GPU + NPU | NPU | X | X |
| Genio 510/700 | Yocto | CPU + GPU + NPU | NPU | X | CPU |
| Genio 510/700 | Ubuntu | CPU + GPU + NPU | NPU | X | X |
| Genio 1200 | Android | CPU + GPU + NPU | NPU | X | X |
| Genio 1200 | Yocto | CPU + GPU + NPU | NPU | X | CPU |
| Genio 1200 | Ubuntu | CPU + GPU + NPU | NPU | X | X |
| Genio 350 | Android | CPU + GPU + NPU | X | X | X |
| Genio 350 | Yocto | CPU + GPU | X | X | CPU |
| Genio 350 | Ubuntu | CPU + GPU | X | X | X |
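For scripted deployment checks, the support matrix above can be encoded as a plain lookup table. The sketch below is illustrative only — the key and workload names are our own shorthand, not an official API — and it encodes just two rows of the matrix:

```python
# Illustrative encoding of part of the support matrix above.
# Keys are (platform, OS); workload names are our own shorthand.
# "X" cells in the table are simply omitted here.
SUPPORT = {
    ("Genio 520/720", "Android"): {
        "tflite_analytical_online": ("CPU", "GPU", "NPU"),
        "tflite_analytical_offline": ("NPU",),
        "tflite_generative": ("NPU",),
        "onnx_analytical": ("CPU", "NPU"),
    },
    ("Genio 510/700", "Yocto"): {
        "tflite_analytical_online": ("CPU", "GPU", "NPU"),
        "tflite_analytical_offline": ("NPU",),
        "onnx_analytical": ("CPU",),
    },
}

def supports(platform: str, os_name: str, workload: str) -> bool:
    """True if the workload has at least one supported backend on this platform/OS."""
    return bool(SUPPORT.get((platform, os_name), {}).get(workload))

print(supports("Genio 520/720", "Android", "tflite_generative"))  # -> True
print(supports("Genio 510/700", "Yocto", "tflite_generative"))    # -> False
```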

TFLite (LiteRT) - Analytical AI

The following tables list validated TFLite analytical models and their performance across Genio platforms. All statistics were measured using offline inference with performance mode enabled.
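Single-stream inference latency converts directly to throughput: a latency of t ms corresponds to 1000 / t inferences per second. As a worked example, this helper uses the YOLOv5s (Quant8, 640x640) Genio 520 figure from the detection table:

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a single-stream inference latency (ms) to throughput (inferences/s)."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms

# YOLOv5s (Quant8, 640x640) on Genio 520: 5.35 ms per inference,
# i.e. roughly 187 sequential inferences per second.
print(round(latency_ms_to_fps(5.35)))  # -> 187
```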

Models for Detection

| Task | Model Name | Source Model Type | Data Type | Input Size | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
|------|------------|-------------------|-----------|------------|----------------|----------------|----------------|----------------|-----------------|-------------|--------|
| Object Detection | YOLOv5s | .pt | Quant8 | 640x640 | 5.35 | 5.39 | 17.47 | 10.04 | 19.05 | 3.42 | Link |
| Object Detection | YOLOv5s | .pt | Float32 | 640x640 | 16.37 | 16.23 | 46.41 | 32.04 | 36.66 | 11.4 | Link |
| Object Detection | YOLOv8s | .pt | Quant8 | 640x640 | 7.85 | 7.63 | 25.51 | 17.01 | 28.04 | 5.64 | Link |
| Object Detection | YOLOv8s | .pt | Float32 | 640x640 | 24.22 | 31.07 | 70.95 | 50.04 | 55.84 | 16.34 | Link |

Models for Classification

| Task | Model Name | Source Model Type | Data Type | Input Size | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
|------|------------|-------------------|-----------|------------|----------------|----------------|----------------|----------------|-----------------|-------------|--------|
| Classification | DenseNet | .pt | Quant8 | 224x224 | 3.68 | 4.08 | 7.03 | 5.03 | 6.04 | 2.4 | Link |
| Classification | DenseNet | .pt | Float32 | 224x224 | 7.51 | 8.41 | 16.51 | 11.04 | 12.04 | 4.74 | Link |
| Classification | EfficientNet | .pt | Quant8 | 224x224 | 1.49 | 1.9 | 4.05 | 3 | 3.05 | 1.16 | Link |
| Classification | EfficientNet | .pt | Float32 | 224x224 | 3.03 | 3.73 | 9.03 | 6.04 | 6.05 | 2.21 | Link |
| Classification | MobileNetV2 | .pt | Quant8 | 224x224 | 1.04 | 1.19 | 1.37 | 1.04 | 1.04 | 0.78 | Link |
| Classification | MobileNetV2 | .pt | Float32 | 224x224 | 1.89 | 2.17 | 3.57 | 2.04 | 2.58 | 1.29 | Link |
| Classification | MobileNetV3 | .pt | Quant8 | 224x224 | 0.73 | 1.04 | 1.04 | 0.04 | N/A | 0.64 | Link |
| Classification | MobileNetV3 | .pt | Float32 | 224x224 | 1.19 | 1.56 | 2.72 | 1.05 | 2.05 | 0.97 | Link |
| Classification | ResNet | .pt | Quant8 | 224x224 | 1.5 | 1.65 | 2.79 | 2.03 | 2.05 | 1.08 | Link |
| Classification | ResNet | .pt | Float32 | 224x224 | 3.82 | 5.19 | 9.21 | 6.04 | 8.04 | 2.56 | Link |
| Classification | SqueezeNet | .pt | Quant8 | 224x224 | 1.19 | 1.77 | 1.52 | 1.04 | 1.05 | 0.86 | Link |
| Classification | SqueezeNet | .pt | Float32 | 224x224 | 2.19 | 3.19 | 5.01 | 3.04 | 3.05 | 1.64 | Link |
| Classification | VGG | .pt | Quant8 | 224x224 | 10.91 | 12.74 | 24.85 | 17.04 | 24.04 | 6.47 | Link |
| Classification | VGG | .pt | Float32 | 224x224 | 33.42 | 43.34 | 80.3 | 56.04 | 49.05 | 19.87 | Link |

Models for Recognition

| Task | Model Name | Source Model Type | Data Type | Input Size | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
|------|------------|-------------------|-----------|------------|----------------|----------------|----------------|----------------|-----------------|-------------|--------|
| Recognition | VGGFace | .pt | Quant8 | 224x224 | 11.23 | 13.34 | 25.04 | 17.04 | 24.04 | 6.58 | Link |
| Recognition | VGGFace | .pt | Float32 | 224x224 | 34.05 | 44.35 | 81.7 | 56.04 | 49.05 | 20 | Link |

TFLite (LiteRT) - Generative AI

For generative AI workloads, the following tables provide representative performance data for reference and platform capability validation.
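Prompt mode reflects prefill throughput (processing the input prompt in parallel), while generative mode reflects sequential decode throughput, which is why the generative figures are roughly an order of magnitude lower. A rough end-to-end latency estimate combines the two rates. In the sketch below the token counts are illustrative, and the rates are the DeepSeek-R1-Distill-Llama-8B / Genio 720 values from the tables that follow:

```python
def estimate_latency_s(prompt_tokens: int, output_tokens: int,
                       prefill_tok_s: float, decode_tok_s: float) -> float:
    """Rough end-to-end LLM latency: prefill the prompt, then decode token by token."""
    return prompt_tokens / prefill_tok_s + output_tokens / decode_tok_s

# DeepSeek-R1-Distill-Llama-8B on Genio 720: 36.653 tok/s prefill, 4.578 tok/s decode.
# Example request: 128-token prompt, 256-token reply.
t = estimate_latency_s(128, 256, 36.653, 4.578)
print(f"{t:.1f} s")  # -> 59.4 s (decode dominates the total)
```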

LLM Performance Comparison

Prompt Mode Comparison (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| DeepSeek-R1-Distill-Llama-8B | 36.653 | 29.322 | 425.791 |
| DeepSeek-R1-Distill-Qwen-1.5B | 341.686 | 273.349 | 1057.25 |
| DeepSeek-R1-Distill-Qwen-7B | 69.23 | 55.384 | 448.167 |
| gemma2-2b-it | 193.392 | 154.714 | 891.004 |
| internlm2-chat-1_8b | 276.218 | 220.974 | 1544.7 |
| llama3-8b | 56.495 | 45.196 | 426.125 |
| llama3.2-1B-Instruct | 401.288 | 321.03 | 2093.61 |
| llama3.2-3B-Instruct | 154.557 | 123.646 | 1022.95 |
| Qwen2-0.5B-Instruct | 762.455 | 609.964 | 3010.84 |
| Qwen2-1.5B-Instruct | 341.993 | 273.594 | 1616.22 |
| Qwen2-7B-Instruct | 70.416 | 56.333 | 474.383 |
| Qwen1.5-1.8B-Chat | 310.639 | 248.511 | 1516.5 |
| Qwen2.5-1.5B-Instruct | 341.418 | 273.134 | 1621.85 |
| Qwen2.5-3B-Instruct | 162.481 | 120 | 751.056 |
| Qwen2.5-7B-Instruct | 70.548 | 56.438 | 471.945 |
| Qwen3 1.7B | 233.032 | 186.426 | 1069.16 |
| Phi-3-mini-4k-instruct | 129.6 | 103.68 | 828.868 |
| MiniCPM-2B-sft-bf16-llama-format | 194.793 | 155.834 | 886.721 |
| medusa_v1_0_vicuna_7b_v1.5 | 91.821 | 73.457 | 501.053 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 84.895 | 67.916 | 454.583 |
| llava1.5-7b-speculative-decoding | 73.103 | 58.482 | 267.981 |
| baichuan-7b-int8-cache | 81.184 | 64.947 | 561.762 |
| baichuan-7b | 79.745 | 63.796 | 536.642 |

Generative Mode Comparison (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| DeepSeek-R1-Distill-Llama-8B | 4.578 | 3.662 | 11.359 |
| DeepSeek-R1-Distill-Qwen-1.5B | 11.764 | 9.411 | 25.681 |
| DeepSeek-R1-Distill-Qwen-7B | 4.677 | 3.742 | 11.693 |
| gemma2-2b-it | 8.752 | 7.002 | 21.372 |
| internlm2-chat-1_8b | 17.549 | 14.039 | 42.393 |
| llama3-8b | 4.698 | 3.758 | 11.512 |
| llama3.2-1B-Instruct | 24.533 | 19.626 | 61.144 |
| llama3.2-3B-Instruct | 10.577 | 8.462 | 25.048 |
| Qwen2-0.5B-Instruct | 50.06 | 40.048 | 77.871 |
| Qwen2-1.5B-Instruct | 19.563 | 15.65 | 38.314 |
| Qwen2-7B-Instruct | 4.883 | 3.906 | 11.642 |
| Qwen1.5-1.8B-Chat | 9.895 | 7.916 | 31.383 |
| Qwen2.5-1.5B-Instruct | 18.427 | 14.742 | 38.574 |
| Qwen2.5-3B-Instruct | 10.31 | 7.84 | 20.868 |
| Qwen2.5-7B-Instruct | 4.892 | 3.914 | 11.739 |
| Qwen3 1.7B | 10.911 | 8.729 | 23.424 |
| Phi-3-mini-4k-instruct | 7.324 | 5.859 | 18.869 |
| MiniCPM-2B-sft-bf16-llama-format | 7.694 | 6.155 | 22.275 |
| medusa_v1_0_vicuna_7b_v1.5 | 10.564 | 8.451 | 22.787 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 12.6489 | 10.119 | 22.722 |
| llava1.5-7b-speculative-decoding | 7.281 | 5.825 | 6.779 |
| baichuan-7b-int8-cache | 4.239 | 3.391 | 11.37 |
| baichuan-7b | 4.182 | 3.346 | 10.56 |

VLM Performance Comparison

ViT Inference Time (Unit: s)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| Qwen2.5 VL 3B | 0.208 | 0.26 | 0.096 |
| InternVL3-1B | 1.744 | 2.18 | 0.508 |

Prompt Mode (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| Qwen2.5 VL 3B | 100.065 | 80.052 | 339.901 |
| InternVL3-1B | 74.748 | 59.798 | 183.641 |

Generative Mode (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| Qwen2.5 VL 3B | 4.776 | 3.821 | 10.1337 |
| InternVL3-1B | 6.157 | 4.926 | 14.094 |

Stable Diffusion Performance Comparison

Main Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| Stable Diffusion v.1.5 | 25816 | 32270 | 7075 |
| Stable Diffusion v.1.5 controlnet | 33642 | 42053 | 9395 |
| Stable_diffusion_v1_5_controlnet_lora | 34148 | 42685 | 10268 |
| Stable_diffusion_v1.5_2lora | 35978 | 44973 | 11487 |
| Stable Diffusion v2.1 base model with controlnet | 31183 | 38979 | 6969 |
| Stable Diffusion v1.5 LCM Ipadaptor | 10645 | 13306 | 2254 |
| Stable_diffusion_lcm_multiDiffusion | 29103.565 | 36379.456 | 7438.723 |

Inference Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| Stable Diffusion v.1.5 | 24813 | 31016 | 6132 |
| Stable Diffusion v.1.5 controlnet | 32294 | 40368 | 8035 |
| Stable_diffusion_v1_5_controlnet_lora | 32454 | 40568 | 8472 |
| Stable_diffusion_v1.5_2lora | 33195 | 41494 | 10130 |
| Stable Diffusion v2.1 base model with controlnet | 29828 | 37285 | 5451 |
| Stable Diffusion v1.5 LCM Ipadaptor | 5861 | 7326 | 1077 |
| Stable_diffusion_lcm_multiDiffusion | 28126.856 | 35158.57 | 6697.967 |
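"Main Time" consistently exceeds "Inference Time" for the same pipeline; the gap presumably covers non-inference work such as pre/post-processing and data movement. It can be read straight off the two Stable Diffusion tables above; here for two rows of the Genio 720 column:

```python
# Main vs. core inference time (ms) on Genio 720, copied from the tables above.
main_ms = {
    "Stable Diffusion v.1.5": 25816,
    "Stable Diffusion v1.5 LCM Ipadaptor": 10645,
}
infer_ms = {
    "Stable Diffusion v.1.5": 24813,
    "Stable Diffusion v1.5 LCM Ipadaptor": 5861,
}

# Per-pipeline overhead = main time minus core inference time.
overhead_ms = {name: main_ms[name] - infer_ms[name] for name in main_ms}
print(overhead_ms)
# -> {'Stable Diffusion v.1.5': 1003, 'Stable Diffusion v1.5 LCM Ipadaptor': 4784}
```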

CLIP Performance Comparison

Main Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| img_encoder_proj_clip_vit_large_dynamic | 567.61 | 709.513 | 358.609 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 12035.52 | 15044.4 | 1390.56 |
| img_encoder_proj_openclip_vit_h_dynamic | 1440.197 | 1800.246 | 591.931 |
| text_encoder_clip_vit_large | 455.079 | 568.849 | 308.718 |
| text_encoder_openclip_vit_h | 750.703 | 938.379 | 510.919 |

Inference Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
|-------|-----------|-----------|--------|
| img_encoder_proj_clip_vit_large_dynamic | 257.388 | 321.735 | 51.135 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 3142.959 | 3928.699 | 517.126 |
| img_encoder_proj_openclip_vit_h_dynamic | 881.647 | 1102.059 | 147.467 |
| text_encoder_clip_vit_large | 38.993 | 48.741 | 18.938 |
| text_encoder_openclip_vit_h | 119.77 | 149.713 | 48.485 |

ONNX Runtime - Analytical AI

The following tables list ONNX models validated on Genio platforms. Measurements were obtained using the NPU Execution Provider (where available) with performance mode enabled.
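In ONNX Runtime, the execution provider list passed to a session is ordered by preference, with later entries acting as fallbacks. Below is a minimal sketch of preferring an NPU provider when it is available; the provider name `"NeuronExecutionProvider"` is an assumed placeholder, so query `onnxruntime.get_available_providers()` on the target board image for the actual name:

```python
def choose_providers(available, preferred_npu="NeuronExecutionProvider"):
    """Order execution providers so the NPU provider (if present) is tried first,
    always keeping CPUExecutionProvider as the final fallback."""
    providers = []
    if preferred_npu in available:
        providers.append(preferred_npu)
    providers.append("CPUExecutionProvider")
    return providers

# With onnxruntime installed on the target, a session would then be created as:
#   import onnxruntime as ort
#   sess = ort.InferenceSession(
#       "model.onnx",
#       providers=choose_providers(ort.get_available_providers()))
print(choose_providers(["NeuronExecutionProvider", "CPUExecutionProvider"]))
# -> ['NeuronExecutionProvider', 'CPUExecutionProvider']
```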

Models for TAO-Related Tasks (Genio 520 / Genio 720)

A dash indicates the input size was not listed.

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
|------|------------|-----------|------------|---------------------|------------------|
| Object Detection | PeopleNet (ResNet34) | Float32 | - | 114 | Download |
| Object Detection | PeopleNet (ResNet34) | Float32 | - | 42 | Download |
| Recognition | Action Recognition Net (ResNet18) | Float32 | - | 27 | Download |
| Pose Estimation | BodyPoseNet | Float32 | - | 72 | Download |
| Object Detection | LPDNet (USA Pruned) | Float32 | - | 6.5 | Download |
| Segmentation | PeopleSemSegNet_AMR | Float32 | - | 932 | Download |
| Segmentation | PeopleSemSegNet_AMR (Rel) | Float32 | - | 54 | Download |
| Segmentation | PeopleSemSegNet (ShuffleSeg) | Float32 | - | 56 | Download |
| Segmentation | PeopleSemSegNet (Vanilla Unet) | Float32 | - | 1138 | Download |
| Re-Identification | ReIdentificationNet (ResNet50) | Float32 | - | 11 | Download |
| Classification | Retail Object Recognition | Float32 | - | 34 | Download |
| OCR | Ocrnet_resnet50 | Float32 | - | 39 | Download |
| OCR | Ocrnet_resnet50 (Pruned) | Float32 | - | 34 | Download |
| OCR | ocd_resnet50 | Float32 | 736x1280 | 700 | Download |
| OCR | ocd_resnet50 | Float32 | 640x640 | 323 | Download |
| OCR | ocdnet_mixnet | Float32 | 640x640 | 1116 | Download |
| Classification | Pose Classification (ST-GCN) | Float32 | - | 352 | Download |
| Pose Estimation | Centerpose (Chair DLA34) | Float32 | - | 777 | Download |
| Pose Estimation | Centerpose (Camera FAN) | Float32 | - | 5741 | Download |
| Object Detection | LPDNet (CCPD Pruned) | Float32 | - | 12 | Download |
| Pose Estimation | Foundation Pose (Refiner) | Float32 | - | 68.7 | Download |
| Pose Estimation | Foundation Pose (Score) | Float32 | - | 37.1 | Download |
| Pose Estimation | Multi 3D Centerpose | Float32 | - | 442.7 | Download |

Models for Detection (Genio 520 / Genio 720)

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
|------|------------|-----------|------------|---------------------|------------------|
| Object Detection | YOLOv5s | Quant8 | 640x640 | 83.512 | Download |
| Object Detection | YOLOv5s | Float32 | 640x640 | 35.344 | Download |
| Object Detection | YOLOv8s | Quant8 | 640x640 | 120.081 | Download |
| Object Detection | YOLO11s | Quant8 | 640x640 | 110.876 | Download |

Models for Classification (Genio 520 / Genio 720)

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
|------|------------|-----------|------------|---------------------|------------------|
| Classification | ConvNeXt | Quant8 | 224x224 | 353.309 | Download |
| Classification | ConvNeXt | Float32 | 224x224 | 591.345 | Download |
| Classification | DenseNet | Quant8 | 224x224 | 54.058 | Download |
| Classification | DenseNet | Float32 | 224x224 | 13.917 | Download |
| Classification | EfficientNet | Quant8 | 224x224 | 13.456 | Download |
| Classification | EfficientNet | Float32 | 224x224 | 2.977 | Download |
| Classification | MobileNetV2 | Quant8 | 224x224 | 5.734 | Download |
| Classification | MobileNetV2 | Float32 | 224x224 | 1.909 | Download |
| Classification | MobileNetV3 | Quant8 | 224x224 | 4.176 | Download |
| Classification | MobileNetV3 | Float32 | 224x224 | 9.393 | Download |
| Classification | ResNet | Quant8 | 224x224 | 17.595 | Download |
| Classification | ResNet | Float32 | 224x224 | 3.659 | Download |
| Classification | SqueezeNet | Quant8 | 224x224 | 17.676 | Download |
| Classification | SqueezeNet | Float32 | 224x224 | 15.313 | Download |
| Classification | VGG | Quant8 | 224x224 | 146.614 | Download |
| Classification | VGG | Float32 | 224x224 | 33.738 | Download |

Models for Recognition (Genio 520 / Genio 720)

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
|------|------------|-----------|------------|---------------------|------------------|
| Recognition | VGGFace | Quant8 | 224x224 | 146.057 | Download |
| Recognition | VGGFace | Float32 | 224x224 | 33.762 | Download |

Models for Robotics (Genio 520 / Genio 720)

A dash indicates the input size was not listed.

| Task | Model Name | Data Type | Input Size | Inference Time (ms) | Pretrained Model |
|------|------------|-----------|------------|---------------------|------------------|
| Omni6DPose | scale_policy | Float32 | - | 0.029 | Download |
| Dino | dino_vitb8 | Float32 | - | 59.66 | Download |
| Diffusion Policy | model_diffusion_sampling | Float32 | - | 15.16 | Download |
| MobileSam | mobilesam_decoder | Float32 | - | 12.2 | Download |
| MobileSam | mobilesam_encoder | Float32 | - | 57.81 | Download |
| RegionNormalizedGrasp | anchornet | Float32 | - | 25.46 | Download |
| RegionNormalizedGrasp | localnet | Float32 | - | 6.23 | Download |
| YoloWorld | yoloworld_xl | Float32 | - | 263.01 | Download |

Performance Notes

Performance can vary depending on:

  • The specific Genio platform and hardware configuration.

  • The version of the board image and evaluation kit (EVK).

  • The selected backend and model variant.

To obtain the most accurate performance numbers for your use case, run the application directly on the target platform.