Performance Benchmarks

This page provides a centralized reference for evaluating AI inference performance across MediaTek Genio platforms. It aggregates benchmark results for a range of AI workloads, including analytical AI and generative AI, across multiple inference frameworks such as TFLite (LiteRT) and ONNX Runtime.

Important

For platform-specific details and comprehensive performance data, refer to the Model Zoo.

AI Support Scope

The following table summarizes the AI capabilities and framework support across different MediaTek Genio platforms.

AI Support Scope Across Genio Platforms

| Platform | OS | TFLite - Analytical AI (Online) | TFLite - Analytical AI (Offline) | TFLite - Generative AI | ONNX Runtime - Analytical AI |
| --- | --- | --- | --- | --- | --- |
| Genio 520/720 | Android | CPU + GPU + NPU | NPU | NPU | CPU + NPU |
| Genio 520/720 | Yocto | CPU + GPU + NPU | NPU | X (ETA: 2026/Q2) | CPU + NPU |
| Genio 510/700 | Android | CPU + GPU + NPU | NPU | X | X |
| Genio 510/700 | Yocto | CPU + GPU + NPU | NPU | X | CPU |
| Genio 510/700 | Ubuntu | CPU + GPU + NPU | NPU | X | X |
| Genio 1200 | Android | CPU + GPU + NPU | NPU | X | X |
| Genio 1200 | Yocto | CPU + GPU + NPU | NPU | X | CPU |
| Genio 1200 | Ubuntu | CPU + GPU + NPU | NPU | X | X |
| Genio 350 | Android | CPU + GPU + NPU | X | X | X |
| Genio 350 | Yocto | CPU + GPU | X | X | CPU |
| Genio 350 | Ubuntu | CPU + GPU | X | X | X |

X indicates the combination is not supported.

TFLite (LiteRT) - Analytical AI

The following tables list validated TFLite analytical AI models and their performance across Genio platforms. All statistics were measured using offline inference with performance mode enabled.
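For reference, the sketch below shows one way to reproduce a latency measurement like those in the tables using the TFLite (LiteRT) Python runtime. The model file and delegate path are placeholders; on Genio boards, offline NPU inference is typically driven through MediaTek's NeuroPilot tooling rather than this generic flow, so treat this as an illustrative timing loop under stated assumptions, not the exact procedure used to generate the numbers above.

```python
# Minimal latency-measurement sketch with the LiteRT/TFLite Python runtime.
# The model file and delegate path below are hypothetical placeholders.
import time
import numpy as np
import tflite_runtime.interpreter as tflite

MODEL = "mobilenet_v2_quant8.tflite"          # hypothetical model file
DELEGATE = "/usr/lib/libtflite_delegate.so"   # hypothetical accelerator delegate

try:
    delegates = [tflite.load_delegate(DELEGATE)]
except (OSError, ValueError):
    delegates = []  # fall back to CPU if the delegate is unavailable

interpreter = tflite.Interpreter(model_path=MODEL, experimental_delegates=delegates)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
dummy = np.zeros(inp["shape"], dtype=inp["dtype"])

# Warm up, then average over repeated runs to approximate a steady-state
# inference time, similar to what benchmark tools report.
for _ in range(5):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()

runs = 50
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
print(f"Average inference time: {(time.perf_counter() - start) / runs * 1000:.2f} ms")
```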

Models for Detection

| Task | Model Name | Source model type | Data Type | Input Size | Genio 360 (ms) | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Object Detection | YOLOv5s | .pt | Quant8 | 640x640 | 8.68 | 6.32 | 5.35 | 16.00 | 10.10 | 19.00 | 3.42 | Link |
| Object Detection | YOLOv5s | .pt | Float32 | 640x640 | 28.25 | 18.78 | 16.66 | 44.90 | 32.30 | 37.00 | 11.40 | Link |
| Object Detection | YOLOv8s | .pt | Quant8 | 640x640 | 12.92 | 9.31 | 8.04 | 24.00 | 17.00 | 28.40 | 5.64 | Link |
| Object Detection | YOLOv8s | .pt | Float32 | 640x640 | 42.41 | 27.13 | 24.41 | 69.30 | 50.30 | 54.60 | 16.34 | Link |

Models for Classification

| Task | Model Name | Source model type | Data Type | Input Size | Genio 360 (ms) | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Classification | DenseNet | .pt | Quant8 | 224x224 | 5.86 | 4.46 | 3.83 | 7.30 | 5.30 | 6.00 | 2.40 | Link |
| Classification | DenseNet | .pt | Float32 | 224x224 | 12.07 | 9.21 | 7.72 | 16.30 | 11.30 | 12.40 | 4.74 | Link |
| Classification | EfficientNet | .pt | Quant8 | 224x224 | 2.20 | 1.83 | 1.56 | 4.00 | 3.00 | 3.00 | 1.16 | Link |
| Classification | EfficientNet | .pt | Float32 | 224x224 | 4.85 | 3.57 | 3.15 | 9.30 | 6.00 | 6.10 | 2.21 | Link |
| Classification | MobileNetV2 | .pt | Quant8 | 224x224 | 1.28 | 1.13 | 0.97 | 1.00 | 1.00 | 1.00 | 0.78 | Link |
| Classification | MobileNetV2 | .pt | Float32 | 224x224 | 2.60 | 2.06 | 1.71 | 3.00 | 2.00 | 2.10 | 1.29 | Link |
| Classification | MobileNetV3 | .pt | Quant8 | 224x224 | 0.91 | 0.80 | 0.74 | 1.00 | 0.40 | 0.90 | 0.64 | Link |
| Classification | MobileNetV3 | .pt | Float32 | 224x224 | 1.67 | 1.36 | 1.20 | 2.00 | 1.10 | 2.00 | 0.97 | Link |
| Classification | ResNet | .pt | Quant8 | 224x224 | 2.25 | 1.71 | 1.48 | 2.00 | 2.00 | 2.10 | 1.08 | Link |
| Classification | ResNet | .pt | Float32 | 224x224 | 6.20 | 4.35 | 3.86 | 9.30 | 6.30 | 8.40 | 2.56 | Link |
| Classification | SqueezeNet | .pt | Quant8 | 224x224 | 1.55 | 1.21 | 1.11 | 1.00 | 1.00 | 1.00 | 0.86 | Link |
| Classification | SqueezeNet | .pt | Float32 | 224x224 | 3.57 | 2.49 | 2.31 | 4.10 | 3.00 | 3.00 | 1.64 | Link |
| Classification | VGG | .pt | Quant8 | 224x224 | 17.72 | 13.47 | 11.17 | N/A | N/A | 24.40 | 6.47 | Link |
| Classification | VGG | .pt | Float32 | 224x224 | 69.25 | 38.44 | 33.17 | 80.30 | 56.30 | 50.40 | 19.87 | Link |

Models for Recognition

| Task | Model Name | Source model type | Data Type | Input Size | Genio 360 (ms) | Genio 520 (ms) | Genio 720 (ms) | Genio 510 (ms) | Genio 700 (ms) | Genio 1200 (ms) | MT8893 (ms) | Detail |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Recognition | VGGFace | .pt | Quant8 | 224x224 | 18.21 | 13.89 | 11.50 | N/A | N/A | 24.40 | 6.58 | Link |
| Recognition | VGGFace | .pt | Float32 | 224x224 | 71.01 | 39.26 | 33.80 | 81.30 | 56.30 | 49.50 | 20.00 | Link |

TFLite (LiteRT) - Generative AI

For Generative AI workloads, the following tables provide representative performance data for reference and platform capability validation.
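Throughput in the LLM tables is reported in tokens per second (tok/s). Prompt mode typically reflects prefill throughput (how quickly the input prompt is processed), while generative mode reflects decode throughput (how quickly new tokens are emitted). The snippet below is a minimal, hypothetical illustration of how the two figures are computed from measured token counts and elapsed times; the example values are not taken from the tables.

```python
# Illustrative tok/s calculation, assuming the prefill and decode phases of a
# text-generation run have been timed separately (all names and values here
# are hypothetical examples).
def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    return token_count / elapsed_s

prompt_tokens, prefill_s = 128, 0.32   # example prefill measurement
new_tokens, decode_s = 256, 10.4       # example decode measurement

print(f"Prompt mode:     {tokens_per_second(prompt_tokens, prefill_s):.2f} tok/s")
print(f"Generative mode: {tokens_per_second(new_tokens, decode_s):.2f} tok/s")
```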

LLM Performance Comparison

Prompt Mode Comparison (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| DeepSeek-R1-Distill-Llama-8B | 36.65 | Not Support | 425.79 |
| DeepSeek-R1-Distill-Qwen-1.5B | 341.69 | 276.82 | 1057.25 |
| DeepSeek-R1-Distill-Qwen-7B | 69.23 | Not Support | 448.17 |
| gemma2-2b-it | 193.39 | Not Support | 891.00 |
| internlm2-chat-1.8b | 276.22 | 243.61 | 1544.70 |
| llama3-8b | 56.50 | Not Support | 426.13 |
| llama3.2-1B-Instruct | 401.29 | 335.92 | 2093.61 |
| llama3.2-3B-Instruct | 154.56 | Not Support | 1022.95 |
| Qwen2-0.5B-Instruct | 762.46 | 641.70 | 3010.84 |
| Qwen2-1.5B-Instruct | 341.99 | 274.50 | 1616.22 |
| Qwen2-7B-Instruct | 70.42 | Not Support | 474.38 |
| Qwen1.5-1.8B-Chat | 310.64 | 268.23 | 1516.50 |
| Qwen2.5-1.5B-Instruct | 341.42 | 269.87 | 1621.85 |
| Qwen2.5-3B-Instruct | 162.48 | Not Support | 751.06 |
| Qwen2.5-7B-Instruct | 70.55 | Not Support | 471.95 |
| Qwen3 1.7B | 233.03 | 229.63 | 1069.16 |
| Phi-3-mini-4k-instruct | 129.60 | Not Support | 828.87 |
| MiniCPM-2B-sft-bf16-llama-format | 194.79 | 153.14 | 886.72 |
| medusa_v1_0_vicuna_7b_v1.5 | 91.82 | Not Support | 501.05 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 84.90 | Not Support | 454.58 |
| llava1.5-7b-speculative-decoding | 73.10 | Not Support | 267.98 |
| baichuan-7b-int8-cache | 81.18 | Not Support | 561.76 |
| baichuan-7b | 79.75 | Not Support | 536.64 |

Generative Mode Comparison (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| DeepSeek-R1-Distill-Llama-8B | 4.58 | Not Support | 11.36 |
| DeepSeek-R1-Distill-Qwen-1.5B | 11.76 | 9.18 | 25.68 |
| DeepSeek-R1-Distill-Qwen-7B | 4.68 | Not Support | 11.69 |
| gemma2-2b-it | 8.75 | Not Support | 21.37 |
| internlm2-chat-1.8b | 17.55 | 14.48 | 42.39 |
| llama3-8b | 4.70 | Not Support | 11.51 |
| llama3.2-1B-Instruct | 24.53 | 20.59 | 61.14 |
| llama3.2-3B-Instruct | 10.58 | Not Support | 25.05 |
| Qwen2-0.5B-Instruct | 50.06 | 42.43 | 77.87 |
| Qwen2-1.5B-Instruct | 19.56 | 15.32 | 38.31 |
| Qwen2-7B-Instruct | 4.88 | Not Support | 11.64 |
| Qwen1.5-1.8B-Chat | 9.90 | 8.63 | 31.38 |
| Qwen2.5-1.5B-Instruct | 18.43 | 15.11 | 38.57 |
| Qwen2.5-3B-Instruct | 10.31 | Not Support | 20.87 |
| Qwen2.5-7B-Instruct | 4.89 | Not Support | 11.74 |
| Qwen3 1.7B | 10.91 | 10.53 | 23.42 |
| Phi-3-mini-4k-instruct | 7.32 | Not Support | 18.87 |
| MiniCPM-2B-sft-bf16-llama-format | 7.69 | 6.48 | 22.28 |
| medusa_v1_0_vicuna_7b_v1.5 | 10.56 | Not Support | 22.79 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 12.65 | Not Support | 22.72 |
| llava1.5-7b-speculative-decoding | 7.28 | Not Support | 6.78 |
| baichuan-7b-int8-cache | 4.24 | Not Support | 11.37 |
| baichuan-7b | 4.18 | Not Support | 10.56 |

VLM Performance Comparison

ViT Inference Time (Unit: s)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| Qwen2.5 VL 3B | 0.21 | Not Support | 0.10 |
| InternVL3-1B | 1.74 | 1.94 | 0.51 |

Prompt Mode (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| Qwen2.5 VL 3B | 100.07 | Not Support | 339.90 |
| InternVL3-1B | 74.75 | 65.23 | 183.64 |

Generative Mode (Unit: tok/s)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| Qwen2.5 VL 3B | 4.78 | Not Support | 10.13 |
| InternVL3-1B | 6.16 | 4.52 | 14.09 |

Stable Diffusion Performance Comparison

Main Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| Stable Diffusion v.1.5 | 25816 | 33754 | 7075 |
| Stable Diffusion v.1.5 controlnet | 33642 | 47923 | 9395 |
| Stable_diffusion_v1_5_controlnet_lora | 34148 | Not Support | 10268 |
| Stable_diffusion_v1.5_2lora | 35978 | 45619 | 11487 |
| Stable Diffusion v2.1 base model with controlnet | 31183 | Not Support | 6969 |
| Stable Diffusion v1.5 LCM Ipadaptor | 10645 | 11378 | 2254 |
| Stable_diffusion_lcm_multiDiffusion | 29104 | 34731 | 7439 |

Inference Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| Stable Diffusion v.1.5 | 24813 | 29001 | 6132 |
| Stable Diffusion v.1.5 controlnet | 32294 | 37581 | 8035 |
| Stable_diffusion_v1_5_controlnet_lora | 32454 | Not Support | 8472 |
| Stable_diffusion_v1.5_2lora | 33195 | 39198 | 10130 |
| Stable Diffusion v2.1 base model with controlnet | 29828 | Not Support | 5451 |
| Stable Diffusion v1.5 LCM Ipadaptor | 5861 | 4978 | 1077 |
| Stable_diffusion_lcm_multiDiffusion | 28127 | 32585 | 6698 |

CLIP Performance Comparison

Main Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| img_encoder_proj_clip_vit_large_dynamic | 567.61 | 662.22 | 358.61 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 12035.52 | Not Support | 1390.56 |
| img_encoder_proj_openclip_vit_h_dynamic | 1440.20 | Not Support | 591.93 |
| text_encoder_clip_vit_large | 455.08 | 748.45 | 308.72 |
| text_encoder_openclip_vit_h | 750.70 | Not Support | 510.92 |

Inference Time Comparison (Unit: ms)

| Model | Genio 720 | Genio 520 | MT8893 |
| --- | --- | --- | --- |
| img_encoder_proj_clip_vit_large_dynamic | 257.39 | 296.84 | 51.14 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 3142.96 | Not Support | 517.13 |
| img_encoder_proj_openclip_vit_h_dynamic | 881.65 | Not Support | 147.47 |
| text_encoder_clip_vit_large | 38.99 | 45.56 | 18.94 |
| text_encoder_openclip_vit_h | 119.77 | Not Support | 48.49 |

ONNX Runtime - Analytical AI

The following tables list ONNX models validated on Genio platforms. Measurements were obtained using the NPU Execution Provider (where available) with performance mode enabled.
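As a rough illustration, the sketch below measures ONNX Runtime latency with an execution-provider list that falls back to the CPU. The NPU execution provider name shown here is a placeholder assumption; use whichever provider your Genio board image's ONNX Runtime build actually registers.

```python
# Latency-measurement sketch for ONNX Runtime. The model file and the NPU
# execution provider name are hypothetical placeholders.
import time
import numpy as np
import onnxruntime as ort

MODEL = "resnet_float32.onnx"                                  # hypothetical model file
requested = ["NpuExecutionProvider", "CPUExecutionProvider"]   # placeholder EP name first

# Keep only providers that this ONNX Runtime build actually exposes.
available = ort.get_available_providers()
providers = [p for p in requested if p in available] or ["CPUExecutionProvider"]

session = ort.InferenceSession(MODEL, providers=providers)
inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]    # resolve dynamic dims
dummy = np.zeros(shape, dtype=np.float32)

for _ in range(5):                                             # warm-up runs
    session.run(None, {inp.name: dummy})

runs = 50
start = time.perf_counter()
for _ in range(runs):
    session.run(None, {inp.name: dummy})
print(f"Providers in use: {session.get_providers()}")
print(f"Average inference time: {(time.perf_counter() - start) / runs * 1000:.2f} ms")
```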

Models for TAO-Related Tasks

| Task | Model Name | Data Type | Input Size | G520 NPU (ms) | G520 CPU (ms) | G720 NPU (ms) | G720 CPU (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Object Detection | PeopleNet (ResNet34) | Float32 | 3x544x960 | 83.51 | 3460.66 | 77.63 | 3422.34 | Download |
| Object Detection | PeopleNet (ResNet34) | Quant8 | 3x544x960 | 20.49 | 911.40 | 19.44 | 901.13 | Download |
| Recognition | Action Recognition Net (ResNet18) | Float32 | 96x224x224 | 16.83 | 353.46 | 14.75 | 350.78 | Download |
| Pose Estimation | BodyPoseNet | Float32 | 224x320x3 | 43.99 | 1845.36 | 40.56 | 1825.80 | Download |
| Object Detection | LPDNet (USA Pruned) | Float32 | 3x480x640 | 3.68 | 107.06 | 3.42 | 106.47 | Download |
| Segmentation | PeopleSemSegNet_AMR | Float32 | 3x576x960 | Not Support | 7685.97 | Not Support | 7749.61 | Download |
| Segmentation | PeopleSemSegNet_AMR (Rel) | Float32 | 3x544x960 | 15.05 | 146.83 | 13.29 | 136.69 | Download |
| Segmentation | PeopleSemSegNet (ShuffleSeg) | Float32 | 3x544x960 | 15.08 | 140.38 | 13.41 | 136.57 | Download |
| Segmentation | PeopleSemSegNet (Vanilla Unet) | Float32 | 3x544x960 | 178.51 | 7510.24 | 163.45 | 7346.95 | Download |
| Re-Identification | ReIdentificationNet (ResNet50) | Float32 | 3x256x128 | 8.59 | 237.53 | 6.87 | 234.10 | Download |
| OCR | Ocrnet_resnet50 | Float32 | 1x32x100 | 20.64 | 300.25 | 18.16 | 296.89 | Download |
| OCR | Ocrnet_resnet50 (Pruned) | Float32 | 1x32x100 | 14.93 | 179.51 | 13.94 | 175.70 | Download |
| OCR | ocd_resnet50 | Float32 | 3x736x1280 | 169.41 | 6520.78 | 149.85 | 6340.07 | Download |
| OCR | ocd_resnet50 | Float32 | 3x640x640 | 76.18 | 2809.26 | 68.10 | 2748.75 | Download |
| OCR | ocdnet_mixnet | Float32 | 3x640x640 | 362.87 | 17742.48 | 340.09 | 17436.33 | Download |
| Classification | Pose Classification (ST-GCN) | Float32 | 3x300x34x1 | 223.89 | 787.40 | 207.00 | 772.19 | Download |
| Pose Estimation | Centerpose (Chair DLA34) | Float32 | 3x512x512 | Not Support | 3035.51 | Not Support | 2946.96 | Download |
| Pose Estimation | Centerpose (Camera FAN) | Float32 | 3x512x512 | Not Support | 7689.45 | Not Support | 7568.34 | Download |
| Object Detection | LPDNet (CCPD Pruned) | Float32 | 3x1168x720 | 7.82 | 190.42 | 6.73 | 186.26 | Download |
| Pose Estimation | Foundation Pose (Refiner) | Float32 | 6x160x160 | 64.91 | 682.93 | 60.97 | 674.01 | Download |
| Pose Estimation | Foundation Pose (Score) | Float32 | 6x160x160 | 37.37 | 622.71 | 34.63 | 615.33 | Download |
| Pose Estimation | Multi 3D Centerpose | Float32 | 3x512x512 | Not Support | 3024.04 | Not Support | 3005.77 | Download |

Models for Detection

| Task | Model Name | Data Type | Input Size | G520 NPU (ms) | G520 CPU (ms) | G720 NPU (ms) | G720 CPU (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Object Detection | YOLOv5s | Quant8 | 640x640 | Not Support | 225.74 | Not Support | 221.29 | Download |
| Object Detection | YOLOv5s | Float32 | 640x640 | 36.50 | 607.68 | 32.37 | 586.80 | Download |
| Object Detection | YOLOv8s | Quant8 | 640x640 | 90.11 | 353.19 | 80.57 | 346.58 | Download |
| Object Detection | YOLO11s | Quant8 | 640x640 | 102.15 | 301.50 | 90.99 | 295.32 | Download |

Models for Classification

| Task | Model Name | Data Type | Input Size | G520 NPU (ms) | G520 CPU (ms) | G720 NPU (ms) | G720 CPU (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Classification | ConvNeXt | Quant8 | 224x224 | Not Support | 516.21 | Not Support | 1115.18 | Download |
| Classification | ConvNeXt | Float32 | 224x224 | Not Support | 1117.20 | Not Support | 510.37 | Download |
| Classification | DenseNet | Quant8 | 224x224 | Not Support | 104.51 | Not Support | 103.30 | Download |
| Classification | DenseNet | Float32 | 224x224 | 8.46 | 205.29 | 7.49 | 200.32 | Download |
| Classification | EfficientNet | Quant8 | 224x224 | 33.33 | 24.07 | 30.52 | 23.94 | Download |
| Classification | EfficientNet | Float32 | 224x224 | 3.15 | 66.64 | 2.81 | 65.57 | Download |
| Classification | MobileNetV2 | Quant8 | 224x224 | 1.43 | 12.36 | 1.26 | 12.23 | Download |
| Classification | MobileNetV2 | Float32 | 224x224 | 1.75 | 31.69 | 1.47 | 30.41 | Download |
| Classification | MobileNetV3 | Quant8 | 224x224 | Not Support | 6.30 | Not Support | 6.16 | Download |
| Classification | MobileNetV3 | Float32 | 224x224 | 13.72 | 10.74 | 12.81 | 10.45 | Download |
| Classification | ResNet | Quant8 | 224x224 | 2.04 | 45.87 | 1.78 | 45.08 | Download |
| Classification | ResNet | Float32 | 224x224 | 3.81 | 112.00 | 3.49 | 111.24 | Download |
| Classification | SqueezeNet | Quant8 | 224x224 | 9.36 | 33.08 | 8.38 | 31.96 | Download |
| Classification | SqueezeNet | Float32 | 224x224 | 9.86 | 53.15 | 8.81 | 52.00 | Download |
| Classification | VGG | Quant8 | 224x224 | 13.79 | 366.54 | 11.62 | 366.31 | Download |
| Classification | VGG | Float32 | 224x224 | 37.17 | 902.03 | 32.24 | 889.14 | Download |

Models for Recognition

| Task | Model Name | Data Type | Input Size | G520 NPU (ms) | G520 CPU (ms) | G720 NPU (ms) | G720 CPU (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Recognition | VGGFace | Quant8 | 224x224 | 291.44 | 366.43 | 291.22 | 367.59 | Download |
| Recognition | VGGFace | Float32 | 224x224 | 37.98 | 904.24 | 32.84 | 891.02 | Download |

Models for Robotics

| Task | Model Name | Data Type | Input Size | G520 NPU (ms) | G520 CPU (ms) | G720 NPU (ms) | G720 CPU (ms) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Omni6DPose | scale_policy | Float32 | 1x3x3 | Not Support | 0.18 | Not Support | 0.17 | Download |
| Diffusion Policy | model_diffusion_sampling | Float32 | trajectory:1x16x12, global_cond:1x800 | 60.66 | 44.71 | 57.68 | 41.96 | Download |
| MobileSam | mobilesam_encoder | Float32 | 3x448x448 | Not Support | 705.14 | Not Support | 694.74 | Download |
| RegionNormalizedGrasp | anchornet | Float32 | 4x640x360 | 13.42 | 187.66 | 12.13 | 183.82 | Download |
| RegionNormalizedGrasp | localnet | Float32 | 64x64x6 | 20.02 | 19.65 | 19.78 | 20.06 | Download |
| YoloWorld | yoloworld_xl | Float32 | 3x640x640 | 465.61 | 11466.42 | 403.15 | 11214.33 | Download |

Performance Notes

Performance can vary depending on:

  • The specific Genio platform and hardware configuration.

  • The version of the board image and evaluation kit (EVK).

  • The selected backend and model variant.

To obtain the most accurate performance numbers for your use case, run your application directly on the target platform.