TFLite(LiteRT) - Generative AI on MT8893

This page lists the generative AI models and representative performance data for MT8893 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

The following tables summarize the supported generative models and measured performance on MT8893.

Large Language Models (LLMs)

Model                                        Prompt Mode (tok/s)  Generative Mode (tok/s)
-------------------------------------------  -------------------  -----------------------
DeepSeek-R1-Distill-Llama-8B                 425.79               11.36
DeepSeek-R1-Distill-Qwen-1.5B                1057.25              25.68
DeepSeek-R1-Distill-Qwen-7B                  448.17               11.69
gemma2-2b-it                                 891.00               21.37
internlm2-chat-1_8b                          1544.70              42.39
llama3-8b                                    426.13               11.51
llama3.2-1B-Instruct                         2093.61              61.14
llama3.2-3B-Instruct                         1022.95              25.05
Qwen2-0.5B-Instruct                          3010.84              77.87
Qwen2-1.5B-Instruct                          1616.22              38.31
Qwen2-7B-Instruct                            474.38               11.64
Qwen1.5-1.8B-Chat                            1516.50              31.38
Qwen2.5-1.5B-Instruct                        1621.85              38.57
Qwen2.5-3B-Instruct                          751.06               20.87
Qwen2.5-7B-Instruct                          471.95               11.74
Qwen3 1.7B                                   1069.16              23.42
Phi-3-mini-4k-instruct                       828.87               18.87
MiniCPM-2B-sft-bf16-llama-format             886.72               22.28
medusa_v1_0_vicuna_7b_v1.5                   501.05               22.79
vicuna1.5-7b-tree-speculative-decoding-plus  454.58               22.72
llava1.5-7b-speculative-decoding             267.98               6.78
baichuan-7b-int8-cache                       561.76               11.37
baichuan-7b                                  536.64               10.56
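
The two throughput columns above can be combined into a rough end-to-end latency estimate for a single request. The following Python sketch is illustrative only: the function name `estimate_latency_s` and the token counts (512 prompt tokens, 128 generated tokens) are assumptions, not part of the MT8893 measurements, and the model assumes that both rates stay constant regardless of context length and that model load time is excluded.

```python
def estimate_latency_s(prompt_tokens, output_tokens, prompt_tps, gen_tps):
    """Rough latency estimate in seconds (hypothetical helper).

    Assumes constant throughput in both phases and ignores
    model loading, tokenization, and sampling overhead.
    """
    if prompt_tps <= 0 or gen_tps <= 0:
        raise ValueError("throughput values must be positive")
    prefill_s = prompt_tokens / prompt_tps   # prompt mode phase
    decode_s = output_tokens / gen_tps       # generative mode phase
    return prefill_s + decode_s


# (prompt tok/s, generative tok/s) copied from the LLM table above.
RATES = {
    "llama3.2-1B-Instruct": (2093.61, 61.14),
    "Qwen2.5-7B-Instruct": (471.95, 11.74),
}

for name, (prompt_tps, gen_tps) in RATES.items():
    total = estimate_latency_s(512, 128, prompt_tps, gen_tps)
    print(f"{name}: about {total:.2f} s for 512 prompt + 128 output tokens")
```

Under these assumptions the generative phase dominates the total time for typical chat-length responses, so the Generative Mode column is usually the more relevant figure when comparing models for interactive use.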

Vision-Language Models (VLMs)

Model          ViT Inference Time (s)  Prompt Mode (tok/s)  Generative Mode (tok/s)
-------------  ----------------------  -------------------  -----------------------
Qwen2.5 VL 3B  0.10                    339.90               10.13
InternVL3-1B   0.51                    183.64               14.09
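
For VLMs, the time before the first generated token appears can be approximated by adding the ViT inference time to the prompt processing time. The sketch below is a hedged illustration: `time_to_first_token_s` is a hypothetical helper, the 256-token prompt length is an assumed value, and the calculation assumes one image per request with the ViT stage and the prompt stage running sequentially.

```python
def time_to_first_token_s(vit_time_s, prompt_tokens, prompt_tps):
    """Approximate seconds until the first generated token (hypothetical helper).

    Assumes a single image, sequential ViT and prompt stages,
    and a constant prompt-mode throughput.
    """
    if prompt_tps <= 0:
        raise ValueError("prompt_tps must be positive")
    return vit_time_s + prompt_tokens / prompt_tps


# Values copied from the VLM table above; 256 prompt tokens is an assumption.
for name, vit_s, prompt_tps in [
    ("Qwen2.5 VL 3B", 0.10, 339.90),
    ("InternVL3-1B", 0.51, 183.64),
]:
    ttft = time_to_first_token_s(vit_s, 256, prompt_tps)
    print(f"{name}: about {ttft:.2f} s to first token")
```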

Stable Diffusion and Image Generation

Model                                             Main Time (ms)  Inference Time (ms)
------------------------------------------------  --------------  -------------------
Stable Diffusion v.1.5                            7075            6132
Stable Diffusion v.1.5 controlnet                 9395            8035
Stable_diffusion_v1_5_controlnet_lora             10268           8472
Stable_diffusion_v1.5_2lora                       11487           10130
Stable Diffusion v2.1 base model with controlnet  6969            5451
Stable Diffusion v1.5 LCM Ipadaptor               2254            1077
Stable_diffusion_lcm_multiDiffusion               7439            6698

CLIP and Embedding Models

Model                                        Main Time (ms)  Inference Time (ms)
-------------------------------------------  --------------  -------------------
img_encoder_proj_clip_vit_large_dynamic      358.61          51.14
img_encoder_proj_openclip_vit_big_g_dynamic  1390.56         517.13
img_encoder_proj_openclip_vit_h_dynamic      591.93          147.47
text_encoder_clip_vit_large                  308.72          18.94
text_encoder_openclip_vit_h                  510.92          48.49