TFLite(LiteRT) - Generative AI
The Generative Model section provides performance and capability data for large language models (LLMs), vision-language models (VLMs), and image generation models on MediaTek Genio platforms.
This section is intended as a reference for benchmarking and platform capability validation, not as a distribution channel for full training or deployment assets.
Note
For Generative AI workloads, this section provides performance data and capability information only.
Access to the full Generative AI deployment toolkit (GAI toolkit) requires a non-disclosure agreement (NDA) with MediaTek. After signing an NDA, the toolkit can be downloaded from NeuroPilot Document.
Model Categories
The generative models in this section are grouped into the following categories:
Large Language Models (LLMs) – Text-only models for tasks such as dialogue, summarization, and code generation.
Vision-Language Models (VLMs) – Multimodal models that process both images and text (for example, image captioning or visual question answering).
Image Generation and Enhancement – Models such as Stable Diffusion and other diffusion or transformer-based pipelines used for image synthesis, editing, or super-resolution.
Embedding and Encoder Models – Models like CLIP encoders for computing joint image-text embeddings for retrieval or ranking tasks.
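To illustrate the embedding-and-encoder category above, the following sketch ranks captions against an image by cosine similarity of joint embeddings, the retrieval pattern CLIP-style encoders are used for. The vectors here are made-up placeholders; in practice they would come from the image and text encoder models listed in this section.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical joint image-text embeddings; real ones would be produced
# by the CLIP image/text encoders benchmarked later in this section.
image_emb = [0.2, 0.8, 0.1, 0.5]
captions = {
    "a photo of a cat": [0.25, 0.75, 0.05, 0.55],
    "a diagram of a CPU": [0.9, 0.1, 0.4, 0.0],
}

# Rank candidate captions by similarity to the image embedding.
ranked = sorted(captions,
                key=lambda c: cosine_similarity(image_emb, captions[c]),
                reverse=True)
print(ranked[0])  # best-matching caption
```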
Supported Models on Genio Products
Platform-specific model lists and performance data are provided in the following pages:
Performance Notes and Limitations
For Generative AI workloads, measured performance on Genio 520 may be slightly lower than on Genio 720.
This gap is primarily due to DRAM bandwidth differences between the two platforms and might affect:
Token generation speed for LLMs.
End-to-end latency for diffusion-based image generation.
Multimodal pipelines that exchange large intermediate tensors between subsystems.
The following comparative data is provided for reference.
Important
The tables in this section provide representative numbers only. To obtain the most accurate performance for a specific use case, developers must deploy and run the workload directly on the target platform under the intended system configuration.
LLM Performance Comparison
| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | 36.653 | 29.322 | 425.791 |
| DeepSeek-R1-Distill-Qwen-1.5B | 341.686 | 273.349 | 1057.25 |
| DeepSeek-R1-Distill-Qwen-7B | 69.23 | 55.384 | 448.167 |
| gemma2-2b-it | 193.392 | 154.714 | 891.004 |
| internlm2-chat-1_8b | 276.218 | 220.974 | 1544.7 |
| llama3-8b | 56.495 | 45.196 | 426.125 |
| llama3.2-1B-Instruct | 401.288 | 321.03 | 2093.61 |
| llama3.2-3B-Instruct | 154.557 | 123.646 | 1022.95 |
| Qwen2-0.5B-Instruct | 762.455 | 609.964 | 3010.84 |
| Qwen2-1.5B-Instruct | 341.993 | 273.594 | 1616.22 |
| Qwen2-7B-Instruct | 70.416 | 56.333 | 474.383 |
| Qwen1.5-1.8B-Chat | 310.639 | 248.511 | 1516.5 |
| Qwen2.5-1.5B-Instruct | 341.418 | 273.134 | 1621.85 |
| Qwen2.5-3B-Instruct | 162.481 | 120 | 751.056 |
| Qwen2.5-7B-Instruct | 70.548 | 56.438 | 471.945 |
| Qwen3 1.7B | 233.032 | 186.426 | 1069.16 |
| Phi-3-mini-4k-instruct | 129.6 | 103.68 | 828.868 |
| MiniCPM-2B-sft-bf16-llama-format | 194.793 | 155.834 | 886.721 |
| medusa_v1_0_vicuna_7b_v1.5 | 91.821 | 73.457 | 501.053 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 84.895 | 67.916 | 454.583 |
| llava1.5-7b-speculative-decoding | 73.103 | 58.482 | 267.981 |
| baichuan-7b-int8-cache | 81.184 | 64.947 | 561.762 |
| baichuan-7b | 79.745 | 63.796 | 536.642 |

| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | 4.578 | 3.662 | 11.359 |
| DeepSeek-R1-Distill-Qwen-1.5B | 11.764 | 9.411 | 25.681 |
| DeepSeek-R1-Distill-Qwen-7B | 4.677 | 3.742 | 11.693 |
| gemma2-2b-it | 8.752 | 7.002 | 21.372 |
| internlm2-chat-1_8b | 17.549 | 14.039 | 42.393 |
| llama3-8b | 4.698 | 3.758 | 11.512 |
| llama3.2-1B-Instruct | 24.533 | 19.626 | 61.144 |
| llama3.2-3B-Instruct | 10.577 | 8.462 | 25.048 |
| Qwen2-0.5B-Instruct | 50.06 | 40.048 | 77.871 |
| Qwen2-1.5B-Instruct | 19.563 | 15.65 | 38.314 |
| Qwen2-7B-Instruct | 4.883 | 3.906 | 11.642 |
| Qwen1.5-1.8B-Chat | 9.895 | 7.916 | 31.383 |
| Qwen2.5-1.5B-Instruct | 18.427 | 14.742 | 38.574 |
| Qwen2.5-3B-Instruct | 10.31 | 7.84 | 20.868 |
| Qwen2.5-7B-Instruct | 4.892 | 3.914 | 11.739 |
| Qwen3 1.7B | 10.911 | 8.729 | 23.424 |
| Phi-3-mini-4k-instruct | 7.324 | 5.859 | 18.869 |
| MiniCPM-2B-sft-bf16-llama-format | 7.694 | 6.155 | 22.275 |
| medusa_v1_0_vicuna_7b_v1.5 | 10.564 | 8.451 | 22.787 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 12.6489 | 10.119 | 22.722 |
| llava1.5-7b-speculative-decoding | 7.281 | 5.825 | 6.779 |
| baichuan-7b-int8-cache | 4.239 | 3.391 | 11.37 |
| baichuan-7b | 4.182 | 3.346 | 10.56 |

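As a quick sanity check on the Genio 520 versus Genio 720 gap noted in the limitations above, the following sketch recomputes the per-model ratio from a few rows of the first LLM table (values copied verbatim; across these rows the Genio 520 figure comes out near 80% of the Genio 720 figure):

```python
# Genio 720 / Genio 520 figures for a few models, copied from the
# first LLM performance table in this section.
rows = {
    "llama3.2-1B-Instruct": (401.288, 321.03),
    "Qwen2-7B-Instruct": (70.416, 56.333),
    "gemma2-2b-it": (193.392, 154.714),
}

for model, (g720, g520) in rows.items():
    ratio = g520 / g720
    print(f"{model}: Genio 520 reaches {ratio:.0%} of Genio 720")
```

The ratio is nearly constant here, which is consistent with a platform-level bottleneck such as the DRAM bandwidth difference described in the notes rather than a per-model effect.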
VLM Performance Comparison
| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| Qwen2.5 VL 3B | 0.208 | 0.26 | 0.096 |
| InternVL3-1B | 1.744 | 2.18 | 0.508 |

| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| Qwen2.5 VL 3B | 100.065 | 80.052 | 339.901 |
| InternVL3-1B | 74.748 | 59.798 | 183.641 |

| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| Qwen2.5 VL 3B | 4.776 | 3.821 | 10.1337 |
| InternVL3-1B | 6.157 | 4.926 | 14.094 |

Stable Diffusion Performance Comparison
| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| Stable Diffusion v.1.5 | 25816 | 32270 | 7075 |
| Stable Diffusion v.1.5 controlnet | 33642 | 42053 | 9395 |
| Stable_diffusion_v1_5_controlnet_lora | 34148 | 42685 | 10268 |
| Stable_diffusion_v1.5_2lora | 35978 | 44973 | 11487 |
| Stable Diffusion v2.1 base model with controlnet | 31183 | 38979 | 6969 |
| Stable Diffusion v1.5 LCM Ipadaptor | 10645 | 13306 | 2254 |
| Stable_diffusion_lcm_multiDiffusion | 29103.565 | 36379.456 | 7438.723 |

| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| Stable Diffusion v.1.5 | 24813 | 31016 | 6132 |
| Stable Diffusion v.1.5 controlnet | 32294 | 40368 | 8035 |
| Stable_diffusion_v1_5_controlnet_lora | 32454 | 40568 | 8472 |
| Stable_diffusion_v1.5_2lora | 33195 | 41494 | 10130 |
| Stable Diffusion v2.1 base model with controlnet | 29828 | 37285 | 5451 |
| Stable Diffusion v1.5 LCM Ipadaptor | 5861 | 7326 | 1077 |
| Stable_diffusion_lcm_multiDiffusion | 28126.856 | 35158.57 | 6697.967 |

CLIP Performance Comparison
| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| img_encoder_proj_clip_vit_large_dynamic | 567.61 | 709.513 | 358.609 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 12035.52 | 15044.4 | 1390.56 |
| img_encoder_proj_openclip_vit_h_dynamic | 1440.197 | 1800.246 | 591.931 |
| text_encoder_clip_vit_large | 455.079 | 568.849 | 308.718 |
| text_encoder_openclip_vit_h | 750.703 | 938.379 | 510.919 |

| Model | Genio 720 | Genio 520 | MT8893 |
|---|---|---|---|
| img_encoder_proj_clip_vit_large_dynamic | 257.388 | 321.735 | 51.135 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 3142.959 | 3928.699 | 517.126 |
| img_encoder_proj_openclip_vit_h_dynamic | 881.647 | 1102.059 | 147.467 |
| text_encoder_clip_vit_large | 38.993 | 48.741 | 18.938 |
| text_encoder_openclip_vit_h | 119.77 | 149.713 | 48.485 |

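When comparing encoder figures like those above across platforms, it is often convenient to convert a per-inference latency into an estimated throughput. This extract does not state the unit of the tables, so the sketch below is illustrative only: it assumes a latency measured in milliseconds and uses one figure from the first CLIP table purely as an example input.

```python
def throughput_per_second(latency_ms: float) -> float:
    # Estimated inferences per second for a given per-inference
    # latency in milliseconds (unit is an assumption, not stated
    # in the tables above).
    return 1000.0 / latency_ms

# Hypothetical example using a value from the first CLIP table as if
# it were a latency in ms: a 455.079 ms pass would be ~2.2 inferences/s.
print(round(throughput_per_second(455.079), 2))  # -> 2.2
```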
Deployment and Source Models
The generative models referenced in this section are primarily intended for benchmarking and capability validation.
Model accuracy and qualitative output quality are not addressed.
MediaTek does not redistribute original training datasets or checkpoint files for third-party or open-source models.
For production deployment, developers must:
Obtain the original models from their official sources,
Follow the applicable licenses and usage terms, and
Perform any required fine-tuning or post-training optimization for their application.