TFLite(LiteRT) - Generative AI on MT8883

This page lists the generative AI models and representative performance data for MT8883 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

Note

The following symbols are used in the tables below:

– : To be released.
Not Supported : The platform does not support this model.

Note

For vision-language models (VLMs), Qwen3-VL is scheduled for release by the end of June 2026.

The following tables summarize the supported generative models and measured performance on MT8883.

Large Language Models (LLMs)
Model	Prompt Mode (tok/s)	Generative Mode (tok/s)
DeepSeek-R1-Distill-Qwen-1.5B	–	–
DeepSeek-R1-Distill-Qwen-7B	–	–
DeepSeek-R1-Distill-Llama-8B	–	–
Qwen3-1.7B	834.02	25.18
Qwen2.5-1.5B-Instruct	763.66	20.16
Qwen2.5-3B-Instruct	502.05	19.32
Qwen2.5-7B-Instruct	184.73	6.89
gemma3-1B (Text-Only)	1125.16	38.87
gemma3-4B (Text-Only)	–	–
gemma2-2b-it	–	–
llama3.2-1B-Instruct	1372.57	43.47
llama3.2-3B-Instruct	611.76	19.89
llama3-8b	128.36	6.53
MiniCPM-2B-sft-bf16-llama-format	–	–
llava1.5-7b-speculative-decoding	138.76	4.46
medusa_v1_0_vicuna_7b_v1.5	–	–
vicuna1.5-7b-tree-speculative-decoding-plus	–	–
baichuan-7b-int8-cache	–	–
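
The two throughput columns can be combined into a rough end-to-end latency estimate. The sketch below assumes that Prompt Mode is the rate at which input tokens are processed and Generative Mode is the rate at which output tokens are produced, and that both rates stay constant over the request. The function name `estimate_latency_s` and the 512-token prompt and 128-token output lengths are hypothetical values chosen for illustration; model load time and other runtime overhead are not included.

```python
def estimate_latency_s(prompt_tokens, output_tokens, prompt_tok_s, gen_tok_s):
    """Return (prompt_seconds, generation_seconds, total_seconds).

    Assumes constant throughput in both phases (a simplification).
    """
    prompt_s = prompt_tokens / prompt_tok_s    # time to process the input prompt
    generation_s = output_tokens / gen_tok_s   # time to generate the output tokens
    return prompt_s, generation_s, prompt_s + generation_s


# Qwen3-1.7B figures from the LLM table: 834.02 tok/s prompt, 25.18 tok/s generative.
# The prompt and output lengths are illustrative only.
prompt_s, generation_s, total_s = estimate_latency_s(512, 128, 834.02, 25.18)
print(f"prompt {prompt_s:.2f} s, generation {generation_s:.2f} s, total {total_s:.2f} s")
```

Under these assumptions the generation phase dominates: about 0.61 s for the prompt versus about 5.08 s for the output, so the Generative Mode figure is usually the more relevant one for interactive use.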

Stable Diffusion and Image Generation
Model	Main Time (ms)	Inference Time (ms)
Stable Diffusion v1.5 controlnet	–	–
Stable Diffusion v2.1 base model with controlnet	–	–

CLIP and Embedding Models
Model	Main Time (ms)	Inference Time (ms)
CLIP Image Encoder
img_encoder_proj_clip_vit_large_dynamic	–	–
img_encoder_proj_openclip_vit_big_g_dynamic	–	–
img_encoder_proj_openclip_vit_h_dynamic	–	–
CLIP Text Encoder
text_encoder_clip_vit_large	–	–
text_encoder_openclip_vit_h	–	–