TFLite(LiteRT) - Generative AI on Genio 520

This page lists the generative AI models and representative performance data for Genio 520 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

Note

For VLMs, Qwen3-VL is scheduled for release by the end of June 2026.

The following tables summarize the supported generative models and measured performance on Genio 520.

Large Language Models (LLMs)
Model	Prompt Mode (tok/s)	Generative Mode (tok/s)
DeepSeek-R1-Distill-Qwen-1.5B	276.82	9.18
DeepSeek-R1-Distill-Qwen-7B	Not supported	Not supported
DeepSeek-R1-Distill-Llama-8B	Not supported	Not supported
Qwen3-1.7B	229.63	10.53
Qwen2.5-1.5B-Instruct	269.87	15.11
Qwen2.5-3B-Instruct	Not supported	Not supported
Qwen2.5-7B-Instruct	Not supported	Not supported
gemma3-1B (Text Only)	491.53	22.13
gemma2-2b-it	Not supported	Not supported
llama3.2-1B-Instruct	335.92	20.59
llama3.2-3B-Instruct	Not supported	Not supported
llama3-8b	Not supported	Not supported
MiniCPM-2B-sft-bf16-llama-format	153.14	6.48
llava1.5-7b-speculative-decoding	Not supported	Not supported
medusa_v1_0_vicuna_7b_v1.5	Not supported	Not supported
vicuna1.5-7b-tree-speculative-decoding-plus	Not supported	Not supported
baichuan-7b-int8-cache	Not supported	Not supported
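
The two throughput columns can be combined into a rough response-time estimate. The sketch below is illustrative only. It assumes that Prompt Mode is the rate at which input tokens are processed, that Generative Mode is the rate at which output tokens are produced, and that the two phases simply add up. The function name `estimate_latency_s` and the example token counts are hypothetical and are not part of any Genio or LiteRT tooling.

```python
# Throughput figures copied from the LLM table above:
# (prompt mode tok/s, generative mode tok/s). Unsupported models are omitted.
LLM_THROUGHPUT = {
    "DeepSeek-R1-Distill-Qwen-1.5B": (276.82, 9.18),
    "Qwen3-1.7B": (229.63, 10.53),
    "Qwen2.5-1.5B-Instruct": (269.87, 15.11),
    "gemma3-1B (Text Only)": (491.53, 22.13),
    "llama3.2-1B-Instruct": (335.92, 20.59),
    "MiniCPM-2B-sft-bf16-llama-format": (153.14, 6.48),
}


def estimate_latency_s(model: str, prompt_tokens: int, output_tokens: int) -> float:
    """Rough latency estimate: prompt-processing time plus generation time.

    Assumption: both phases run back to back at the tabulated rates.
    Model load time, tokenization, and sampling overhead are ignored.
    """
    prompt_rate, gen_rate = LLM_THROUGHPUT[model]
    return prompt_tokens / prompt_rate + output_tokens / gen_rate


if __name__ == "__main__":
    # Hypothetical workload: 256 prompt tokens, 128 generated tokens.
    for name in LLM_THROUGHPUT:
        print(f"{name}: {estimate_latency_s(name, 256, 128):.1f} s")
```

Under these assumptions the generative-mode rate dominates for any response longer than a few dozen tokens, so it is usually the more useful column when comparing models for interactive use.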

Stable Diffusion and Image Generation
Model	Main Time (ms)	Inference Time (ms)
Stable Diffusion v1.5 controlnet	47923	37581
Stable Diffusion v2.1 base model with controlnet	Not supported	Not supported

CLIP and Embedding Models
Model	Main Time (ms)	Inference Time (ms)
CLIP Image Encoder
img_encoder_proj_clip_vit_large_dynamic	662.22	296.84
img_encoder_proj_openclip_vit_big_g_dynamic	Not supported	Not supported
img_encoder_proj_openclip_vit_h_dynamic	Not supported	Not supported
CLIP Text Encoder
text_encoder_clip_vit_large	748.45	45.56
text_encoder_openclip_vit_h	Not supported	Not supported