TFLite(LiteRT) - Generative AI on Genio 520
This page lists the generative AI models and representative performance data for Genio 520 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.
Model Support and Performance
Note
The performance values in this section for Genio 520 are estimated by scaling corresponding Genio 720 measurements using approximate bandwidth-based factors. They are intended for early planning and comparison only. For production workloads, developers must benchmark the target models directly on Genio 520 under the intended system configuration.
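
The scaling described in the note can be sketched as follows. This is only an illustration: the page does not state the actual factors, so the sketch assumes the factor is the ratio of target to source memory bandwidth. The function names and the bandwidth numbers are hypothetical placeholders, not Genio 720 or Genio 520 specifications.

```python
def bandwidth_scale_factor(src_bandwidth_gb_s: float, dst_bandwidth_gb_s: float) -> float:
    """Return the ratio used to scale a throughput measured on the source platform.

    Assumption: the factor is simply target bandwidth / source bandwidth.
    """
    if src_bandwidth_gb_s <= 0 or dst_bandwidth_gb_s <= 0:
        raise ValueError("bandwidths must be positive")
    return dst_bandwidth_gb_s / src_bandwidth_gb_s


def scale_throughput(measured_tok_s: float, factor: float) -> float:
    """Estimate target-platform throughput from a source-platform measurement."""
    return measured_tok_s * factor


# Placeholder values for illustration only; not real platform figures.
factor = bandwidth_scale_factor(src_bandwidth_gb_s=100.0, dst_bandwidth_gb_s=80.0)
estimate = scale_throughput(measured_tok_s=10.0, factor=factor)
print(f"factor={factor:.2f}, estimated throughput={estimate:.2f} tok/s")
```

An estimate of this kind ignores compute limits, caching, and scheduling effects, which is one reason direct benchmarking on the target is still required.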
The following tables summarize the supported generative models and their estimated performance on Genio 520.

| Model | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | 29.322 | 3.662 |
| DeepSeek-R1-Distill-Qwen-1.5B | 273.349 | 9.411 |
| DeepSeek-R1-Distill-Qwen-7B | 55.384 | 3.742 |
| gemma2-2b-it | 154.714 | 7.002 |
| internlm2-chat-1_8b | 220.974 | 14.039 |
| llama3-8b | 45.196 | 3.758 |
| llama3.2-1B-Instruct | 321.03 | 19.626 |
| llama3.2-3B-Instruct | 123.646 | 8.462 |
| Qwen2-0.5B-Instruct | 609.964 | 40.048 |
| Qwen2-1.5B-Instruct | 273.594 | 15.65 |
| Qwen2-7B-Instruct | 56.333 | 3.906 |
| Qwen1.5-1.8B-Chat | 248.511 | 7.916 |
| Qwen2.5-1.5B-Instruct | 273.134 | 14.742 |
| Qwen2.5-3B-Instruct | 120 | 7.84 |
| Qwen2.5-7B-Instruct | 56.438 | 3.914 |
| Qwen3 1.7B | 186.426 | 8.729 |
| Phi-3-mini-4k-instruct | 103.68 | 5.859 |
| MiniCPM-2B-sft-bf16-llama-format | 155.834 | 6.155 |
| medusa_v1_0_vicuna_7b_v1.5 | 73.457 | 8.451 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 67.916 | 10.119 |
| llava1.5-7b-speculative-decoding | 58.482 | 5.825 |
| baichuan-7b-int8-cache | 64.947 | 3.391 |
| baichuan-7b | 63.796 | 3.346 |

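
For early planning, the two throughput columns in the language-model table above can be combined into a rough response-time estimate. The sketch below assumes that Prompt Mode is the rate at which input tokens are processed and Generative Mode is the rate at which output tokens are produced; the function name and the token counts are illustrative choices, not part of any Genio tooling.

```python
def estimate_latency_s(prompt_tokens: int, output_tokens: int,
                       prompt_tok_s: float, gen_tok_s: float) -> float:
    """Rough end-to-end latency: prompt processing time plus generation time."""
    prefill_s = prompt_tokens / prompt_tok_s   # time to consume the input prompt
    decode_s = output_tokens / gen_tok_s       # time to emit the output tokens
    return prefill_s + decode_s


# llama3.2-1B-Instruct row: 321.03 tok/s prompt mode, 19.626 tok/s generative mode.
# The 512-token prompt and 128-token reply are hypothetical workload sizes.
latency = estimate_latency_s(prompt_tokens=512, output_tokens=128,
                             prompt_tok_s=321.03, gen_tok_s=19.626)
print(f"estimated latency: {latency:.1f} s")
```

With these inputs the estimate is about 8.1 s, most of it spent in generation, which illustrates why the Generative Mode column usually dominates perceived responsiveness for longer replies.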

| Model | ViT Inference Time (s) | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|---|
| Qwen2.5 VL 3B | 0.26 | 80.052 | 3.821 |
| InternVL3-1B | 2.18 | 59.798 | 4.926 |

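
For the vision-language models, a planning estimate also has to account for the image encoder. The sketch below assumes the ViT inference time is paid once per image before prompt processing, and that `prompt_tokens` already includes whatever tokens the image contributes (that count is model-specific and not given on this page). The function name and token counts are hypothetical.

```python
def estimate_vlm_latency_s(vit_time_s: float, prompt_tokens: int, output_tokens: int,
                           prompt_tok_s: float, gen_tok_s: float) -> float:
    """Rough latency for one image plus prompt: ViT pass + prompt processing + generation."""
    return vit_time_s + prompt_tokens / prompt_tok_s + output_tokens / gen_tok_s


# Qwen2.5 VL 3B row: 0.26 s ViT time, 80.052 tok/s prompt, 3.821 tok/s generative.
# 256 prompt tokens and 64 output tokens are illustrative workload sizes.
vlm_latency = estimate_vlm_latency_s(vit_time_s=0.26, prompt_tokens=256, output_tokens=64,
                                     prompt_tok_s=80.052, gen_tok_s=3.821)
print(f"estimated VLM latency: {vlm_latency:.1f} s")
```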

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| Stable Diffusion v.1.5 | 32270 | 31016 |
| Stable Diffusion v.1.5 controlnet | 42053 | 40368 |
| Stable_diffusion_v1_5_controlnet_lora | 42685 | 40568 |
| Stable_diffusion_v1.5_2lora | 44973 | 41494 |
| Stable Diffusion v2.1 base model with controlnet | 38979 | 37285 |
| Stable Diffusion v1.5 LCM Ipadaptor | 13306 | 7326 |
| Stable_diffusion_lcm_multiDiffusion | 36379.456 | 35158.57 |


| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| CLIP Image Encoder | | |
| img_encoder_proj_clip_vit_large_dynamic | 709.513 | 321.735 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 15044.4 | 3928.699 |
| img_encoder_proj_openclip_vit_h_dynamic | 1800.246 | 1102.059 |
| CLIP Text Encoder | | |
| text_encoder_clip_vit_large | 568.849 | 48.741 |
| text_encoder_openclip_vit_h | 938.379 | 149.713 |