TFLite(LiteRT) - Generative AI on Genio 520

This page lists the generative AI models and representative performance data for Genio 520 platforms. For background on generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

The following tables summarize the supported generative models and measured performance on Genio 520.

Large Language Models (LLMs)

| Model | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | Not supported | Not supported |
| DeepSeek-R1-Distill-Qwen-1.5B | 276.82 | 9.18 |
| DeepSeek-R1-Distill-Qwen-7B | Not supported | Not supported |
| gemma2-2b-it | Not supported | Not supported |
| internlm2-chat-1.8b | 243.61 | 14.48 |
| llama3-8b | Not supported | Not supported |
| llama3.2-1B-Instruct | 335.92 | 20.59 |
| llama3.2-3B-Instruct | Not supported | Not supported |
| Qwen2-0.5B-Instruct | 641.70 | 42.43 |
| Qwen2-1.5B-Instruct | 274.50 | 15.32 |
| Qwen2-7B-Instruct | Not supported | Not supported |
| Qwen1.5-1.8B-Chat | 268.23 | 8.63 |
| Qwen2.5-1.5B-Instruct | 269.87 | 15.11 |
| Qwen2.5-3B-Instruct | Not supported | Not supported |
| Qwen2.5-7B-Instruct | Not supported | Not supported |
| Qwen3 1.7B | 229.63 | 10.53 |
| Phi-3-mini-4k-instruct | Not supported | Not supported |
| MiniCPM-2B-sft-bf16-llama-format | 153.14 | 6.48 |
| medusa_v1_0_vicuna_7b_v1.5 | Not supported | Not supported |
| vicuna1.5-7b-tree-speculative-decoding-plus | Not supported | Not supported |
| llava1.5-7b-speculative-decoding | Not supported | Not supported |
| baichuan-7b-int8-cache | Not supported | Not supported |
| baichuan-7b | Not supported | Not supported |
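
The two throughput columns can be combined into a rough response-time estimate. The sketch below assumes that Prompt Mode is the rate at which input tokens are processed and Generative Mode is the rate at which output tokens are produced; this page does not define either term, so treat that reading as an assumption. The function name `estimate_latency_s` and the 512-token prompt and 128-token output are hypothetical values chosen for illustration. Real throughput typically varies with context length, so the result is only a first-order approximation.

```python
def estimate_latency_s(prompt_tokens: int, new_tokens: int,
                       prompt_tok_s: float, gen_tok_s: float) -> float:
    """Rough end-to-end time: prompt processing time plus generation time.

    Assumes both rates stay constant for the whole request, which is a
    simplification.
    """
    if prompt_tok_s <= 0 or gen_tok_s <= 0:
        raise ValueError("throughput values must be positive")
    prompt_time = prompt_tokens / prompt_tok_s  # seconds spent on the input
    gen_time = new_tokens / gen_tok_s           # seconds spent on the output
    return prompt_time + gen_time


# Qwen2-0.5B-Instruct figures from the table above:
# 641.70 tok/s (Prompt Mode) and 42.43 tok/s (Generative Mode).
# The prompt and output lengths are hypothetical.
latency = estimate_latency_s(512, 128, 641.70, 42.43)
print(f"Estimated latency: {latency:.2f} s")
```

With these inputs the estimate is roughly 3.8 s, and almost all of it comes from generation rather than prompt processing.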

Vision-Language Models (VLMs)

| Model | ViT Inference Time (s) | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|---|
| Qwen2.5 VL 3B | Not supported | Not supported | Not supported |
| InternVL3-1B | 1.94 | 65.23 | 4.52 |

Stable Diffusion and Image Generation

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| Stable Diffusion v.1.5 | 33754 | 29001 |
| Stable Diffusion v.1.5 controlnet | 47923 | 37581 |
| Stable_diffusion_v1_5_controlnet_lora | Not supported | Not supported |
| Stable_diffusion_v1.5_2lora | 45619 | 39198 |
| Stable Diffusion v2.1 base model with controlnet | Not supported | Not supported |
| Stable Diffusion v1.5 LCM Ipadaptor | 11378 | 4978 |
| Stable_diffusion_lcm_multiDiffusion | 34731 | 32585 |

CLIP and Embedding Models

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| CLIP Image Encoder | | |
| img_encoder_proj_clip_vit_large_dynamic | 662.22 | 296.84 |
| img_encoder_proj_openclip_vit_big_g_dynamic | Not supported | Not supported |
| img_encoder_proj_openclip_vit_h_dynamic | Not supported | Not supported |
| CLIP Text Encoder | | |
| text_encoder_clip_vit_large | 748.45 | 45.56 |
| text_encoder_openclip_vit_h | Not supported | Not supported |