TFLite(LiteRT) - Generative AI on Genio 720

This page lists the generative AI models and representative performance data for Genio 720 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

The following tables summarize the supported generative models and measured performance on Genio 720.

Large Language Models (LLMs)

| Model | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | 36.65 | 4.58 |
| DeepSeek-R1-Distill-Qwen-1.5B | 341.69 | 11.76 |
| DeepSeek-R1-Distill-Qwen-7B | 69.23 | 4.68 |
| gemma2-2b-it | 193.39 | 8.75 |
| internlm2-chat-1_8b | 276.22 | 17.55 |
| llama3-8b | 56.50 | 4.70 |
| llama3.2-1B-Instruct | 401.29 | 24.53 |
| llama3.2-3B-Instruct | 154.56 | 10.58 |
| Qwen2-0.5B-Instruct | 762.46 | 50.06 |
| Qwen2-1.5B-Instruct | 341.99 | 19.56 |
| Qwen2-7B-Instruct | 70.42 | 4.88 |
| Qwen1.5-1.8B-Chat | 310.64 | 9.90 |
| Qwen2.5-1.5B-Instruct | 341.42 | 18.43 |
| Qwen2.5-3B-Instruct | 162.48 | 10.31 |
| Qwen2.5-7B-Instruct | 70.55 | 4.89 |
| Qwen3 1.7B | 233.03 | 10.91 |
| Phi-3-mini-4k-instruct | 129.60 | 7.32 |
| MiniCPM-2B-sft-bf16-llama-format | 194.79 | 7.69 |
| medusa_v1_0_vicuna_7b_v1.5 | 91.82 | 10.56 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 84.90 | 12.65 |
| llava1.5-7b-speculative-decoding | 73.10 | 7.28 |
| baichuan-7b-int8-cache | 81.18 | 4.24 |
| baichuan-7b | 79.75 | 4.18 |
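
The two throughput columns can be combined into a rough response-time estimate. The sketch below assumes that Prompt Mode is the rate at which input tokens are processed and Generative Mode is the rate at which output tokens are produced; the source does not define the columns further. The function name `estimate_latency_s` and the 512/128-token workload are hypothetical, and the estimate ignores model load time, tokenization, and any change in throughput with context length.

```python
def estimate_latency_s(prompt_tokens, new_tokens, prompt_tps, gen_tps):
    """Rough end-to-end time in seconds: prompt processing plus generation.

    Assumes constant throughput in each phase (an idealization).
    """
    if prompt_tps <= 0 or gen_tps <= 0:
        raise ValueError("throughput must be positive")
    prompt_s = prompt_tokens / prompt_tps   # time to consume the prompt
    generate_s = new_tokens / gen_tps       # time to emit the new tokens
    return prompt_s + generate_s


# (prompt tok/s, generative tok/s) copied from the LLM table above.
MODELS = {
    "llama3.2-1B-Instruct": (401.29, 24.53),
    "Qwen2.5-7B-Instruct": (70.55, 4.89),
}

if __name__ == "__main__":
    # Hypothetical workload: 512 prompt tokens, 128 generated tokens.
    for name, (prompt_tps, gen_tps) in MODELS.items():
        seconds = estimate_latency_s(512, 128, prompt_tps, gen_tps)
        print(f"{name}: about {seconds:.1f} s")
```

For most rows the generation term dominates, so the Generative Mode column is usually the better guide when choosing a model for interactive use.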

Vision-Language Models (VLMs)

| Model | ViT Inference Time (s) | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|---|
| Qwen2.5 VL 3B | 0.21 | 100.07 | 4.78 |
| InternVL3-1B | 1.74 | 74.75 | 6.16 |

Stable Diffusion and Image Generation

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| Stable Diffusion v.1.5 | 25816 | 24813 |
| Stable Diffusion v.1.5 controlnet | 33642 | 32294 |
| Stable_diffusion_v1_5_controlnet_lora | 34148 | 32454 |
| Stable_diffusion_v1.5_2lora | 35978 | 33195 |
| Stable Diffusion v2.1 base model with controlnet | 31183 | 29828 |
| Stable Diffusion v1.5 LCM Ipadaptor | 10645 | 5861 |
| Stable_diffusion_lcm_multiDiffusion | 29104 | 28127 |
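
The gap between the two columns can be read as time spent outside inference. The sketch below assumes that Main Time covers the whole run, including inference, so the difference is non-inference overhead; the source does not define either column, so treat this reading as an assumption. The function name `non_inference_overhead` is hypothetical.

```python
def non_inference_overhead(main_ms, inference_ms):
    """Return (overhead in ms, overhead as a fraction of Main Time).

    Assumes Main Time includes Inference Time (not stated in the source).
    """
    if main_ms <= 0:
        raise ValueError("main_ms must be positive")
    overhead_ms = main_ms - inference_ms
    return overhead_ms, overhead_ms / main_ms


# (Main Time ms, Inference Time ms) copied from the table above.
RESULTS = {
    "Stable Diffusion v.1.5": (25816, 24813),
    "Stable Diffusion v1.5 LCM Ipadaptor": (10645, 5861),
}

if __name__ == "__main__":
    for name, (main_ms, inference_ms) in RESULTS.items():
        overhead_ms, share = non_inference_overhead(main_ms, inference_ms)
        print(f"{name}: {overhead_ms} ms outside inference ({share:.0%} of Main Time)")
```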

CLIP and Embedding Models

CLIP Image Encoder

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| img_encoder_proj_clip_vit_large_dynamic | 567.61 | 257.39 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 12035.52 | 3142.96 |
| img_encoder_proj_openclip_vit_h_dynamic | 1440.20 | 881.65 |

CLIP Text Encoder

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| text_encoder_clip_vit_large | 455.08 | 38.99 |
| text_encoder_openclip_vit_h | 750.70 | 119.77 |