TFLite(LiteRT) - Generative AI on Genio 720

This page lists the generative AI models and representative performance data for Genio 720 platforms. For background information about generative workloads and usage notes, refer to TFLite(LiteRT) - Generative AI.

Model Support and Performance

The following tables summarize the supported generative models and measured performance on Genio 720.

Large Language Models (LLMs)

| Model | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|
| DeepSeek-R1-Distill-Llama-8B | 36.653 | 4.578 |
| DeepSeek-R1-Distill-Qwen-1.5B | 341.686 | 11.764 |
| DeepSeek-R1-Distill-Qwen-7B | 69.23 | 4.677 |
| gemma2-2b-it | 193.392 | 8.752 |
| internlm2-chat-1_8b | 276.218 | 17.549 |
| llama3-8b | 56.495 | 4.698 |
| llama3.2-1B-Instruct | 401.288 | 24.533 |
| llama3.2-3B-Instruct | 154.557 | 10.577 |
| Qwen2-0.5B-Instruct | 762.455 | 50.06 |
| Qwen2-1.5B-Instruct | 341.993 | 19.563 |
| Qwen2-7B-Instruct | 70.416 | 4.883 |
| Qwen1.5-1.8B-Chat | 310.639 | 9.895 |
| Qwen2.5-1.5B-Instruct | 341.418 | 18.427 |
| Qwen2.5-3B-Instruct | 162.481 | 10.31 |
| Qwen2.5-7B-Instruct | 70.548 | 4.892 |
| Qwen3 1.7B | 233.032 | 10.911 |
| Phi-3-mini-4k-instruct | 129.6 | 7.324 |
| MiniCPM-2B-sft-bf16-llama-format | 194.793 | 7.694 |
| medusa_v1_0_vicuna_7b_v1.5 | 91.821 | 10.564 |
| vicuna1.5-7b-tree-speculative-decoding-plus | 84.895 | 12.6489 |
| llava1.5-7b-speculative-decoding | 73.103 | 7.281 |
| baichuan-7b-int8-cache | 81.184 | 4.239 |
| baichuan-7b | 79.745 | 4.182 |
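
The Prompt Mode and Generative Mode figures can be combined into a rough response-time estimate. The sketch below assumes that Prompt Mode throughput applies to the input tokens and Generative Mode throughput applies to each output token, and that both rates stay constant regardless of sequence length; real latency also depends on context length, model load time, and runtime overhead. The function name and the token counts (256 in, 128 out) are hypothetical and chosen only for illustration.

```python
def estimate_latency_s(prompt_tokens, new_tokens, prompt_tok_s, gen_tok_s):
    """Rough latency estimate in seconds from the two throughput columns.

    Assumption: both throughputs are constant over the whole request.
    """
    prefill_s = prompt_tokens / prompt_tok_s  # time to process the input prompt
    decode_s = new_tokens / gen_tok_s         # time to generate the output tokens
    return prefill_s + decode_s


# llama3.2-1B-Instruct: 401.288 tok/s (Prompt Mode), 24.533 tok/s (Generative Mode)
latency = estimate_latency_s(256, 128, 401.288, 24.533)
print(f"Estimated latency: {latency:.2f} s")
```

With these inputs, the generation phase dominates the estimate, which is why Generative Mode throughput is usually the more relevant column for interactive use.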

Vision-Language Models (VLMs)

| Model | ViT Inference Time (s) | Prompt Mode (tok/s) | Generative Mode (tok/s) |
|---|---|---|---|
| Qwen2.5 VL 3B | 0.208 | 100.065 | 4.776 |
| InternVL3-1B | 1.744 | 74.748 | 6.157 |
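
For VLMs, the ViT inference time adds to the delay before the first output token appears. The following sketch estimates that delay under two assumptions not stated on this page: the ViT time is a one-time cost per image, and the prompt token count covers every token processed in prompt mode, including any image tokens. The function name and the 200-token prompt are hypothetical.

```python
def time_to_first_token_s(vit_s, prompt_tokens, prompt_tok_s):
    """Estimate the delay before the first generated token, in seconds.

    Assumption: the image is encoded once (vit_s), then the whole prompt
    is processed at the Prompt Mode throughput.
    """
    return vit_s + prompt_tokens / prompt_tok_s


# Qwen2.5 VL 3B: 0.208 s ViT inference time, 100.065 tok/s Prompt Mode
ttft = time_to_first_token_s(0.208, 200, 100.065)
print(f"Estimated time to first token: {ttft:.2f} s")
```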

Stable Diffusion and Image Generation

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| Stable Diffusion v.1.5 | 25816 | 24813 |
| Stable Diffusion v.1.5 controlnet | 33642 | 32294 |
| Stable_diffusion_v1_5_controlnet_lora | 34148 | 32454 |
| Stable_diffusion_v1.5_2lora | 35978 | 33195 |
| Stable Diffusion v2.1 base model with controlnet | 31183 | 29828 |
| Stable Diffusion v1.5 LCM Ipadaptor | 10645 | 5861 |
| Stable_diffusion_lcm_multiDiffusion | 29103.565 | 28126.856 |

CLIP and Embedding Models

CLIP Image Encoder

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| img_encoder_proj_clip_vit_large_dynamic | 567.61 | 257.388 |
| img_encoder_proj_openclip_vit_big_g_dynamic | 12035.52 | 3142.959 |
| img_encoder_proj_openclip_vit_h_dynamic | 1440.197 | 881.647 |

CLIP Text Encoder

| Model | Main Time (ms) | Inference Time (ms) |
|---|---|---|
| text_encoder_clip_vit_large | 455.079 | 38.993 |
| text_encoder_openclip_vit_h | 750.703 | 119.77 |