==============================
TFLite(LiteRT) - Generative AI
==============================

This section provides performance and capability data for large language models (LLMs), vision-language models (VLMs), and image generation models on MediaTek Genio platforms.
It is intended as a **reference for benchmarking and platform capability validation**, not as a distribution channel for full training or deployment assets.

.. image:: /_asset/analytical-ai-gai-workflow.png
   :alt: Generative AI inference workflow on Genio platforms
   :align: center
   :width: 100%

.. note::

   For Generative AI workloads, this section provides **performance data and capability information only**.
   Access to the full Generative AI deployment toolkit (GAI toolkit) requires a non-disclosure agreement (NDA) with MediaTek.
   Once the NDA is in place, the toolkit can be downloaded from `NeuroPilot Document`_.

Model Categories
================

The generative models in this section are grouped into the following categories:

* **Large Language Models (LLMs)** – Text-only models for tasks such as dialogue, summarization, and code generation.
* **Vision-Language Models (VLMs)** – Multimodal models that process both images and text (for example, image captioning or visual question answering).
* **Image Generation and Enhancement** – Models such as Stable Diffusion and other diffusion or transformer-based pipelines used for image synthesis, editing, or super-resolution.
* **Embedding and Encoder Models** – Models like CLIP encoders for computing joint image-text embeddings for retrieval or ranking tasks.
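
As an illustration of the embedding category above, the sketch below ranks candidate images against a text query by cosine similarity of their embeddings. This is a minimal sketch: the vectors are toy stand-ins for the outputs a CLIP-style image and text encoder would produce on-device, and `rank_by_similarity` is a hypothetical helper, not part of any MediaTek toolkit.

```python
# Hypothetical illustration: ranking candidates by cosine similarity of
# precomputed CLIP-style embeddings. The vectors below are toy stand-ins;
# on-device they would come from the image and text encoder outputs.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_by_similarity(text_emb, image_embs):
    """Return candidate indices sorted by descending cosine similarity."""
    scores = [cosine(text_emb, img) for img in image_embs]
    return sorted(range(len(image_embs)), key=lambda i: -scores[i])

text_emb = [0.9, 0.1, 0.0]
image_embs = [
    [0.8, 0.2, 0.1],   # close to the query
    [0.0, 1.0, 0.0],   # nearly orthogonal to the query
    [0.9, 0.0, 0.1],   # closest to the query
]
print(rank_by_similarity(text_emb, image_embs))  # prints [2, 0, 1]
```

The same ranking logic applies regardless of embedding width; production pipelines typically normalize embeddings once and use a plain dot product.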

Supported Models on Genio Products
==================================

Platform-specific model lists and performance data are provided in the following pages:

.. toctree::
   :maxdepth: 1

   Genio 720 Performance
   Genio 520 Performance
   MT8893 Performance

Performance Notes and Limitations
---------------------------------

For Generative AI workloads, measured performance on **Genio 520** may be slightly lower than on **Genio 720**.
This gap is primarily due to **DRAM bandwidth differences** between the two platforms and can affect:

* Token generation speed for LLMs.
* End-to-end latency for diffusion-based image generation.
* Multimodal pipelines that exchange large intermediate tensors between subsystems.
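
For context on the tok/s figures used throughout the comparison tables, the sketch below shows the usual way such throughput numbers are derived: prompt (prefill) throughput divides input tokens by prefill time, while generative (decode) throughput divides emitted tokens by decode time. The timings are made-up placeholders, not measured Genio results.

```python
# Toy example of how tok/s figures are typically derived. Prompt (prefill)
# mode counts input tokens over prefill time; generative (decode) mode counts
# newly emitted tokens over decode time. All numbers below are placeholders.
prompt_tokens = 128
prefill_seconds = 0.4          # time to process the whole prompt
generated_tokens = 64
decode_seconds = 3.2           # time spent emitting new tokens

prompt_tok_per_s = prompt_tokens / prefill_seconds
decode_tok_per_s = generated_tokens / decode_seconds
print(f"prompt mode: {prompt_tok_per_s:.0f} tok/s")      # prompt mode: 320 tok/s
print(f"generative mode: {decode_tok_per_s:.0f} tok/s")  # generative mode: 20 tok/s
```

Prefill is compute-bound and batches the whole prompt, so its tok/s is typically much higher than decode tok/s, which processes one token at a time and is more sensitive to DRAM bandwidth.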

The following comparative data is provided for reference.


.. important::

   The tables in this section provide **representative** numbers only.
   To obtain the most accurate performance for a specific use case, developers must deploy and run the workload directly on the target platform under the intended system configuration.
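
Measuring on the target can be done with a minimal, runtime-agnostic timing harness such as the sketch below. `run_inference` is a hypothetical placeholder for the actual model invocation (for example, a LiteRT interpreter call) executed on the board under the intended system configuration.

```python
# Minimal, runtime-agnostic latency measurement sketch. `run_inference` is a
# hypothetical placeholder; on a Genio board it would wrap the actual model
# invocation (e.g. a LiteRT interpreter call) under the intended system load.
import time
import statistics

def benchmark(run_inference, warmup=3, iterations=10):
    """Time repeated invocations; report median and approximate p95 in ms."""
    for _ in range(warmup):              # discard cold-start iterations
        run_inference()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        # approximate 95th percentile from the sorted samples
        "p95_ms": samples[max(0, int(len(samples) * 0.95) - 1)],
    }

# Stand-in workload so the sketch is runnable as-is.
result = benchmark(lambda: sum(range(10000)))
print(sorted(result))  # prints ['median_ms', 'p95_ms']
```

Reporting a median plus a tail percentile, rather than a single average, makes the effect of background load and DVFS behavior on the device visible in the numbers.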

.. dropdown:: LLM Performance Comparison
   :icon: code-review

   .. csv-table:: Prompt Mode Comparison (Unit: tok/s)
      :file: /_asset/tables/LLM_PromptMode_Comparison.csv
      :header-rows: 1

   .. csv-table:: Generative Mode Comparison (Unit: tok/s)
      :file: /_asset/tables/LLM_GenerativeMode_Comparison.csv
      :header-rows: 1

.. dropdown:: VLM Performance Comparison
   :icon: eye

   .. csv-table:: ViT Inference Time (Unit: s)
      :file: /_asset/tables/VLM_ViT_InferenceTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Prompt Mode (Unit: tok/s)
      :file: /_asset/tables/VLM_PromptMode_Comparison.csv
      :header-rows: 1

   .. csv-table:: Generative Mode (Unit: tok/s)
      :file: /_asset/tables/VLM_GenerativeMode_Comparison.csv
      :header-rows: 1

.. dropdown:: Stable Diffusion Performance Comparison
   :icon: image

   .. csv-table:: Main Time Comparison (Unit: ms)
      :file: /_asset/tables/SD_MainTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Inference Time Comparison (Unit: ms)
      :file: /_asset/tables/SD_InferenceTime_Comparison.csv
      :header-rows: 1

.. dropdown:: CLIP Performance Comparison
   :icon: image

   .. csv-table:: Main Time Comparison (Unit: ms)
      :file: /_asset/tables/CLIP_MainTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Inference Time Comparison (Unit: ms)
      :file: /_asset/tables/CLIP_InferenceTime_Comparison.csv
      :header-rows: 1

Deployment and Source Models
============================

The generative models referenced in this section are primarily intended for **benchmarking and capability validation**.

* Model accuracy and qualitative output quality are **not addressed**.
* MediaTek does not redistribute original training datasets or checkpoint files for third-party or open-source models.
* For production deployment, developers must:

  * obtain the original models from their official sources,
  * follow the applicable licenses and usage terms, and
  * perform any required fine-tuning or post-training optimization for their application.