==============================
TFLite(LiteRT) - Generative AI
==============================

The Generative Model section provides performance and capability data for large language models (LLMs), vision-language models (VLMs), and image generation models on MediaTek Genio platforms. This section is intended as a **reference for benchmarking and platform capability validation**, not as a distribution channel for full training or deployment assets.

.. image:: /_asset/analytical-ai-gai-workflow.png
   :alt: Generative AI workflow on Genio platforms
   :align: center
   :width: 100%
.. note::

   For Generative AI workloads, this section provides **performance data and capability information only**. Access to the full Generative AI deployment toolkit (GAI toolkit) requires a non-disclosure agreement (NDA) with MediaTek. After signing an NDA, the toolkit can be downloaded from `NeuroPilot Document`_.

Model Categories
================

The generative models in this section are grouped into the following categories:

* **Large Language Models (LLMs)** – Text-only models for tasks such as dialogue, summarization, and code generation.
* **Vision-Language Models (VLMs)** – Multimodal models that process both images and text (for example, image captioning or visual question answering).
* **Image Generation and Enhancement** – Models such as Stable Diffusion and other diffusion- or transformer-based pipelines used for image synthesis, editing, or super-resolution.
* **Embedding and Encoder Models** – Models such as CLIP encoders that compute joint image-text embeddings for retrieval or ranking tasks.

Supported Models on Genio Products
==================================

Platform-specific model lists and performance data are provided in the following pages:

.. toctree::
   :maxdepth: 1

   Genio 720 Performance
   Genio 520 Performance
   MT8893 Performance

Performance Notes and Limitations
---------------------------------

For Generative AI workloads, measured performance on **Genio 520** may be slightly lower than on **Genio 720**. This gap is primarily due to **DRAM bandwidth differences** between the two platforms and may affect:

* Token generation speed for LLMs.
* End-to-end latency for diffusion-based image generation.
* Multimodal pipelines that exchange large intermediate tensors between subsystems.

The following comparative data is provided for reference.

.. important::

   The tables in this section provide **representative** numbers only.
   To obtain the most accurate performance for a specific use case, developers must deploy and run the workload directly on the target platform under the intended system configuration.

.. dropdown:: LLM Performance Comparison
   :icon: code-review

   .. csv-table:: Prompt Mode Comparison (Unit: tok/s)
      :file: /_asset/tables/LLM_PromptMode_Comparison.csv
      :header-rows: 1

   .. csv-table:: Generative Mode Comparison (Unit: tok/s)
      :file: /_asset/tables/LLM_GenerativeMode_Comparison.csv
      :header-rows: 1

.. dropdown:: VLM Performance Comparison
   :icon: eye

   .. csv-table:: ViT Inference Time (Unit: s)
      :file: /_asset/tables/VLM_ViT_InferenceTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Prompt Mode (Unit: tok/s)
      :file: /_asset/tables/VLM_PromptMode_Comparison.csv
      :header-rows: 1

   .. csv-table:: Generative Mode (Unit: tok/s)
      :file: /_asset/tables/VLM_GenerativeMode_Comparison.csv
      :header-rows: 1

.. dropdown:: Stable Diffusion Performance Comparison
   :icon: image

   .. csv-table:: Main Time Comparison (Unit: ms)
      :file: /_asset/tables/SD_MainTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Inference Time Comparison (Unit: ms)
      :file: /_asset/tables/SD_InferenceTime_Comparison.csv
      :header-rows: 1

.. dropdown:: CLIP Performance Comparison
   :icon: image

   .. csv-table:: Main Time Comparison (Unit: ms)
      :file: /_asset/tables/CLIP_MainTime_Comparison.csv
      :header-rows: 1

   .. csv-table:: Inference Time Comparison (Unit: ms)
      :file: /_asset/tables/CLIP_InferenceTime_Comparison.csv
      :header-rows: 1

Deployment and Source Models
============================

The generative models referenced in this section are primarily intended for **benchmarking and capability validation**.

* Model accuracy and qualitative output quality are **not addressed**.
* MediaTek does not redistribute original training datasets or checkpoint files for third-party or open-source models.
* For production deployment, developers must:

  * Obtain the original models from their official sources,
  * Follow the applicable licenses and usage terms, and
  * Perform any required fine-tuning or post-training optimization for their application.
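Because the tables in this section are representative only, the recommended practice is to measure throughput directly on the target device. The following is a minimal sketch of such a measurement loop for decode-phase throughput (tok/s). Note that ``generate_step`` is a hypothetical placeholder for one decode step of the runtime under test (for example, one LiteRT interpreter invocation), not a MediaTek API; it is simulated here with a fixed delay.

```python
# Minimal sketch of on-target tok/s measurement (see the note above).
# `generate_step` is a hypothetical stand-in for one LLM decode step
# (e.g. one LiteRT interpreter invocation); it is NOT a MediaTek API.
import time
import statistics


def measure_tok_per_s(generate_step, num_tokens=32, warmup=4):
    """Return decode throughput in tokens/s, based on the median step latency."""
    for _ in range(warmup):          # exclude one-time init/caching effects
        generate_step()
    durations = []
    for _ in range(num_tokens):
        t0 = time.perf_counter()
        generate_step()              # one token's worth of work
        durations.append(time.perf_counter() - t0)
    return 1.0 / statistics.median(durations)


if __name__ == "__main__":
    # Simulate a ~10 ms decode step; a real harness would invoke the runtime.
    rate = measure_tok_per_s(lambda: time.sleep(0.010))
    print(f"~{rate:.0f} tok/s")
```

Using the median rather than the mean keeps occasional scheduler or thermal spikes from skewing the reported figure; the same loop can be repeated under different system configurations (governor settings, memory pressure) to reproduce the conditions described in the performance notes above.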