======================
Performance Benchmarks
======================

.. raw:: html

This page is a centralized reference for evaluating AI inference performance across MediaTek Genio platforms. It aggregates benchmarking results for analytical AI and generative AI workloads across multiple inference frameworks, such as TFLite (LiteRT) and ONNX Runtime.

.. important::

   For more platform-specific details and comprehensive performance data, please refer to the :doc:`Model Zoo `.

AI Supporting Scope
===================

The following table summarizes the AI capabilities and framework support across different MediaTek Genio platforms.

.. csv-table:: AI Supporting Scope Across Different Platforms
   :file: /_asset/tables/ml-platform-soc-ai-framework.csv
   :width: 100%

TFLite(LiteRT) - Analytical AI
==============================

The following tables list the validated TFLite analytical models and their performance across Genio platforms. The statistics were measured using **offline inference** with **performance mode enabled**.

.. csv-table:: Models for Detection
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-detection.csv
   :width: 100%

.. csv-table:: Models for Classification
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-classification.csv
   :width: 100%

.. csv-table:: Models for Recognition
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-recognition.csv
   :width: 100%

TFLite(LiteRT) - Generative AI
==============================

For Generative AI workloads, the following tables provide representative performance data for reference and platform capability validation.

LLM Performance Comparison
--------------------------

.. csv-table:: Prompt Mode Comparison (Unit: tok/s)
   :file: /_asset/tables/LLM_PromptMode_Comparison.csv
   :header-rows: 1

.. csv-table:: Generative Mode Comparison (Unit: tok/s)
   :file: /_asset/tables/LLM_GenerativeMode_Comparison.csv
   :header-rows: 1

VLM Performance Comparison
--------------------------

.. csv-table:: ViT Inference Time (Unit: s)
   :file: /_asset/tables/VLM_ViT_InferenceTime_Comparison.csv
   :header-rows: 1

.. csv-table:: Prompt Mode (Unit: tok/s)
   :file: /_asset/tables/VLM_PromptMode_Comparison.csv
   :header-rows: 1

.. csv-table:: Generative Mode (Unit: tok/s)
   :file: /_asset/tables/VLM_GenerativeMode_Comparison.csv
   :header-rows: 1

Stable Diffusion Performance Comparison
---------------------------------------

.. csv-table:: Main Time Comparison (Unit: ms)
   :file: /_asset/tables/SD_MainTime_Comparison.csv
   :header-rows: 1

.. csv-table:: Inference Time Comparison (Unit: ms)
   :file: /_asset/tables/SD_InferenceTime_Comparison.csv
   :header-rows: 1

CLIP Performance Comparison
---------------------------

.. csv-table:: Main Time Comparison (Unit: ms)
   :file: /_asset/tables/CLIP_MainTime_Comparison.csv
   :header-rows: 1

.. csv-table:: Inference Time Comparison (Unit: ms)
   :file: /_asset/tables/CLIP_InferenceTime_Comparison.csv
   :header-rows: 1

ONNX Runtime - Analytical AI
============================

The following tables list ONNX models validated on Genio platforms. Measurements were obtained using the **NPU Execution Provider** (where available) with **performance mode enabled**.

.. csv-table:: Models for TAO Related
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-onnx-tao.csv
   :width: 100%

.. csv-table:: Models for Detection
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-onnx-detection.csv
   :width: 100%

.. csv-table:: Models for Classification
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-onnx-classification.csv
   :width: 100%

.. csv-table:: Models for Recognition
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-onnx-recognition.csv
   :width: 100%

.. csv-table:: Models for Robotics
   :class: hide-last-column longtable
   :file: /_asset/tables/ml-model-hub-onnx-robotic.csv
   :width: 100%

Performance Notes
=================

Performance can vary depending on:

* The specific Genio platform and hardware configuration.
* The version of the board image and evaluation kit (EVK).
* The selected backend and model variant.

The figures on this page are therefore reference values. To obtain the most accurate performance numbers for your use case, run your application directly on the target platform.
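
When measuring on the target platform, a small timing harness can help produce comparable numbers. The following is a minimal sketch only, not the method used to produce the tables on this page. The function ``benchmark`` and the workload ``placeholder_inference`` are hypothetical names introduced for illustration; replace the placeholder with a single inference call of your own application.

.. code-block:: python

   import time


   def benchmark(run_once, warmup=5, iterations=50):
       """Time a zero-argument callable and return latency statistics in ms.

       ``run_once`` is assumed to perform exactly one inference.
       """
       if iterations < 1:
           raise ValueError("iterations must be at least 1")

       # Warm-up runs are discarded so that one-time costs (model loading,
       # cache population) do not skew the measured latencies.
       for _ in range(warmup):
           run_once()

       latencies_ms = []
       for _ in range(iterations):
           start = time.perf_counter()
           run_once()
           latencies_ms.append((time.perf_counter() - start) * 1000.0)

       latencies_ms.sort()
       n = len(latencies_ms)
       # Median: middle element, or the mean of the two middle elements.
       if n % 2:
           median = latencies_ms[n // 2]
       else:
           median = (latencies_ms[n // 2 - 1] + latencies_ms[n // 2]) / 2.0
       # Nearest-rank 90th percentile, computed with integer arithmetic.
       p90_index = (9 * n + 9) // 10 - 1

       return {
           "iterations": n,
           "min_ms": latencies_ms[0],
           "median_ms": median,
           "p90_ms": latencies_ms[p90_index],
           "max_ms": latencies_ms[-1],
       }


   def placeholder_inference():
       # Hypothetical stand-in workload; replace with one real inference call.
       sum(i * i for i in range(10_000))


   if __name__ == "__main__":
       for name, value in benchmark(placeholder_inference).items():
           print(f"{name}: {value}")

Reporting the median and a high percentile rather than a single run makes results less sensitive to scheduling noise. Record the platform, board image version, backend, and model variant alongside each measurement, since each of these affects the result.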