============================
ONNX Runtime - Analytical AI
============================

.. toctree::
   :hidden:

   Onnx Development
   Accelerating Onnxruntime on NPU
   Onnxruntime Demo
   TAO on Genio

.. include:: /keyword.rst

`ONNX Runtime <https://onnxruntime.ai/>`_ is a high-performance, cross-platform engine for running and training machine learning models in the **Open Neural Network Exchange (ONNX)** format. It accelerates inference and training for models from popular frameworks such as PyTorch and TensorFlow, leveraging hardware accelerators and graph optimizations for optimal performance.

ONNX Runtime on Genio
---------------------

Genio platforms execute ONNX models efficiently through pre-integrated support. The currently supported version on |IOT-YOCTO| is **v1.20.2**. Starting from Rity v25.1, the Board Support Package (BSP) provides prebuilt ONNX Runtime binaries.

.. note::

   The ``rity-demo-image`` includes prebuilt ONNX Runtime packages by default.

ONNX Runtime Workflow on Yocto
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following figure illustrates the analytical AI workflow for ONNX Runtime on Genio Yocto platforms. It shows the path from the ONNX model to hardware execution through ONNX Runtime and its associated execution providers.

.. image:: /_asset/analytical-ai-onnx-workflow.png
   :alt: ONNX Runtime analytical AI workflow on Genio Yocto
   :align: center
   :width: 800px
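The workflow begins with an ONNX model. As a brief illustration of where such a model can come from, the sketch below exports a PyTorch model to ONNX on a development host. It assumes ``torch`` and ``torchvision`` are installed on the host (they are not part of the Genio BSP described here); the MobileNetV2 model and the 224x224 input shape are placeholders, not a requirement of the platform.

.. code-block:: python

   # Sketch only: export a PyTorch model to ONNX on the development host.
   # torch/torchvision, the MobileNetV2 model, and the 224x224 input shape
   # are illustrative assumptions; substitute your own model and shape.
   import torch
   import torchvision

   model = torchvision.models.mobilenet_v2(weights=None)
   model.eval()

   dummy_input = torch.randn(1, 3, 224, 224)  # example NCHW input
   torch.onnx.export(
       model,
       dummy_input,
       "your_model.onnx",
       input_names=["input"],
       output_names=["output"],
       opset_version=17,
   )

The resulting ``your_model.onnx`` file can then be copied to the Genio board and executed as described in the sections below.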
Integrating ONNX Runtime into a Custom Image
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add ONNX Runtime to a custom Rity image, perform the following steps:

1. **Initialize the repository:**

   .. code-block:: shell

      repo init -u https://gitlab.com/mediatek/aiot/bsp/manifest.git -b refs/tags/rity-scarthgap-v25.1

2. **Synchronize the repository:**

   .. code-block:: shell

      repo sync -j 12

3. **Modify the configuration:** Add the following line to the ``local.conf`` file:

   .. code-block:: shell

      IMAGE_INSTALL:append = " onnxruntime-prebuilt "

4. **Build the image:**

   .. code-block:: shell

      bitbake rity-demo-image

Basic Inference on Genio
------------------------

Once ONNX Runtime is integrated, the developer can execute models using the Python API. The following script provides a template for benchmarking ONNX models on the CPU using the ``XnnpackExecutionProvider``.

.. code-block:: python

   import onnxruntime as ort
   import numpy as np
   import time

   def load_model(model_path):
       session_options = ort.SessionOptions()
       session_options.intra_op_num_threads = 4
       session_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
       # Use XNNPACK for optimized CPU execution; fall back to the default CPU provider.
       execution_providers = ['XnnpackExecutionProvider', 'CPUExecutionProvider']
       return ort.InferenceSession(model_path,
                                   sess_options=session_options,
                                   providers=execution_providers)

   def benchmark_model(model_path, num_iterations=100):
       session = load_model(model_path)
       input_meta = session.get_inputs()[0]
       # Replace symbolic (dynamic) dimensions with 1 so random input can be generated.
       input_shape = [dim if isinstance(dim, int) else 1 for dim in input_meta.shape]
       input_data = {input_meta.name: np.random.random(input_shape).astype(np.float32)}

       # Warm up once so one-time initialization costs do not skew the measurement.
       session.run(None, input_data)

       total_time = 0.0
       for _ in range(num_iterations):
           start_time = time.perf_counter()
           session.run(None, input_data)
           total_time += time.perf_counter() - start_time

       print(f"Average inference time: {total_time / num_iterations:.6f} seconds")

   if __name__ == "__main__":
       benchmark_model("your_model.onnx")
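Before interpreting benchmark numbers, it is worth confirming which execution providers the runtime actually enables on the target. A minimal sketch, reusing the same placeholder model path as above:

.. code-block:: python

   import onnxruntime as ort

   session = ort.InferenceSession(
       "your_model.onnx",  # placeholder path
       providers=["XnnpackExecutionProvider", "CPUExecutionProvider"],
   )
   # Providers enabled for this session, in priority order.
   print(session.get_providers())
   # All providers compiled into this ONNX Runtime build.
   print(ort.get_available_providers())

If ``XnnpackExecutionProvider`` does not appear in either list, measurements reflect the default ``CPUExecutionProvider`` instead, so checking both lists first makes results easier to interpret.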
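Beyond benchmarking with random tensors, the same session can run real data. The following sketch assumes a typical 224x224 RGB image classifier and that ``Pillow`` is available on the target; the preprocessing (scaling to [0, 1], NCHW layout) is illustrative and must match your model's actual input contract.

.. code-block:: python

   # Sketch only: single inference with real input instead of random data.
   # Pillow availability, the 224x224 RGB input, and the preprocessing are
   # assumptions; adapt them to your model.
   import numpy as np
   import onnxruntime as ort
   from PIL import Image

   session = ort.InferenceSession(
       "your_model.onnx",  # placeholder path
       providers=["XnnpackExecutionProvider", "CPUExecutionProvider"],
   )
   input_name = session.get_inputs()[0].name

   image = Image.open("test.jpg").convert("RGB").resize((224, 224))
   data = np.asarray(image, dtype=np.float32) / 255.0   # HWC, scaled to [0, 1]
   data = np.transpose(data, (2, 0, 1))[np.newaxis, :]  # NCHW with batch dimension

   outputs = session.run(None, {input_name: data})
   print("Top-1 class index:", int(np.argmax(outputs[0])))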