.. include:: /keyword.rst

=================
ONNX Runtime Demo
=================

The following examples demonstrate how to execute image classification tasks using the ``label_image.py`` script on Genio platforms.

CPU-Based Inference
===================

The following command executes an EfficientNet-Lite4 ONNX model on a Genio platform using the CPU.

.. prompt:: bash # auto

   # python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx --execution_provider CPUExecutionProvider

**Example Output:**

.. code-block:: text

   0.88291174 281: 'tabby cat'
   0.09353886 285: 'Egyptian cat'
   Time: 103.599ms

NPU-Accelerated Inference
=========================

Developers can offload inference to the NPU by specifying the ``NeuronExecutionProvider`` and supplying the mandatory Neuron hardware flags.

.. prompt:: bash # auto

   # python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx \
         --execution_provider NeuronExecutionProvider \
         --neuron_flag_use_fp16 1 \
         --neuron_flag_min_group_size 1

**Example Output:**

.. code-block:: text

   0.8833008 281: 'tabby cat'
   0.0930786 285: 'Egyptian cat'
   Time: 13.100ms

.. note::

   As shown in the examples, NPU acceleration significantly reduces inference time (from ~103 ms to ~13 ms, roughly 8× faster) compared to CPU execution on Genio platforms.
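
For reference, the core of what ``label_image.py`` does can be sketched with the ONNX Runtime Python API. The snippet below is a minimal sketch, not the script itself: the fixed NHWC input layout and the ``(x - 127) / 128`` normalization are assumptions that match the EfficientNet-Lite family, ``labels.txt`` is assumed to hold one label per line indexed by class id, and the ``--neuron_flag_*`` options shown above are handled internally by the script, so they are not reproduced here.

.. code-block:: python

   import time

   import numpy as np
   import onnxruntime as ort
   from PIL import Image

   # File names follow the examples above.
   MODEL, IMAGE, LABELS = "model.onnx", "kitten.jpg", "labels.txt"

   # Prefer the Neuron EP when the ONNX Runtime build includes it
   # (as on Genio platforms); otherwise fall back to the CPU EP.
   available = ort.get_available_providers()
   providers = [p for p in ("NeuronExecutionProvider", "CPUExecutionProvider")
                if p in available]
   session = ort.InferenceSession(MODEL, providers=providers)

   # Assumes a fixed NHWC input, as used by the EfficientNet-Lite family.
   inp = session.get_inputs()[0]
   _, height, width, _ = inp.shape

   img = Image.open(IMAGE).convert("RGB").resize((width, height))
   # (x - 127) / 128 is the EfficientNet-Lite preprocessing convention;
   # adjust for other models.
   data = ((np.asarray(img, dtype=np.float32) - 127.0) / 128.0)[np.newaxis, ...]

   start = time.perf_counter()
   scores = session.run(None, {inp.name: data})[0][0]
   elapsed_ms = (time.perf_counter() - start) * 1000.0

   # Assumes labels.txt holds one label per line, indexed by class id.
   with open(LABELS) as f:
       labels = [line.strip() for line in f]
   for i in np.argsort(scores)[-2:][::-1]:
       print(f"{scores[i]:.8f} {i}: {labels[i]!r}")
   print(f"Time: {elapsed_ms:.3f}ms")

Because the provider list is ordered, the same code exercises the NPU path when the Neuron execution provider is present in the ONNX Runtime build and falls back to the CPU path otherwise.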