ONNX Runtime Demo

The following examples show how to run image classification with the label_image.py script on Genio platforms.

CPU-Based Inference

The following command executes an EfficientNet-Lite4 ONNX model on a Genio platform using the CPU.

python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx --execution_provider CPUExecutionProvider

Example Output:

0.88291174  281: 'tabby cat'
0.09353886  285: 'Egyptian cat'
Time: 103.599ms
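
For reference, the following is a minimal sketch of what label_image.py roughly does for CPU inference, using the standard ONNX Runtime Python API. The preprocessing details (224x224 input size, NHWC layout, and normalization to roughly [-1, 1]) are assumptions for EfficientNet-Lite4 and may differ from the actual script.

import numpy as np
import onnxruntime as ort
from PIL import Image

# Create an inference session on the CPU execution provider.
session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])

# Preprocess the image (assumed input format for EfficientNet-Lite4:
# 224x224 NHWC float32, normalized to roughly [-1, 1]).
img = Image.open("kitten.jpg").convert("RGB").resize((224, 224))
data = (np.asarray(img, dtype=np.float32) - 127.0) / 128.0
data = np.expand_dims(data, axis=0)  # add batch dim -> (1, 224, 224, 3)

input_name = session.get_inputs()[0].name
scores = session.run(None, {input_name: data})[0][0]

# Print the top-2 class scores and labels, as in the example output.
labels = [line.strip() for line in open("labels.txt")]
for idx in scores.argsort()[-2:][::-1]:
    print(f"{scores[idx]:.8f}  {idx}: '{labels[idx]}'")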

NPU-Accelerated Inference

To offload inference to the NPU, specify the NeuronExecutionProvider and pass the required Neuron flags, as shown below.

python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx \
    --execution_provider NeuronExecutionProvider \
    --neuron_flag_use_fp16 1 \
    --neuron_flag_min_group_size 1

Example Output:

0.8833008  281: 'tabby cat'
0.0930786  285: 'Egyptian cat'
Time: 13.100ms
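
The equivalent session setup in the ONNX Runtime Python API is sketched below. The provider-option keys are assumptions inferred from the script's --neuron_flag_* arguments and may differ in MediaTek's ONNX Runtime build.

import onnxruntime as ort

# Assumed option keys, mirroring the script's --neuron_flag_* arguments.
neuron_options = {
    "use_fp16": "1",        # assumed key for --neuron_flag_use_fp16
    "min_group_size": "1",  # assumed key for --neuron_flag_min_group_size
}

# List the NPU provider first; CPU serves as a fallback for any
# operators the NPU provider does not support.
session = ort.InferenceSession(
    "model.onnx",
    providers=[("NeuronExecutionProvider", neuron_options),
               "CPUExecutionProvider"],
)

# Confirm which providers were actually enabled for this session.
print(session.get_providers())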

Note

As shown in the examples, NPU acceleration significantly reduces inference time compared to CPU execution on Genio platforms (from ~103 ms to ~13 ms, roughly an 8x speedup).
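
To reproduce such a timing comparison yourself, the minimal sketch below times session.run over several iterations, assuming a session and input feed built as in the earlier sketches. Warm-up runs are included because NPU providers typically perform one-time graph compilation on the first inference.

import time

def time_inference(session, feed, warmup=3, runs=10):
    # Warm-up runs let the execution provider finish any one-time
    # graph compilation or caching before timing begins.
    for _ in range(warmup):
        session.run(None, feed)
    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, feed)
    return (time.perf_counter() - start) / runs * 1000.0  # ms per run

# Hypothetical usage, with a session built as in the sketches above:
# print(f"Time: {time_inference(session, {input_name: data}):.3f}ms")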