ONNX Runtime Demo
The following examples show how to run image classification with the label_image.py script on Genio platforms.
CPU-Based Inference
The following command executes an EfficientNet-Lite4 ONNX model on a Genio platform using the CPU.
python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx --execution_provider CPUExecutionProvider
Example Output:
0.88291174 281: 'tabby cat'
0.09353886 285: 'Egyptian cat'
Time: 103.599ms
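For reference, the command above is roughly equivalent to the following minimal sketch using the onnxruntime Python API. The preprocessing (224x224 RGB input in NHWC layout, normalized to roughly [-1, 1]) is an assumption based on the standard EfficientNet-Lite4 model; label_image.py may differ in detail.

import numpy as np
import onnxruntime as ort
from PIL import Image

# Load and preprocess the input image. The 224x224 NHWC layout and
# (x - 127) / 128 normalization are assumptions for EfficientNet-Lite4.
img = Image.open("kitten.jpg").convert("RGB").resize((224, 224))
x = (np.asarray(img, dtype=np.float32) - 127.0) / 128.0
x = np.expand_dims(x, axis=0)  # shape: (1, 224, 224, 3)

# Run inference on the CPU execution provider.
session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
scores = session.run(None, {input_name: x})[0].squeeze()

# Print the top-2 classes, assuming labels.txt has one label per line.
labels = [line.strip() for line in open("labels.txt")]
for idx in scores.argsort()[-2:][::-1]:
    print(f"{scores[idx]:.8f} {idx}: '{labels[idx]}'")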
NPU-Accelerated Inference
You can offload inference to the NPU by specifying the NeuronExecutionProvider together with the required Neuron flags.
python3 label_image.py -i kitten.jpg -l labels.txt -m model.onnx \
--execution_provider NeuronExecutionProvider \
--neuron_flag_use_fp16 1 \
--neuron_flag_min_group_size 1
Example Output:
0.8833008 281: 'tabby cat'
0.0930786 285: 'Egyptian cat'
Time: 13.100ms
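In your own scripts, the Neuron provider can be selected through the same onnxruntime session API. A minimal sketch follows; the provider-option keys are assumptions that mirror the script's CLI flags, so verify the exact names against the ONNX Runtime build in your BSP.

import onnxruntime as ort

# Select the Neuron EP with a CPU fallback. The option keys
# "use_fp16" and "min_group_size" are assumed names mirroring the
# --neuron_flag_* CLI flags; check your ONNX Runtime build for the
# actual spelling.
session = ort.InferenceSession(
    "model.onnx",
    providers=["NeuronExecutionProvider", "CPUExecutionProvider"],
    provider_options=[
        {"use_fp16": "1", "min_group_size": "1"},
        {},  # no options for the CPU fallback
    ],
)
print(session.get_providers())  # confirms which providers were registered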
Note
As the examples show, NPU acceleration reduces inference time from ~103 ms to ~13 ms for this model, roughly an 8x speedup over CPU execution on Genio platforms.
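To reproduce the timing comparison yourself, you can time session.run directly (reusing session, input_name, and x from the sketches above). A warm-up run is included because the first inference may include one-time graph compilation and NPU initialization:

import time

# Warm up: the first run can include one-time setup that would
# otherwise skew the measurement.
session.run(None, {input_name: x})

start = time.perf_counter()
session.run(None, {input_name: x})
print(f"Time: {(time.perf_counter() - start) * 1e3:.3f}ms")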