Neuron Runtime API

The Neuron Runtime API provides a set of API functions that users can invoke from within a C/C++ program to create a runtime environment, parse a compiled model file, and perform on-device network inference.

NeuroPilot 6 includes two versions of the Neuron Runtime API.

Runtime API Versions

Neuron Runtime V1

  • For single-task, sequential execution (synchronous inference).

  • Inference API function: NeuronRuntime_inference (synchronous function call).

  • Use Neuron Runtime V1 if inferences never overlap in time.

Neuron Runtime V2

  • For parallel multi-task execution (asynchronous inference).

  • Inference API function: NeuronRuntimeV2_enqueue (asynchronous function call).

  • Use Neuron Runtime V2 if the next inference may start before the previous inference has finished.

  • Runtime V2 might increase power consumption, because parallel execution uses more hardware resources.

  • Runtime V2 might increase memory footprint, because each parallel task maintains its own working buffer.