InceptionV3 Models
Overview
Inception V3 is a convolutional neural network (CNN) from Google’s Inception family, designed to allow deeper networks while keeping the parameter count under 25 million. It is widely used for image analysis and object detection, with applications ranging from general computer vision to life sciences such as leukemia research. It is often used pre-trained on ImageNet.
Getting Started
Follow these steps to use and convert Inception v3 models using PyTorch and TorchVision.
Install Required Libraries:
Ensure you have the necessary libraries installed:
pip install torch torchvision
Load and Convert Inception v3 Model:
Load a pretrained Inception v3 model using PyTorch and TorchVision, create a dummy input tensor for tracing, trace the model to convert it to TorchScript, and finally save the traced model.
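import torch
import torchvision

# Load Inception v3 with pretrained ImageNet weights.
model = torchvision.models.inception_v3(pretrained=True)
# Dummy input used only to record the traced graph.
trace_data = torch.randn(1, 3, 224, 224)
trace_model = torch.jit.trace(model.cpu().eval(), trace_data)
# Save under the file name that the converter commands expect.
torch.jit.save(trace_model, 'Inception_v3.pt')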
How It Works
Before you begin, ensure that the NeuroPilot Converter Tool is installed.
Quant8 Conversion Process
Generate Calibration Data:
The following script creates a directory named data and generates 100 batches of random input data, each saved as a .npy file. This data is used for calibration during the quantization process.
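import os
import numpy as np

# Create the calibration directory (no error if it already exists).
os.makedirs('data', exist_ok=True)
# Write 100 random float32 batches shaped like the model input.
for i in range(100):
    data = np.random.randn(1, 3, 224, 224).astype(np.float32)
    np.save('data/batch_{}.npy'.format(i), data)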
Convert to Quantized TFLite Format:
Use the following command to convert the model to a quantized TFLite format using the generated calibration data:
mtk_pytorch_converter \
--input_script_module_file=Inception_v3.pt \
--output_file=Inception_v3_ptq_quant.tflite \
--input_shapes=1,3,224,224 \
--quantize=True \
--input_value_ranges=-1,1 \
--calibration_data_dir=data/ \
--calibration_data_regexp=batch_.*\.npy
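To see which files the `--calibration_data_regexp` pattern above would select, the pattern can be tried against the generated file names with Python's `re` module. This is a minimal sketch, not converter code: `select_calibration_files` is a hypothetical helper, and it assumes the pattern is applied to the whole file name (the converter's exact matching rule is not stated here).

```python
import re

# Pattern copied from the converter command above.
CALIBRATION_REGEXP = r"batch_.*\.npy"


def select_calibration_files(file_names, pattern=CALIBRATION_REGEXP):
    """Return the names that fully match the pattern, in sorted order.

    Assumption: the whole file name must match. This helper is illustrative
    only and is not part of the NeuroPilot tooling.
    """
    compiled = re.compile(pattern)
    return sorted(name for name in file_names if compiled.fullmatch(name))


if __name__ == "__main__":
    # Names as written by the calibration script, plus two that should be skipped.
    names = ["batch_{}.npy".format(i) for i in range(3)] + ["notes.txt", "batch_0.npz"]
    print(select_calibration_files(names))
```

Running a check like this against `os.listdir('data')` before conversion is a quick way to confirm that the calibration directory is not empty from the pattern's point of view.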
FP32 Conversion Process
To convert the model to a non-quantized (FP32) TFLite format, use the following command:
mtk_pytorch_converter \
--input_script_module_file=Inception_v3.pt \
--output_file=Inception_v3.tflite \
--input_shapes=1,3,224,224
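When both variants are converted from a script, the two command lines can be assembled in one place. The sketch below only builds the argument list from the flags shown in the commands above; `build_converter_command` is a hypothetical helper, and the fixed `-1,1` value range and `batch_.*\.npy` pattern are simply the values from the Quant8 command.

```python
import shlex


def build_converter_command(script_module, output_file,
                            input_shape=(1, 3, 224, 224),
                            calibration_dir=None):
    """Return the mtk_pytorch_converter argument list.

    Passing calibration_dir selects the quantized (Quant8) variant;
    leaving it as None gives the FP32 command.
    """
    command = [
        "mtk_pytorch_converter",
        "--input_script_module_file={}".format(script_module),
        "--output_file={}".format(output_file),
        "--input_shapes={}".format(",".join(str(dim) for dim in input_shape)),
    ]
    if calibration_dir is not None:
        command += [
            "--quantize=True",
            "--input_value_ranges=-1,1",
            "--calibration_data_dir={}".format(calibration_dir),
            r"--calibration_data_regexp=batch_.*\.npy",
        ]
    return command


if __name__ == "__main__":
    # Print the FP32 command instead of running it.
    print(shlex.join(build_converter_command("Inception_v3.pt", "Inception_v3.tflite")))
```

The returned list can be handed to `subprocess.run(command, check=True)`. Because the arguments are passed as a list, no shell is involved and the regular expression needs no extra quoting.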
Model Details
General Information
Property | Value |
---|---|
Category | Classification |
Input Size | 224x224 |
GFLOPS | 1.50 |
#Params (M) | 6.62 |
Training Framework | PyTorch |
Inference Framework | TFLite |
Quant8 Model package | |
Float32 Model package | |
Model Properties
Format: TensorFlow Lite v3
Description: Exported by NeuroPilot converter v7.14.1+release
Quant8
Inputs
Property | Value |
---|---|
Name | x.2 |
Tensor | int8[1,3,224,224] |
Identifier | 154 |
Quantization | Linear |
Quantization Range | -1.0039 ≤ 0.0078 * q ≤ 0.9961 |
Outputs
Property | Value |
---|---|
Name | 1707 |
Tensor | int8[1,1000] |
Identifier | 73 |
Quantization | Linear |
Quantization Range | -2.0561 ≤ 0.0196 * (q + 23) ≤ 2.9372 |
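The quantization ranges listed for the Quant8 input and output follow the usual linear (affine) mapping `real = scale * (q - zero_point)` for int8 values of `q` between -128 and 127. The sketch below recovers the scale and zero point from the printed range endpoints and converts values in both directions. The function names are hypothetical, and since the scales in the tables are rounded to four decimals, the derived values agree with them only to that precision.

```python
def linear_quant_params(real_min, real_max, q_min=-128, q_max=127):
    """Derive (scale, zero_point) so that q_min..q_max spans real_min..real_max."""
    scale = (real_max - real_min) / (q_max - q_min)
    zero_point = round(q_min - real_min / scale)
    return scale, zero_point


def dequantize(q, scale, zero_point):
    """Map an int8 value back to a real value."""
    return scale * (q - zero_point)


def quantize(real, scale, zero_point, q_min=-128, q_max=127):
    """Map a real value to the nearest int8 value, clamped to the int8 range."""
    q = round(real / scale) + zero_point
    return max(q_min, min(q_max, q))


if __name__ == "__main__":
    # Range endpoints copied from the Quant8 input and output tables.
    in_scale, in_zero_point = linear_quant_params(-1.0039, 0.9961)
    out_scale, out_zero_point = linear_quant_params(-2.0561, 2.9372)
    print("input :", in_scale, in_zero_point)
    print("output:", out_scale, out_zero_point)
```

For classification, the index of the largest int8 output already identifies the top class, because the mapping is monotonic. Dequantization is only needed when the real-valued scores themselves matter.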
FP32
Inputs
Property | Value |
---|---|
Name | x.2 |
Tensor | float32[1,3,224,224] |
Identifier | 157 |
Outputs
Property | Value |
---|---|
Name | 1707 |
Tensor | float32[1,1000] |
Identifier | 8 |
Performance Benchmarks
InceptionV3-quant8
Run model (.tflite) 10 times | CPU (Thread:8) | GPU | ARMNN(GpuAcc) | ARMNN(CpuAcc) | Neuron Stable Delegate(APU) | APU(MDLA) | APU(VPU) |
---|---|---|---|---|---|---|---|
G350 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
G510 | N/A | N/A | N/A | N/A | N/A | 5.59 ms | N/A |
G700 | N/A | N/A | N/A | N/A | N/A | 3.04 ms | N/A |
G1200 | N/A | N/A | N/A | N/A | N/A | 5.04 ms | N/A |
InceptionV3-fp32
Run model (.tflite) 10 times | CPU (Thread:8) | GPU | ARMNN(GpuAcc) | ARMNN(CpuAcc) | Neuron Stable Delegate(APU) | APU(MDLA) | APU(VPU) |
---|---|---|---|---|---|---|---|
G350 | 405.697 ms (Thread:4) | 325.574 ms | 275.707 ms | N/A | N/A | N/A | 1399.8 ms |
G510 | 319.684 ms | 112.367 ms | 98.235 ms | 80.029 ms | 17.199 ms | 17.68 ms | N/A |
G700 | 76.034 ms | 83.642 ms | 70.822 ms | 68.439 ms | 12.263 ms | 12.04 ms | N/A |
G1200 | 70.697 ms | 68.656 ms | 49.117 ms | 41.429 ms | 11.757 ms | 11.04 ms | N/A |
Widespread: CPU only, light workload.
Performance: CPU and GPU, medium workload.
Ultimate: CPU, GPU, and APUs, heavy workload.
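To compare the Quant8 and FP32 results, the APU(MDLA) column is the only one with measurements in both benchmark tables. The sketch below computes the FP32-to-Quant8 latency ratio per platform from those figures. The dictionary layout and function names are hypothetical, and the ratios only summarize the reported numbers; they say nothing about accuracy after quantization.

```python
# APU(MDLA) latencies in milliseconds, copied from the two benchmark tables.
MDLA_LATENCY_MS = {
    "G510": {"fp32": 17.68, "quant8": 5.59},
    "G700": {"fp32": 12.04, "quant8": 3.04},
    "G1200": {"fp32": 11.04, "quant8": 5.04},
}


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def quant8_speedups(latencies=MDLA_LATENCY_MS):
    """Return the FP32/Quant8 latency ratio per platform, rounded to 2 decimals."""
    return {
        platform: round(speedup(values["fp32"], values["quant8"]), 2)
        for platform, values in latencies.items()
    }


if __name__ == "__main__":
    for platform, ratio in quant8_speedups().items():
        print("{}: {}x".format(platform, ratio))
```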