ResNet Models

Overview

ResNet (Residual Network) introduces a residual learning framework that makes it easier to train networks much deeper than earlier architectures. Instead of learning a direct mapping, each group of layers is reformulated to learn a residual function relative to its input. Empirically, residual networks are easier to optimize and can gain accuracy from considerably increased depth.
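As a minimal sketch of the residual idea (not the torchvision implementation), a residual block computes y = F(x) + x, so the stacked layers only need to learn the residual F. The names `residual_block` and `zero_residual` are illustrative, and plain Python lists stand in for tensors.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, where F is the residual function learned by the block."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("the residual must have the same shape as the input")
    # The identity shortcut adds the input back onto the residual.
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x: Vector) -> Vector:
    """Hypothetical stand-in for layers whose output is all zeros."""
    return [0.0] * len(x)


x = [1.0, -2.0, 0.5]
identity_out = residual_block(x, zero_residual)
print(identity_out)  # the block reduces to the identity mapping
```

When F outputs zeros, the block passes its input through unchanged, which gives one intuition for why stacking more such blocks need not make optimization harder.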

Getting Started

Follow these steps to use and convert ResNet models using PyTorch and TorchVision.

  1. Install Required Libraries:

    Ensure you have the necessary libraries installed:

    pip install torch torchvision
    
  2. Load and Convert ResNet Model:

    Load a pretrained ResNet model using PyTorch and TorchVision, create a dummy input tensor for tracing, trace the model to convert it to TorchScript, and finally save the traced model.

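    import torch
    import torchvision

    # Load ResNet-18 with pretrained ImageNet weights.
    # Note: torchvision 0.13+ deprecates `pretrained=True` in favor of the `weights=` argument.
    model = torchvision.models.resnet18(pretrained=True)
    # Dummy NCHW input, used only to record the traced graph.
    trace_data = torch.randn(1, 3, 224, 224)
    # Trace on the CPU in eval mode so batch normalization uses its running statistics.
    trace_model = torch.jit.trace(model.cpu().eval(), trace_data)
    torch.jit.save(trace_model, 'resnet_float.pt')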
    

How It Works

Before you begin, ensure that the NeuroPilot Converter Tool is installed.

Quant8 Conversion Process

  1. Generate Calibration Data:

    The following script creates a directory named data and generates 100 batches of random input data, each saved as a .npy file. This data is used for calibration during the quantization process.

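    import os
    import numpy as np

    # Create the output directory; exist_ok avoids an error when the script is re-run.
    os.makedirs('data', exist_ok=True)
    for i in range(100):
        # One NCHW float32 batch matching the model input shape.
        data = np.random.randn(1, 3, 224, 224).astype(np.float32)
        np.save('data/batch_{}.npy'.format(i), data)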
    
  2. Convert to Quantized TFLite Format:

    Use the following command to convert the model to a quantized TFLite format using the generated calibration data:

    mtk_pytorch_converter                                 \
        --input_script_module_file=resnet_float.pt        \
        --output_file=resnet_ptq_quant.tflite             \
        --input_shapes=1,3,224,224                        \
        --quantize=True                                   \
        --input_value_ranges=-1,1                         \
        --calibration_data_dir=data/                      \
        --calibration_data_regexp='batch_.*\.npy'         \
        --allow_incompatible_paddings_for_tflite_pooling=True
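
Before running the converter, it can help to check which file names the calibration pattern selects. The sketch below applies the same pattern with Python's `re` module to the names produced by the calibration script plus two unrelated names. It assumes the whole file name must match; the converter's exact matching semantics are not stated here.

```python
import re

# Pattern copied from the --calibration_data_regexp flag.
pattern = re.compile(r"batch_.*\.npy")

# File names produced by the calibration script, plus two unrelated names.
names = ["batch_{}.npy".format(i) for i in range(100)] + ["notes.txt", "batch_0.npz"]

# Assumption: the whole file name must match the pattern.
matched = [name for name in names if pattern.fullmatch(name)]
print(len(matched))
```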
    

FP32 Conversion Process

To convert the model to a non-quantized (FP32) TFLite format, use the following command:

mtk_pytorch_converter                                 \
    --input_script_module_file=resnet_float.pt        \
    --output_file=resnet_float.tflite                 \
    --input_shapes=1,3,224,224                        \
    --allow_incompatible_paddings_for_tflite_pooling=True
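
The same FP32 conversion can be driven from a Python script, which is convenient when the conversion is one step of a larger pipeline. This is a sketch that only reuses the flags shown in the command; it runs the converter only if `mtk_pytorch_converter` is found on the PATH.

```python
import shlex
import shutil
import subprocess

# Flags copied from the FP32 conversion command.
command = [
    "mtk_pytorch_converter",
    "--input_script_module_file=resnet_float.pt",
    "--output_file=resnet_float.tflite",
    "--input_shapes=1,3,224,224",
    "--allow_incompatible_paddings_for_tflite_pooling=True",
]

print(shlex.join(command))  # shell-quoted form, handy for logging

# Only run the conversion when the converter is actually installed.
if shutil.which(command[0]) is not None:
    subprocess.run(command, check=True)
```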

Model Details

General Information

Property               Value
---------------------  --------------
Category               Classification
Input Size             224x224
GFLOPS                 1.81
#Params (M)            11.68
Training Framework     PyTorch
Inference Framework    TFLite
Quant8 Model package   Download
Float32 Model package  Download
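
The parameter count gives a rough idea of how much weight storage each format needs. The estimate below is only a back-of-the-envelope sketch: it assumes 4 bytes per FP32 parameter and 1 byte per int8 parameter, and it ignores file metadata and any tensors that a quantized model keeps at higher precision (biases are typically one such case).

```python
# Parameter count from the General Information table, in millions.
params_millions = 11.68

BYTES_PER_FLOAT32 = 4
BYTES_PER_INT8 = 1


def weight_size_mb(params_m: float, bytes_per_param: int) -> float:
    """Approximate weight storage in megabytes (10**6 bytes)."""
    return params_m * bytes_per_param


fp32_mb = weight_size_mb(params_millions, BYTES_PER_FLOAT32)
int8_mb = weight_size_mb(params_millions, BYTES_PER_INT8)
print(f"FP32 weights: ~{fp32_mb:.2f} MB, int8 weights: ~{int8_mb:.2f} MB")
```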

Model Properties

Quant8

  • Format: TensorFlow Lite v3

  • Description: Exported by NeuroPilot converter v7.14.1+release

Inputs

Property            Value
------------------  -----------------------------
Name                x.2
Tensor              int8[1,3,224,224]
Identifier          23
Quantization        Linear
Quantization Range  -1.0039 ≤ 0.0078 * q ≤ 0.9961

Outputs

Property            Value
------------------  ------------------------------------
Name                383
Tensor              int8[1,1000]
Identifier          62
Quantization        Linear
Quantization Range  -4.4169 ≤ 0.0429 * (q + 25) ≤ 6.5182
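
The quantization ranges listed for the Quant8 tensors follow the linear (affine) mapping real = scale * (q - zero_point), with q an int8 value in [-128, 127]. As a sketch, the scale and zero point can be recovered from the listed minimum and maximum; the helper names `affine_params` and `dequantize` are illustrative and not part of any converter API.

```python
from typing import Tuple

INT8_MIN, INT8_MAX = -128, 127


def affine_params(real_min: float, real_max: float) -> Tuple[float, int]:
    """Derive (scale, zero_point) so that real = scale * (q - zero_point)."""
    scale = (real_max - real_min) / (INT8_MAX - INT8_MIN)
    zero_point = round(INT8_MIN - real_min / scale)
    return scale, zero_point


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map an int8 value back to its real value."""
    return scale * (q - zero_point)


in_scale, in_zp = affine_params(-1.0039, 0.9961)    # input tensor x.2
out_scale, out_zp = affine_params(-4.4169, 6.5182)  # output tensor 383
print(round(in_scale, 4), in_zp)    # 0.0078 0
print(round(out_scale, 4), out_zp)  # 0.0429 -25
```

A zero point of -25 is why the output range is written as 0.0429 * (q + 25).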

FP32

  • Format: TensorFlow Lite v3

  • Description: Exported by NeuroPilot converter v2.9.0

Inputs

Property    Value
----------  --------------------
Name        x.2
Tensor      float32[1,3,224,224]
Identifier  80

Outputs

Property    Value
----------  ---------------
Name        383
Tensor      float32[1,1000]
Identifier  5

Performance Benchmarks

ResNet-quant8

Run model (.tflite) 10 times  CPU (Thread:8)  GPU  ARMNN(GpuAcc)  ARMNN(CpuAcc)  Neuron Stable Delegate(APU)  APU(MDLA)  APU(VPU)
----------------------------  --------------  ---  -------------  -------------  ---------------------------  ---------  --------
G350                          N/A             N/A  N/A            N/A            N/A                          N/A        N/A
G510                          N/A             N/A  N/A            N/A            N/A                          2.79 ms    N/A
G700                          N/A             N/A  N/A            N/A            N/A                          2.03 ms    N/A
G1200                         N/A             N/A  N/A            N/A            N/A                          2.05 ms    N/A

ResNet-fp32

Run model (.tflite) 10 times  CPU (Thread:8)         GPU         ARMNN(GpuAcc)  ARMNN(CpuAcc)  Neuron Stable Delegate(APU)  APU(MDLA)  APU(VPU)
----------------------------  ---------------------  ----------  -------------  -------------  ---------------------------  ---------  ----------
G350                          255.551 ms (Thread:4)  178.460 ms  147.218 ms     112.225 ms     N/A                          N/A        846.724 ms
G510                          133.122 ms             55.277 ms   45.856 ms      42.608 ms      8.557 ms                     9.21 ms    N/A
G700                          61.470 ms              41.428 ms   36.064 ms      34.791 ms      6.226 ms                     6.04 ms    N/A
G1200                         56.593 ms              30.080 ms   22.070 ms      20.762 ms      7.981 ms                     8.04 ms    N/A

  • Widespread: CPU only, light workload.

  • Performance: CPU and GPU, medium workload.

  • Ultimate: CPU, GPU, and APUs, heavy workload.
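
To compare the two model variants, the APU(MDLA) latencies from the benchmark tables can be turned into speedup ratios. The short sketch below uses only the figures reported in those tables; G350 is left out because it has no APU(MDLA) result.

```python
# APU(MDLA) latencies in milliseconds, copied from the two benchmark tables.
fp32_ms = {"G510": 9.21, "G700": 6.04, "G1200": 8.04}
quant8_ms = {"G510": 2.79, "G700": 2.03, "G1200": 2.05}

# Ratio of FP32 latency to Quant8 latency per platform.
speedup = {name: fp32_ms[name] / quant8_ms[name] for name in fp32_ms}
for name, ratio in speedup.items():
    print(f"{name}: quant8 is {ratio:.2f}x faster than fp32 on APU(MDLA)")
```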

Resources

GitHub