VGGFace Models

Overview

VGGFace is a deep convolutional neural network for face recognition. It follows the VGG architecture, stacking many layers of small (3x3) convolutional filters to capture fine facial detail. The network was trained on a large dataset of celebrity faces, and the version converted here outputs scores over 2,622 identities. It is commonly used for face recognition, face verification, and feature extraction in both research and industry.

Getting Started

Follow these steps to set up and convert the VGGFace model using PyTorch.

Clone the VGGFace PyTorch Repository:

git clone https://github.com/prlz77/vgg-face.pytorch.git

Download and Extract the Pretrained VGGFace Model:

Download the pretrained VGGFace model from the following link:
```
wget https://www.robots.ox.ac.uk/~vgg/software/vgg_face/src/vgg_face_torch.tar.gz
```
Extract the downloaded tar file:
```
tar zxvf vgg_face_torch.tar.gz
```
Move the extracted files to the pretrained directory in the cloned repository:
```
mv vgg_face_torch/* vgg-face.pytorch/pretrained
```

Modify the VGGFace Model Script:

Open the model script in a text editor:

gedit models/vgg_face.py

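Add the following lines after the point in the script where `model` has been loaded and `im` holds the input tensor. They trace the model and save it as a TorchScript file, so running the script writes vggface_traced_model.pt to the current directory: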

traced_model = torch.jit.trace(model, im)
traced_model.save("vggface_traced_model.pt")
print("Traced model saved successfully.")
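
The tracing step can be tried in isolation before editing the repository script. The following is a minimal sketch, not the repository's code: `TinyNet` is a hypothetical stand-in that only mimics the VGGFace input and output shapes, and the file name `traced_example.pt` is likewise an assumption. It shows the model being put in evaluation mode before tracing, then the saved file being reloaded to confirm it produces the same output.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for VGGFace (same input and output shapes only)."""

    def __init__(self):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(3, 2622)

    def forward(self, x):
        x = self.pool(x).flatten(1)        # [N, 3, 224, 224] -> [N, 3]
        return self.fc(self.dropout(x))    # [N, 3] -> [N, 2622]


model = TinyNet().eval()                   # eval() disables dropout before tracing
im = torch.randn(1, 3, 224, 224)           # example input with the expected shape

with torch.no_grad():
    traced_model = torch.jit.trace(model, im)

# Save to a temporary directory so the sketch does not touch the working tree.
path = os.path.join(tempfile.mkdtemp(), "traced_example.pt")
traced_model.save(path)

# Reload the TorchScript file and run it on the same input.
reloaded = torch.jit.load(path)
with torch.no_grad():
    out = reloaded(im)
    reference = model(im)

print(tuple(out.shape), torch.allclose(out, reference))
```

Tracing records the operations executed for the example input, so the example tensor should have the same shape that is later passed to the converter through `--input_shapes`.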

How It Works

Before you begin, ensure that the NeuroPilot Converter Tool is installed.

Quant8 Conversion Process

Generate Calibration Data:

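The following script creates a directory named data and generates 100 batches of random input data, each saved as a .npy file. This data is used for calibration during the quantization process. Random data is only a placeholder; calibrating with representative, preprocessed face images generally yields more accurate quantization ranges.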
```
import os
import numpy as np

os.makedirs('data', exist_ok=True)  # do not fail if the directory already exists
for i in range(100):
    data = np.random.randn(1, 3, 224, 224).astype(np.float32)
    np.save('data/batch_{}.npy'.format(i), data)
```

Convert to Quantized TFLite Format:

Use the following command to convert the model to a quantized TFLite format using the generated calibration data:

mtk_pytorch_converter                                 \
    --input_script_module_file=vggface_traced_model.pt    \
    --output_file=vggface_traced_model_ptq_quant.tflite   \
    --input_shapes=1,3,224,224                            \
    --quantize=True                                       \
    --input_value_ranges=-1,1                             \
    --calibration_data_dir=data/                          \
    --calibration_data_regexp='batch_.*\.npy'
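
Before running the converter, it can help to confirm that the calibration pattern selects the intended files. The sketch below assumes the converter applies the pattern to file names with Python-style regular expression semantics, which is not confirmed here; the two extra file names are hypothetical and only serve as negative cases.

```python
import re

# Pattern passed to --calibration_data_regexp in the command above.
pattern = re.compile(r"batch_.*\.npy")

# File names as written by the calibration script, plus two that should not match.
candidates = ["batch_{}.npy".format(i) for i in range(100)]
candidates += ["notes.txt", "batch_0.npz"]

# fullmatch requires the whole file name to fit the pattern.
matched = [name for name in candidates if pattern.fullmatch(name)]

print(len(matched))  # number of files the pattern would select
```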

FP32 Conversion Process

To convert the model to a non-quantized (FP32) TFLite format, use the following command:

mtk_pytorch_converter                                 \
    --input_script_module_file=vggface_traced_model.pt \
    --output_file=vggface_traced_model.tflite          \
    --input_shapes=1,3,224,224

Model Details

General Information

Property	Value
Category	Recognition
Input Size	224x224
#MACs (G)	None
#Params (M)	None
Training Framework	PyTorch
Inference Framework	TFLite
Model package	Download

Model Properties

Quant8

Format: TensorFlow Lite v3
Description: Exported by NeuroPilot converter v7.14.1+release

Inputs

Property	Value
Name	x.1
Tensor	int8[1,3,224,224]
Identifier	10
Quantization	Linear
Quantization Range	-1.0039 ≤ 0.0078 * q ≤ 0.9961

Outputs

Property	Value
Name	238
Tensor	int8[1,2622]
Identifier	52
Quantization	Linear
Quantization Range	-0.0163 ≤ 0.0002 * (q + 30) ≤ 0.0261
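
The quantization ranges in the tables above follow the usual TFLite affine mapping, real = scale * (q - zero_point). The sketch below recovers the scale and zero point from the printed range endpoints, assuming int8 values in [-128, 127]. The tables show rounded scales (0.0078 and 0.0002), so the recovered values are approximations rather than the exact parameters stored in the model.

```python
def affine_params(real_min, real_max, qmin=-128, qmax=127):
    """Recover (scale, zero_point) from a real-valued range, assuming int8 affine quantization."""
    scale = (real_max - real_min) / (qmax - qmin)
    zero_point = round(qmin - real_min / scale)
    return scale, zero_point


def dequantize(q, scale, zero_point):
    """Map a quantized value back to a real value: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Range endpoints copied from the Inputs and Outputs tables above.
in_scale, in_zp = affine_params(-1.0039, 0.9961)
out_scale, out_zp = affine_params(-0.0163, 0.0261)

print("input : scale={:.4f} zero_point={}".format(in_scale, in_zp))
print("output: scale={:.4f} zero_point={}".format(out_scale, out_zp))

# The extreme int8 values map back to the range endpoints.
print(round(dequantize(-128, in_scale, in_zp), 4), round(dequantize(127, in_scale, in_zp), 4))
print(round(dequantize(-128, out_scale, out_zp), 4), round(dequantize(127, out_scale, out_zp), 4))
```

A zero point of -30 for the output is what appears in the table as `(q + 30)`, since q - (-30) = q + 30.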

FP32

Format: TensorFlow Lite v3
Description: Exported by NeuroPilot converter v7.14.1+release

Inputs

Property	Value
Name	x.1
Tensor	float32[1,3,224,224]
Identifier	16

Outputs

Property	Value
Name	238
Tensor	float32[1,2622]
Identifier	46

Performance Benchmarks

VGGFace-quant8

Platform (.tflite model run 10 times)	CPU (Thread:8)	GPU	ARMNN(GpuAcc)	ARMNN(CpuAcc)	Neuron Stable Delegate(APU)	APU(MDLA)	APU(VPU)
G350	N/A	N/A	N/A	N/A	N/A	N/A	N/A
G510	N/A	N/A	N/A	N/A	N/A	25.04 ms	N/A
G700	N/A	N/A	N/A	N/A	N/A	17.04 ms	N/A
G1200	N/A	N/A	N/A	N/A	N/A	24.04 ms	N/A

VGGFace-fp32

Platform (.tflite model run 10 times)	CPU (Thread:8)	GPU	ARMNN(GpuAcc)	ARMNN(CpuAcc)	Neuron Stable Delegate(APU)	APU(MDLA)	APU(VPU)
G350	2208.66 ms (Thread:4)	853.617 ms	N/A	N/A	N/A	N/A	6489.54 ms
G510	674.378 ms	295.808 ms	222.366 ms	236.165 ms	81.971 ms	81.7 ms	N/A
G700	457.396 ms	232.558 ms	168.644 ms	193.350 ms	56.405 ms	56.04 ms	N/A
G1200	385.180 ms	133.887 ms	97.913 ms	113.238 ms	49.548 ms	49.05 ms	N/A

Widespread: CPU only, light workload.
Performance: CPU and GPU, medium workload.
Ultimate: CPU, GPU, and APUs, heavy workload.

Resources

GitHub