YOLOX_s Models

Overview

YOLOX is an anchor-free evolution of the YOLO model, designed to offer a streamlined architecture while delivering enhanced performance. It aims to bridge advancements in research with practical applications in the industry.

Getting Started

Follow these steps to set up, download, and convert the YOLOX-S model using PyTorch.

Clone the YOLOX Repository:

Clone the YOLOX repository from GitHub:
```
git clone https://github.com/Megvii-BaseDetection/YOLOX.git
```

Download the YOLOX-S PyTorch Model:

Download the pretrained YOLOX-S checkpoint with the following commands:

```
cd YOLOX
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth
```

Export the YOLOX-S Model to TorchScript:

Run the export script to convert the YOLOX-S model to TorchScript format:
```
python3.8 tools/export_torchscript.py -n yolox_s -c yolox_s.pth
```

How It Works

Before you begin, ensure that the NeuroPilot Converter Tool is installed.

Quant8 Conversion Process

The following script demonstrates how to convert the YOLOX model to a quantized TFLite format using the NeuroPilot Converter Tool:

Data Generation: A generator function creates random input data, which is used for calibration during the quantization process.
Model Loading: The YOLOX model is loaded from a TorchScript file.
Quantization: The model is set up for quantization, specifying input value ranges and using the generated calibration data.
Conversion: The quantized model is converted to TFLite format and saved as yolox_s_quant.tflite.

```
import mtk_converter
import numpy as np

def data_gen():
    # Yield 100 random calibration batches, one list entry per model input.
    for _ in range(100):
        yield [np.random.randn(1, 3, 640, 640).astype(np.float32)]

# Load the TorchScript model and declare its input shape (NCHW).
converter = mtk_converter.PyTorchConverter.from_script_module_file(
    'yolox.torchscript.pt', [[1, 3, 640, 640]],
)
# Enable quantization and calibrate with the generated data.
converter.quantize = True
converter.input_value_ranges = [(-1.0, 1.0)]
converter.calibration_data_gen = data_gen
_ = converter.convert_to_tflite(output_file='yolox_s_quant.tflite')
```
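The script above calibrates with random data. Post-training quantization generally benefits from representative inputs, so the following is a minimal sketch of a drop-in generator built from preprocessed samples. The helper `make_calibration_gen` and the `example_samples` list are hypothetical and not part of the NeuroPilot tooling; the sketch assumes each sample has already been preprocessed to the same NCHW shape as the model input.

```python
import numpy as np

INPUT_SHAPE = (1, 3, 640, 640)  # NCHW shape used by the conversion script

def make_calibration_gen(samples, input_shape=INPUT_SHAPE):
    """Return a zero-argument generator function over preprocessed samples.

    `samples` is a hypothetical list of arrays that were preprocessed the
    same way as inference inputs (assumption, not from the original script).
    """
    def data_gen():
        for sample in samples:
            batch = np.asarray(sample, dtype=np.float32)
            if batch.shape != tuple(input_shape):
                raise ValueError(
                    f"expected shape {tuple(input_shape)}, got {batch.shape}"
                )
            # One list entry per model input, as in the random generator.
            yield [batch]
    return data_gen

# Stand-in data so the sketch runs on its own; replace with real images.
example_samples = [np.zeros(INPUT_SHAPE, dtype=np.float32) for _ in range(4)]
data_gen = make_calibration_gen(example_samples)
```

The resulting `data_gen` has the same form as the random generator, so it could be assigned to `converter.calibration_data_gen` in its place.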

FP32 Conversion Process

The following script demonstrates how to convert the YOLOX model to a non-quantized (FP32) TFLite format:

Data Generation: As in the quantization process, a generator function creates random input data for the conversion.
Model Loading: The YOLOX model is loaded from a TorchScript file.
Conversion: The model is converted to TFLite format without applying quantization, and the output is saved as yolox_s.tflite.

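```
import mtk_converter
import numpy as np

def data_gen():
    # Yield 100 random input batches, one list entry per model input.
    for _ in range(100):
        yield [np.random.randn(1, 3, 640, 640).astype(np.float32)]

# Load the TorchScript model and declare its input shape (NCHW).
converter = mtk_converter.PyTorchConverter.from_script_module_file(
    'yolox.torchscript.pt', [[1, 3, 640, 640]],
)
# Quantization is not enabled here, so the model is converted as FP32.
converter.input_value_ranges = [(-1.0, 1.0)]
converter.calibration_data_gen = data_gen
_ = converter.convert_to_tflite(output_file='yolox_s.tflite')
```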

Model Details

General Information

Property	Value
Category	Object Detection
Input Size	640x640
FLOPs (G)	26.8
#Params (M)	9.0
Training Framework	PyTorch
Inference Framework	TFLite
Quant8 Model package	Download
Float32 Model package	Download

Model Properties

Quant8

Format: TensorFlow Lite v3
Description: Exported by NeuroPilot converter v7.14.1+release

Inputs

Property	Value
Name	x.2
Tensor	int8[1,3,640,640]
Identifier	318
Quantization	Linear
Quantization Range	-1.0039 ≤ 0.0078 * q ≤ 0.9961

Outputs

Property	Value
Name	1360
Tensor	int8[1,8400,85]
Identifier	98
Quantization	Linear
Quantization Range	-2.1799 ≤ 0.0227 * (q + 32) ≤ 3.6106
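
The quantization ranges listed above follow the usual linear (affine) int8 mapping, real = scale * (q - zero_point). As a rough check, the sketch below recovers an approximate scale and zero point from the range endpoints in the tables, assuming q spans the full int8 range [-128, 127]. The helper names are illustrative and not part of the NeuroPilot tooling.

```python
QMIN, QMAX = -128, 127  # assumed full int8 range

def affine_params(real_min, real_max):
    """Estimate (scale, zero_point) from the real-valued range endpoints."""
    scale = (real_max - real_min) / (QMAX - QMIN)
    zero_point = round(QMIN - real_min / scale)
    return scale, zero_point

def dequantize(q, scale, zero_point):
    """Map an int8 value back to its approximate real value."""
    return scale * (q - zero_point)

# Range endpoints copied from the Quant8 input and output tables.
in_scale, in_zp = affine_params(-1.0039, 0.9961)
out_scale, out_zp = affine_params(-2.1799, 3.6106)
print(f"input:  scale={in_scale:.4f}, zero_point={in_zp}")
print(f"output: scale={out_scale:.4f}, zero_point={out_zp}")
```

Under these assumptions the sketch yields scales of about 0.0078 and 0.0227 with zero points of 0 and -32, consistent with the expressions `0.0078 * q` and `0.0227 * (q + 32)` in the tables.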

FP32

Format: TensorFlow Lite v3
Description: Exported by NeuroPilot converter v7.14.1+release

Inputs

Property	Value
Name	x.2
Tensor	float32[1,3,640,640]
Identifier	111

Outputs

Property	Value
Name	1360
Tensor	float32[1,8400,85]
Identifier	53
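
The output shape [1, 8400, 85] listed in the tables above can be reproduced with simple arithmetic. The sketch below assumes the standard YOLOX head configuration, which is not stated in this document: prediction grids at strides 8, 16, and 32, one prediction per grid cell, and 80 COCO classes.

```python
INPUT_SIZE = 640
STRIDES = (8, 16, 32)  # assumed YOLOX head strides
NUM_CLASSES = 80       # assumed COCO class count

def num_predictions(input_size, strides):
    """Total grid cells across all head levels (one prediction per cell)."""
    return sum((input_size // s) ** 2 for s in strides)

def values_per_prediction(num_classes):
    """4 box values + 1 objectness score + one score per class."""
    return 4 + 1 + num_classes

shape = (
    1,
    num_predictions(INPUT_SIZE, STRIDES),
    values_per_prediction(NUM_CLASSES),
)
print(shape)  # (1, 8400, 85)
```

With a 640x640 input, the three grids are 80x80, 40x40, and 20x20, which gives 6400 + 1600 + 400 = 8400 predictions.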

Performance Benchmarks

YOLOX_s-quant8

Run model (.tflite) 10 times	CPU (Thread:8)	GPU	ARMNN(GpuAcc)	ARMNN(CpuAcc)	Neuron Stable Delegate(APU)	APU(MDLA)	APU(VPU)
G350	976.244 ms (Thread:4)	1160.05 ms	703.429 ms	694.960 ms	N/A	N/A	106683 ms
G510	378.448 ms	386.428 ms	217.080 ms	174.013 ms	N/A	22.31 ms	N/A
G700	165.861 ms	265.292 ms	150.151 ms	157.308 ms	14.723 ms	15.04 ms	N/A
G1200	159.647 ms	179.024 ms	98.150 ms	87.183 ms	24.276 ms	25.05 ms	N/A

YOLOX_s-fp32

Run model (.tflite) 10 times	CPU (Thread:8)	GPU	ARMNN(GpuAcc)	ARMNN(CpuAcc)	Neuron Stable Delegate(APU)	APU(MDLA)	APU(VPU)
G350	2027.11 ms (Thread:4)	1121.93 ms	1046.59 ms	N/A	N/A	N/A	7066.75 ms
G510	821.662 ms	358.816 ms	341.251 ms	424.459 ms	69.916 ms	62.12 ms	N/A
G700	423.192 ms	248.299 ms	234.901 ms	358.942 ms	51.868 ms	44.04 ms	N/A
G1200	395.406 ms	161.877 ms	150.498 ms	205.845 ms	57.381 ms	48.05 ms	N/A

Widespread: CPU only, light workload.
Performance: CPU and GPU, medium workload.
Ultimate: CPU, GPU, and APUs, heavy workload.
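
To make the two latency tables easier to compare, the sketch below computes the FP32-to-quant8 speedup and an approximate frames-per-second figure for the APU(MDLA) column. The latencies are copied from the benchmark tables above; G350 is omitted because its MDLA entries are N/A. The helper names are illustrative, and the FPS value assumes model latency is the only per-frame cost.

```python
# APU(MDLA) latencies in milliseconds, copied from the benchmark tables.
MDLA_MS = {
    "G510": {"quant8": 22.31, "fp32": 62.12},
    "G700": {"quant8": 15.04, "fp32": 44.04},
    "G1200": {"quant8": 25.05, "fp32": 48.05},
}

def speedup(fp32_ms, quant8_ms):
    """How many times faster the quant8 model runs than the FP32 model."""
    return fp32_ms / quant8_ms

def fps(latency_ms):
    """Frames per second if each frame costs exactly `latency_ms`."""
    return 1000.0 / latency_ms

summary = {
    board: (speedup(ms["fp32"], ms["quant8"]), fps(ms["quant8"]))
    for board, ms in MDLA_MS.items()
}
for board, (ratio, rate) in summary.items():
    print(f"{board}: quant8 is {ratio:.2f}x faster than FP32, about {rate:.1f} FPS")
```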

Resources

For more documentation on YOLOX, see the GitHub repository: https://github.com/Megvii-BaseDetection/YOLOX