I am using MQBench (Model Quantization Benchmark, http://mqbench.tech/) to quantize models for deployment.
MQBench is a benchmark and framework for evaluating quantization algorithms under real-world hardware deployments.
- Python 3.7+
- PyTorch == 1.8.1
Before running this repository, you should install MQBench. Note that the MQBench version used here is 0.0.2.
```bash
git clone https://github.com/ZLkanyo009/MQBench.git
cd MQBench
python setup.py build
python setup.py install
```
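If the build succeeds, a quick check like the one below confirms that MQBench is importable. This is only a sketch: it assumes the 0.0.2 package layout, where `BackendType` lives in `mqbench.prepare_by_platform`.

```python
# Optional sanity check that the MQBench install is visible to Python.
# Assumption: the 0.0.2 layout, with BackendType in mqbench.prepare_by_platform.
import mqbench
from mqbench.prepare_by_platform import BackendType

print("MQBench loaded from:", mqbench.__file__)
print("Supported backends:", [b.name for b in BackendType])
```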
```bash
# Start training the fp32 model with:
# model_name can be ResNet18, MobileNet, ...
python main.py model_name

# You can manually configure the training with:
python main.py --resume --lr=0.01
```
```bash
# Start training the quantized model with:
# model_name can be ResNet18, MobileNet, ...
python main.py model_name --quantize

# You can manually configure the training with (DataParallel):
python main.py --resume --parallel DP --BackendType Tensorrt --quantize

# Or launch DistributedDataParallel (DDP) training with:
python -m torch.distributed.launch main.py --local_rank 0 --parallel DDP --resume --BackendType Tensorrt --quantize
```
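For reference, the `--quantize` path roughly follows the standard MQBench workflow: trace the fp32 model, insert fake-quantize nodes for the chosen backend, calibrate the observers, run quantization-aware training, and export a deployable model. The sketch below illustrates that flow with MQBench 0.0.2's public API; it is not the exact code in main.py, and torchvision's resnet18 and random tensors stand in for this repository's models and real data batches.

```python
# A minimal sketch of the MQBench 0.0.2 quantization flow (not this repo's exact code).
# torchvision's resnet18 stands in for the local models; random tensors stand in
# for real calibration/training batches.
import torch
import torchvision.models as models

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization
from mqbench.convert_deploy import convert_deploy

model = models.resnet18(pretrained=False)
model.train()

# Trace the model with torch.fx and insert fake-quantize nodes
# matching the target backend (TensorRT here).
model = prepare_by_platform(model, BackendType.Tensorrt)

# Calibration: let the observers collect activation statistics
# on a few batches before enabling fake quantization.
model.eval()
enable_calibration(model)
with torch.no_grad():
    model(torch.randn(8, 3, 224, 224))

# Quantization-aware training/evaluation: fake-quantize nodes are now active.
enable_quantization(model)
model(torch.randn(8, 3, 224, 224))

# Export for deployment: remove fake-quantize nodes and dump the ONNX model
# plus quantization parameters (e.g. clip ranges) for TensorRT.
convert_deploy(model, BackendType.Tensorrt, {'data': [8, 3, 224, 224]})
```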
Model | Acc. (fp32) | Acc. (TensorRT) |
---|---|---|
VGG16 | 79.90% | 78.95% |
GoogleNet | 90.20% | 89.42% |
ResNet18 | 95.43% | 95.44% |
RegNetX_200MF | 89.47% | 89.22% |
SENet18 | 91.69% | 91.34% |
MobileNetV2 | 88.42% | 87.65% |
ResNeXt29(2x64d) | 87.07% | 86.95% |
SimpleDLA | 90.24% | 89.45% |
DenseNet121 | 85.18% | 85.10% |
PreActResNet18 | 92.06% | 91.68% |