The tables below list the models enabled by the Intel® Low Precision Optimization Tool, with INT8 accuracy and performance relative to the FP32 baseline.

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
tensorflow | 2.4.0 | resnet50v1.0 | ImageNet | 73.80% | 74.30% | -0.67% | 3.49x |
tensorflow | 2.4.0 | resnet50v1.5 | ImageNet | 76.70% | 76.50% | 0.26% | 3.23x |
tensorflow | 2.4.0 | resnet101 | ImageNet | 77.20% | 76.40% | 1.05% | 2.42x |
tensorflow | 2.4.0 | inception_v1 | ImageNet | 70.10% | 69.70% | 0.57% | 1.88x |
tensorflow | 2.4.0 | inception_v2 | ImageNet | 74.10% | 74.00% | 0.14% | 1.96x |
tensorflow | 2.4.0 | inception_v3 | ImageNet | 77.20% | 76.70% | 0.65% | 2.36x |
tensorflow | 2.4.0 | inception_v4 | ImageNet | 80.00% | 80.30% | -0.37% | 2.59x |
tensorflow | 2.4.0 | inception_resnet_v2 | ImageNet | 80.10% | 80.40% | -0.37% | 1.97x |
tensorflow | 2.4.0 | mobilenetv1 | ImageNet | 71.10% | 71.00% | 0.14% | 2.88x |
tensorflow | 2.4.0 | mobilenetv2 | ImageNet | 70.80% | 71.80% | -1.39% | 1.60x |
tensorflow | 2.4.0 | ssd_resnet50_v1 | Coco2017 | 37.90% | 38.00% | -0.26% | 2.97x |
tensorflow | 2.4.0 | mask_rcnn_inception_v2 | Coco2017 | 28.90% | 29.10% | -0.69% | 2.66x |
tensorflow | 2.4.0 | wide_deep_large_ds | criteo-kaggle | 77.61% | 77.67% | -0.08% | 1.42x |
tensorflow | 2.4.0 | vgg16 | ImageNet | 72.50% | 70.90% | 2.26% | 3.75x |
tensorflow | 2.4.0 | vgg19 | ImageNet | 72.40% | 71.00% | 1.97% | 3.79x |
tensorflow | 2.4.0 | resnetv2_50 | ImageNet | 70.30% | 69.60% | 1.01% | 1.38x |
tensorflow | 2.4.0 | resnetv2_101 | ImageNet | 72.50% | 71.90% | 0.83% | 1.44x |
tensorflow | 2.4.0 | resnetv2_152 | ImageNet | 72.60% | 72.40% | 0.28% | 1.53x |
tensorflow | 2.4.0 | densenet121 | ImageNet | 72.60% | 72.90% | -0.41% | 1.49x |
tensorflow | 2.4.0 | densenet161 | ImageNet | 76.10% | 76.30% | -0.26% | 1.64x |
tensorflow | 2.4.0 | densenet169 | ImageNet | 74.20% | 74.60% | -0.54% | 1.47x |

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
tensorflow | 1.15UP2 | resnet_v1_50_slim | ImageNet | 76.30% | 75.20% | 1.46% | 2.89x |
tensorflow | 1.15UP2 | resnet_v1_101_slim | ImageNet | 77.10% | 76.40% | 0.92% | 3.25x |
tensorflow | 1.15UP2 | resnet_v1_152_slim | ImageNet | 77.40% | 76.80% | 0.78% | 3.51x |
tensorflow | 1.15UP2 | inception_v1_slim | ImageNet | 70.10% | 69.80% | 0.43% | 1.79x |
tensorflow | 1.15UP2 | inception_v2_slim | ImageNet | 74.10% | 74.00% | 0.14% | 1.95x |
tensorflow | 1.15UP2 | inception_v3_slim | ImageNet | 78.10% | 78.00% | 0.13% | 2.48x |
tensorflow | 1.15UP2 | inception_v4_slim | ImageNet | 79.90% | 80.20% | -0.37% | 2.78x |
tensorflow | 1.15UP2 | vgg16_slim | ImageNet | 72.50% | 70.90% | 2.26% | 3.73x |
tensorflow | 1.15UP2 | vgg19_slim | ImageNet | 72.40% | 71.00% | 1.97% | 3.82x |
tensorflow | 1.15UP2 | resnetv2_50_slim | ImageNet | 70.30% | 69.70% | 0.86% | 1.38x |
tensorflow | 1.15UP2 | resnetv2_101_slim | ImageNet | 72.30% | 71.90% | 0.56% | 1.50x |
tensorflow | 1.15UP2 | resnetv2_152_slim | ImageNet | 72.60% | 72.40% | 0.28% | 1.57x |
tensorflow | 1.15UP2 | bert | SQUAD | 92.33% | 92.98% | -0.69% | 2.89x |

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
pytorch | 1.5.0+cpu | resnet18 | ImageNet | 69.60% | 69.76% | -0.22% | 1.76x |
pytorch | 1.5.0+cpu | resnet50 | ImageNet | 75.96% | 76.13% | -0.23% | 2.63x |
pytorch | 1.5.0+cpu | resnext101_32x8d | ImageNet | 79.12% | 79.31% | -0.24% | 2.61x |
pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | MRPC | 88.90% | 88.73% | 0.19% | 1.98x |
pytorch | 1.6.0a0+24aac32 | bert_base_cola | COLA | 59.06% | 58.84% | 0.37% | 2.19x |
pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | STS-B | 88.40% | 89.27% | -0.97% | 2.28x |
pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | SST-2 | 91.51% | 91.86% | -0.37% | 2.30x |
pytorch | 1.6.0a0+24aac32 | bert_base_rte | RTE | 69.31% | 69.68% | -0.52% | 2.16x |
pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | MRPC | 87.45% | 88.33% | -0.99% | 2.63x |
pytorch | 1.6.0a0+24aac32 | bert_large_squad | SQUAD | 92.85% | 93.05% | -0.21% | 2.01x |
pytorch | 1.6.0a0+24aac32 | bert_large_qnli | QNLI | 91.20% | 91.82% | -0.68% | 2.69x |
pytorch | 1.6.0a0+24aac32 | bert_large_rte | RTE | 71.84% | 72.56% | -0.99% | 1.36x |
pytorch | 1.6.0a0+24aac32 | bert_large_cola | COLA | 62.74% | 62.57% | 0.27% | 2.74x |
pytorch | 1.5.0+cpu | dlrm | CriteoTerabyte | 80.27% | 80.27% | 0.00% | 1.03x |
pytorch | 1.5.0+cpu | inception_v3 | ImageNet | 69.42% | 69.54% | -0.17% | 1.84x |
pytorch | 1.5.0+cpu | peleenet | ImageNet | 71.59% | 72.08% | -0.68% | 1.28x |
pytorch | 1.5.0+cpu | yolo_v3 | Coco2017 | 24.42% | 24.54% | -0.51% | 1.64x |
pytorch | 1.5.0+cpu | se_resnext50_32x4d | ImageNet | 79.04% | 79.08% | -0.05% | 1.73x |
pytorch | 1.5.0+cpu | mobilenet_v2 | ImageNet | 70.63% | 71.86% | -1.70% | 1.60x |
pytorch | 1.5.0+cpu | gpt_wikitext | WIKI Text | 60.06% | 60.20% | -0.23% | 1.15x |
pytorch | 1.5.0+cpu | roberta_base_mrpc | MRPC | 85.08% | 85.51% | -0.51% | 2.12x |
pytorch | 1.5.0+cpu | camembert_base_mrpc | MRPC | 83.57% | 84.22% | -0.77% | 2.16x |
pytorch | 1.6.0+cpu | blendcnn | MRPC | 68.40% | 68.40% | 0.00% | 1.50x |
pytorch | ipex | resnet50_ipex | ImageNet | 75.80% | 76.13% | -0.44% | 1.66x |

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
pytorch | 1.5.0+cpu | resnet18_qat | ImageNet | 69.76% | 69.76% | 0.01% | 1.76x |
pytorch | 1.5.0+cpu | resnet50_qat | ImageNet | 76.37% | 76.13% | 0.32% | 2.67x |

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
mxnet | 1.7.0 | resnet50v1 | ImageNet | 76.03% | 76.33% | -0.39% | 3.13x |
mxnet | 1.7.0 | inceptionv3 | ImageNet | 77.80% | 0.21% | 2.77x | |
mxnet | 1.7.0 | mobilenet1.0 | ImageNet | 71.71% | 72.22% | -0.71% | 2.38x |
mxnet | 1.7.0 | mobilenetv2_1.0 | ImageNet | 70.77% | 70.87% | -0.14% | 2.67x |
mxnet | 1.7.0 | resnet18_v1 | ImageNet | 70.00% | 70.14% | -0.21% | 3.13x |
mxnet | 1.7.0 | squeezenet1.0 | ImageNet | 56.89% | 56.96% | -0.13% | 2.63x |
mxnet | 1.7.0 | ssd-mobilenet1.0 | VOC | 74.94% | 75.54% | -0.79% | 3.74x |
mxnet | 1.7.0 | resnet152_v1 | ImageNet | 78.31% | 78.54% | -0.29% | 3.14x |

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] |
---|---|---|---|---|---|---|
ONNX RT | 1.6.0 (opset11+) | resnet50_v1_5 | ImageNet | 73.60% | 74.00% | -0.54% |
ONNX RT | 1.6.0 (opset11+) | vgg16 | ImageNet | 68.86% | 69.44% | -0.84% |
ONNX RT | 1.6.0 (opset11+) | bert_base_mrpc | MRPC | 85.29% | 86.03% | -0.85% |
ONNX RT | 1.6.0 (opset11+) | MobileBERT | MRPC | 86.03% | 86.27% | -0.28% |
ONNX RT | 1.6.0 (opset11+) | RoBERTa | MRPC | 88.73% | 89.46% | -0.82% |
ONNX RT | 1.6.0 (opset11+) | DistilBERT | MRPC | 85.05% | 84.56% | 0.58% |
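
The two derived columns follow directly from the formulas in the table headers. As a minimal sketch (the function names `acc_ratio` and `latency_ratio` are illustrative and not part of the tool's API, and the latency values are hypothetical), they could be computed like this:

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, returned as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def latency_ratio(fp32_latency: float, int8_latency: float) -> float:
    """Speed-up factor, FP32 latency divided by INT8 latency."""
    return fp32_latency / int8_latency


# Accuracies taken from the tensorflow 2.4.0 resnet50v1.0 row.
ratio = acc_ratio(int8_acc=73.80, fp32_acc=74.30)
print(f"{ratio:.2%}")  # -0.67%

# Hypothetical latencies in milliseconds, for illustration only;
# the tables report the resulting ratio, not the raw latencies.
speedup = latency_ratio(fp32_latency=10.0, int8_latency=4.0)
print(f"{speedup:.2f}x")  # 2.50x
```

A negative Acc Ratio means the INT8 model is less accurate than the FP32 baseline, and a latency ratio above 1x means the INT8 model is faster. Small differences in the last digit against the tables are likely because the published accuracies are rounded.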