The Intel® LPOT library is released as part of the Intel® oneAPI AI Analytics Toolkit (AI Kit). The AI Kit provides a consolidated package of Intel's latest deep learning and machine learning optimizations all in one place for ease of development. Along with LPOT, the AI Kit includes Intel-optimized versions of deep learning frameworks (such as TensorFlow and PyTorch) and high-performance Python libraries to streamline end-to-end data science and AI workflows on Intel architectures.
You can install just the LPOT library from binary or source (the commands follow the prerequisites below), or you can get the Intel-optimized frameworks together with the LPOT library by installing the Intel® oneAPI AI Analytics Toolkit.
The AI Kit, which includes the LPOT library, is distributed through many common channels, including Intel's website, YUM, APT, Anaconda, and more. Select and download the AI Kit distribution package that is best suited for you and follow the Get Started Guide for post-installation instructions.
Download AI Kit | AI Kit Get Started Guide |
---|---|
Prerequisites
The following prerequisites and requirements must be satisfied for a successful installation:
- Python version: 3.6, 3.7, or 3.8
- Download and install Anaconda.
- Create a virtual environment named lpot in Anaconda:

# Here we install Python 3.7, for instance; you can also choose Python 3.6 or 3.8.
conda create -n lpot python=3.7
conda activate lpot
# install from pip
pip install lpot
# install from conda
conda install lpot -c conda-forge -c intel
# install from source
git clone https://github.com/intel/lpot.git
cd lpot
pip install -r requirements.txt
python setup.py install
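Whichever route you choose, a quick import serves as a post-install smoke test. This is a minimal sketch: a __version__ attribute is a common Python packaging convention but is only assumed here, so the code falls back to reporting a successful import.

# Post-install smoke test. The __version__ attribute is assumed, not guaranteed;
# a successful import alone confirms the package is on the path.
import lpot
print(getattr(lpot, "__version__", "lpot imported successfully"))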
Read the following resources to learn how to use LPOT.
The Tutorial provides comprehensive instructions, with examples, on how to use the features of the Intel® Low Precision Optimization Tool.
Examples demonstrate the usage of the Intel® Low Precision Optimization Tool in different frameworks: TensorFlow, PyTorch, MXNet, and ONNX Runtime. Hello World examples are also available.
View the LPOT Documentation for getting-started, deep-dive, and advanced resources to help you use and develop LPOT.
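As a concrete taste of the workflow that the Tutorial and the Hello World examples walk through, the sketch below shows the general shape of a post-training quantization run. It is a minimal sketch rather than a definitive recipe: the paths ./conf.yaml and ./model.pb are hypothetical placeholders, and the YAML file is assumed to define the calibration dataloader and the accuracy criterion as described in the Tutorial.

# A minimal sketch of post-training quantization with LPOT, following the
# pattern used in the project's Hello World examples. Paths are hypothetical.
from lpot.experimental import Quantization, common

# conf.yaml (hypothetical) declares the framework, the calibration and
# evaluation dataloaders, and the accuracy-loss tolerance that drives tuning.
quantizer = Quantization('./conf.yaml')
quantizer.model = common.Model('./model.pb')  # an FP32 TensorFlow frozen graph

# The call runs calibration, quantization, and accuracy-driven tuning,
# returning the resulting low-precision model.
q_model = quantizer()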
The Intel® Low Precision Optimization Tool supports systems based on the Intel 64 architecture or compatible processors, and is specially optimized for the following CPUs:
- Intel Xeon Scalable processors (formerly code-named Skylake, Cascade Lake, and Cooper Lake)
- future Intel Xeon Scalable processors (code-named Sapphire Rapids)
The Intel® Low Precision Optimization Tool requires the pertinent Intel-optimized version of the framework you use: TensorFlow, PyTorch, or MXNet. The validated configurations are listed in the table below, followed by a version-check sketch.
Platform | OS | Python | Framework | Version |
---|---|---|---|---|
Cascade Lake, Cooper Lake, Skylake | CentOS 7.8, Ubuntu 18.04 | 3.6, 3.7, 3.8 | TensorFlow | 2.4.0, 2.3.0, 2.2.0, 2.1.0, 1.15.0 UP1, 1.15.0 UP2, 1.15.2 |
Cascade Lake, Cooper Lake, Skylake | CentOS 7.8, Ubuntu 18.04 | 3.6, 3.7, 3.8 | PyTorch | 1.5.0+cpu, 1.6.0+cpu, IPEX |
Cascade Lake, Cooper Lake, Skylake | CentOS 7.8, Ubuntu 18.04 | 3.6, 3.7, 3.8 | MXNet | 1.7.0, 1.6.0 |
Cascade Lake, Cooper Lake, Skylake | CentOS 7.8, Ubuntu 18.04 | 3.6, 3.7, 3.8 | ONNX Runtime | 1.6.0 |
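Because LPOT is validated only against the versions in this table, a runtime guard can surface mismatches early. The sketch below is illustrative rather than an official check; the version set is transcribed from the TensorFlow row above, and Intel-optimized builds may report version strings that differ from these labels.

# Hypothetical sanity check: warn if the installed TensorFlow build is not one
# of the versions validated in the table above.
import warnings
import tensorflow as tf

VALIDATED_TF = {"2.4.0", "2.3.0", "2.2.0", "2.1.0", "1.15.2"}  # the 1.15.0 UP1/UP2 Intel builds may report differently

if tf.__version__ not in VALIDATED_TF:
    warnings.warn("TensorFlow %s has not been validated with LPOT" % tf.__version__)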
The Intel® Low Precision Optimization Tool provides numerous examples that demonstrate minimal accuracy loss alongside substantial performance gains. A full list of quantized models across the supported frameworks is available in the Model List; a short worked example of how the ratio columns are computed follows the tables.
Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
tensorflow | 2.4.0 | resnet50v1.5 | ImageNet | 76.70% | 76.50% | 0.26% | 3.23x |
tensorflow | 2.4.0 | Resnet101 | ImageNet | 77.20% | 76.40% | 1.05% | 2.42x |
tensorflow | 2.4.0 | inception_v1 | ImageNet | 70.10% | 69.70% | 0.57% | 1.88x |
tensorflow | 2.4.0 | inception_v2 | ImageNet | 74.10% | 74.00% | 0.14% | 1.96x |
tensorflow | 2.4.0 | inception_v3 | ImageNet | 77.20% | 76.70% | 0.65% | 2.36x |
tensorflow | 2.4.0 | inception_v4 | ImageNet | 80.00% | 80.30% | -0.37% | 2.59x |
tensorflow | 2.4.0 | inception_resnet_v2 | ImageNet | 80.10% | 80.40% | -0.37% | 1.97x |
tensorflow | 2.4.0 | Mobilenetv1 | ImageNet | 71.10% | 71.00% | 0.14% | 2.88x |
tensorflow | 2.4.0 | ssd_resnet50_v1 | Coco | 37.90% | 38.00% | -0.26% | 2.97x |
tensorflow | 2.4.0 | mask_rcnn_inception_v2 | Coco | 28.90% | 29.10% | -0.69% | 2.66x |
tensorflow | 2.4.0 | vgg16 | ImageNet | 72.50% | 70.90% | 2.26% | 3.75x |
tensorflow | 2.4.0 | vgg19 | ImageNet | 72.40% | 71.00% | 1.97% | 3.79x |
Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
---|---|---|---|---|---|---|---|
pytorch | 1.5.0+cpu | resnet50 | ImageNet | 75.96% | 76.13% | -0.23% | 2.63x |
pytorch | 1.5.0+cpu | resnext101_32x8d | ImageNet | 79.12% | 79.31% | -0.24% | 2.61x |
pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | MRPC | 88.90% | 88.73% | 0.19% | 1.98x |
pytorch | 1.6.0a0+24aac32 | bert_base_cola | COLA | 59.06% | 58.84% | 0.37% | 2.19x |
pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | STS-B | 88.40% | 89.27% | -0.97% | 2.28x |
pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | SST-2 | 91.51% | 91.86% | -0.37% | 2.30x |
pytorch | 1.6.0a0+24aac32 | bert_base_rte | RTE | 69.31% | 69.68% | -0.52% | 2.15x |
pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | MRPC | 87.45% | 88.33% | -0.99% | 2.73x |
pytorch | 1.6.0a0+24aac32 | bert_large_squad | SQUAD | 92.85% | 93.05% | -0.21% | 2.01x |
pytorch | 1.6.0a0+24aac32 | bert_large_qnli | QNLI | 91.20% | 91.82% | -0.68% | 2.69x |
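To make the two ratio columns concrete, the helper below reproduces their arithmetic exactly as defined in the table headers. This is a worked sketch; the example numbers are taken from the resnet50v1.5 row of the TensorFlow table.

# Reproduce the ratio columns from the tables above.
def acc_ratio(int8_acc, fp32_acc):
    # Relative accuracy change: (INT8 - FP32) / FP32
    return (int8_acc - fp32_acc) / fp32_acc

def latency_ratio(fp32_latency, int8_latency):
    # Realtime speedup: FP32 latency divided by INT8 latency
    return fp32_latency / int8_latency

# Example from the TensorFlow table: resnet50v1.5 on ImageNet.
print("%.2f%%" % (acc_ratio(76.70, 76.50) * 100))  # 0.26%, matching the Acc Ratio column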