AIMET Model Zoo: Highly accurate quantized AI models are now available

8-bit integer models using the AI Model Efficiency Toolkit

Jan 22, 2021

Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.

Making neural network models smaller is crucial for the widespread deployment of AI. Qualcomm AI Research has been developing state-of-the-art quantization techniques that enable power-efficient fixed-point inference while preserving model accuracy. These include Data Free Quantization (DFQ) and AdaRound, post-training techniques that achieve accurate 8-bit quantization without data.
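At its core, 8-bit quantization maps each FP32 tensor to INT8 values through a scale factor, then dequantizes back for simulated inference. The sketch below is purely illustrative (it is not AIMET's API): it simulates symmetric per-tensor INT8 quantize-dequantize of a weight matrix in NumPy and measures the resulting error.

```python
import numpy as np

def quantize_dequantize_int8(w: np.ndarray) -> np.ndarray:
    """Simulate symmetric per-tensor INT8 quantization of a weight tensor."""
    # Choose the scale so the largest-magnitude value maps to 127.
    scale = np.abs(w).max() / 127.0
    # Round to the nearest integer grid point and clamp to the INT8 range.
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    # Dequantize back to float for simulated fixed-point inference.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)
wq = quantize_dequantize_int8(weights)

# Worst-case rounding error is bounded by half the quantization step.
max_err = float(np.abs(weights - wq).max())
```

Techniques such as DFQ and AdaRound improve on this naive rounding: DFQ equalizes weight ranges across layers so a single per-tensor scale fits better, and AdaRound learns whether to round each weight up or down instead of always rounding to nearest.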

To make this research more accessible and contribute to the open-source community, Qualcomm Innovation Center (QuIC) launched the AI Model Efficiency Toolkit (AIMET) on GitHub in May 2020. AIMET's goal is to enable power-efficient integer inference by providing a simple library plugin that AI developers can use to achieve state-of-the-art model efficiency. The AIMET project is flourishing, with regularly updated quantization techniques based on work from Qualcomm AI Research and active use by the broader AI community, including multiple mobile OEMs, ISVs, and researchers in academia.

Leading quantization research is quickly being open sourced.


QuIC is now taking it a step further by contributing a collection of popular pre-trained models optimized for 8-bit inference to GitHub in the form of 'AIMET Model Zoo.' Together with the models, AIMET Model Zoo also provides the recipe for quantizing popular 32-bit floating point (FP32) models to 8-bit integer (INT8) models with little loss in accuracy. The tested and verified recipes include a script that optimizes TensorFlow or PyTorch models across a broad range of categories, from image classification, object detection, semantic segmentation, and pose estimation to super resolution and speech recognition.

AIMET Model Zoo provides 8-bit quantized models for a variety of categories.


This gives researchers and developers direct access to highly accurate quantized models, saving them time in achieving performance benefits like reduced energy consumption, latency, and memory requirements for on-target inference. For example, imagine you are a developer wanting to do semantic segmentation for image beautification or autonomous driving use cases using the DeepLabv3+ model. AIMET Model Zoo provides an optimized DeepLabv3+ model using the DFQ and Quantization Aware Training (QAT) features from AIMET. The corresponding AIMET Model Zoo recipe points to this optimized model and provides the proper calls to the AIMET library to run INT8 simulation and assess performance. In fact, the AIMET quantized version has a Mean Intersection over Union (mIoU) score of 72.08%, which is virtually equivalent to the 72.32% provided by the original FP32 model. The image below visually shows how the quantized model in AIMET Model Zoo results in accurate semantic segmentation.
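The mIoU metric used above averages the per-class Intersection over Union between the predicted and ground-truth segmentation masks. As a minimal illustration (the toy masks and class count below are hypothetical, not taken from the DeepLabv3+ evaluation):

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union across classes present in either mask."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 segmentation masks with two classes
target = np.array([[0, 0, 1], [1, 1, 0]])
pred   = np.array([[0, 1, 1], [1, 1, 0]])
miou = mean_iou(pred, target, num_classes=2)
# Class 0: IoU = 2/3; class 1: IoU = 3/4; mean = 17/24 ≈ 0.708
```

Scores like 72.08% vs. 72.32% are this metric computed over a full validation set, which is why a fraction of a percentage point indicates near-lossless quantization.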

Side-by-side comparison of FP32 model, 8-bit quantized AIMET model, and 8-bit quantized baseline model for DeepLabv3+ semantic segmentation. AIMET quantization results in accurate quantization, while the baseline quantization method is inaccurate.


This is one example. The AIMET Model Zoo has many INT8 quantized neural network models that provide accurate inference comparable to FP32 models. With this initial contribution of 14 INT8 models to AIMET Model Zoo, we are lowering the hurdles for the ecosystem to use quantized models in their AI workloads, marching toward making power-efficient fixed-point inference ubiquitous. You can get the best of both worlds: the high accuracy of a floating-point model and the efficiency of an 8-bit integer model.

Check out our AIMET Model Zoo and AIMET.


Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc. AIMET and AIMET Model Zoo are products of Qualcomm Innovation Center, Inc.


Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ('Qualcomm'). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Chirag Patel

Engineer, Principal/Mgr., Qualcomm Technologies

