Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.

Last update: Dec 30, 2022

Overview

HAWQ: Hessian AWare Quantization

HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform quantization, with direct hardware implementation through TVM.

For more details please see:

Installation

PyTorch version >= 1.4.0
Python version >= 3.6
For training new models, you'll also need NVIDIA GPUs and NCCL
To install HAWQ and develop locally:

git clone https://github.com/Zhen-Dong/HAWQ.git
cd HAWQ
pip install -r requirements.txt
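
Before installing, it can help to confirm that the environment meets the version requirements listed above. The following is a minimal sketch, not part of HAWQ: the helper names `parse_version` and `meets_minimum` are hypothetical, and the script only compares version numbers (it does not check for GPUs or NCCL).

```python
import sys

MIN_PYTHON = (3, 6)     # from the requirements listed above
MIN_TORCH = (1, 4, 0)   # from the requirements listed above


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare a version string against a minimum tuple, padding with zeros."""
    version = parse_version(found)
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


if __name__ == "__main__":
    print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)
    try:
        import torch
        print("PyTorch OK:", meets_minimum(torch.__version__, MIN_TORCH))
    except ImportError:
        print("PyTorch is not installed")
```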

Getting Started

Quantization-Aware Training

The following example runs uniform 8-bit quantization for ResNet50 on ImageNet:

export CUDA_VISIBLE_DEVICES=0
python quant_train.py -a resnet50 --epochs 1 --lr 0.0001 --batch-size 128 --data /path/to/imagenet/ --pretrained --save-path /path/to/checkpoints/ --act-range-momentum=0.99 --wd 1e-4 --data-percentage 0.0001 --fix-BN --checkpoint-iter -1 --quant-scheme uniform8
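
To make "uniform 8-bit quantization" concrete, the sketch below shows a generic symmetric uniform quantizer in plain Python. It is an illustration of the general technique under stated assumptions (a single scale per tensor, signed integers, a symmetric range), not HAWQ's implementation; the function names `quantize_symmetric` and `dequantize` are hypothetical.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers on a uniform grid; return (ints, scale)."""
    q_max = 2 ** (num_bits - 1) - 1            # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    # Round to the nearest grid point, then clamp to the representable range.
    ints = [max(-q_max, min(q_max, round(v / scale))) for v in values]
    return ints, scale


def dequantize(ints, scale):
    """Recover approximate float values from the integer grid."""
    return [q * scale for q in ints]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.25, 1.0]
    ints, scale = quantize_symmetric(weights, num_bits=8)
    print("integers:", ints)
    print("reconstructed:", dequantize(ints, scale))
```

With rounding to the nearest grid point, the reconstruction error of any value inside the range is at most half a quantization step (`scale / 2`), which is why fewer bits (a coarser grid) cost accuracy.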

The commands for other quantization schemes and for other networks are shown in the model zoo.
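
The training command above passes `--act-range-momentum=0.99`, which suggests that activation ranges are tracked as a running average across batches. The sketch below shows one common way to do this, assuming an exponential moving average of the per-batch minimum and maximum. The function name `update_activation_range` is hypothetical and the exact update rule used by HAWQ may differ.

```python
def update_activation_range(running, batch_values, momentum=0.99):
    """Blend the running (min, max) with the current batch's (min, max)."""
    batch_min, batch_max = min(batch_values), max(batch_values)
    if running is None:
        # First batch: adopt its range directly.
        return (batch_min, batch_max)
    run_min, run_max = running
    return (
        momentum * run_min + (1.0 - momentum) * batch_min,
        momentum * run_max + (1.0 - momentum) * batch_max,
    )


if __name__ == "__main__":
    activation_range = None
    for batch in ([-1.0, 2.0], [-3.0, 4.0], [-0.5, 1.5]):
        activation_range = update_activation_range(activation_range, batch)
        print(activation_range)
```

A momentum close to 1 makes the tracked range change slowly, so a single batch with outlier activations has little effect on the quantization scale.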

Inference Acceleration

Instructions on Hardware Implementation through TVM

Experimental Results

The results below correspond to Table I and Table II in HAWQ-V3: Dyadic Neural Network Quantization.

ResNet18 on ImageNet

Model	Quantization	Model Size (MB)	BOPS (G)	Accuracy (%)	Inference Speed (batch=8, ms)	Download
`ResNet18`	Floating Point	44.6	1858	71.47	9.7 (1.0x)	resnet18_baseline
`ResNet18`	W8A8	11.1	116	71.56	3.3 (3.0x)	resnet18_uniform8
`ResNet18`	Mixed Precision	6.7	72	70.22	2.7 (3.6x)	resnet18_bops0.5
`ResNet18`	W4A4	5.8	34	68.45	2.2 (4.4x)	resnet18_uniform4

ResNet50 on ImageNet

Model	Quantization	Model Size (MB)	BOPS (G)	Accuracy (%)	Inference Speed (batch=8, ms)	Download
`ResNet50`	Floating Point	97.8	3951	77.72	26.2 (1.0x)	resnet50_baseline
`ResNet50`	W8A8	24.5	247	77.58	8.5 (3.1x)	resnet50_uniform8
`ResNet50`	Mixed Precision	18.7	154	75.39	6.9 (3.8x)	resnet50_bops0.5
`ResNet50`	W4A4	13.1	67	74.24	5.8 (4.5x)	resnet50_uniform4
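
The BOPS column can be sanity-checked with simple arithmetic. Under the commonly used definition, each multiply-accumulate costs (weight bits × activation bits) bit operations, so moving from a 32-bit baseline to W8A8 should cut BOPS by (32 × 32) / (8 × 8) = 16. The sketch below compares that ideal factor with the ratios in the two tables above. The helper name `ideal_bops_reduction` is hypothetical, and treating the floating-point baseline as 32 × 32 bits is an assumption.

```python
def ideal_bops_reduction(weight_bits, act_bits, ref_bits=32):
    """Ideal BOPS reduction relative to a ref_bits x ref_bits baseline."""
    return (ref_bits * ref_bits) / (weight_bits * act_bits)


# BOPS (G) values copied from the tables above.
REPORTED_BOPS = {
    "ResNet18": {"fp": 1858, "w8a8": 116, "w4a4": 34},
    "ResNet50": {"fp": 3951, "w8a8": 247, "w4a4": 67},
}

if __name__ == "__main__":
    for model, bops in REPORTED_BOPS.items():
        for scheme, bits in (("w8a8", 8), ("w4a4", 4)):
            measured = bops["fp"] / bops[scheme]
            ideal = ideal_bops_reduction(bits, bits)
            print(f"{model} {scheme}: reported {measured:.1f}x, ideal {ideal:.0f}x")
```

The W8A8 rows match the ideal factor of 16 closely. The W4A4 rows fall short of the ideal factor of 64, which would be consistent with some layers being kept at a higher precision, although the tables themselves do not state this.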

More results for different quantization schemes and models, along with the corresponding commands and important notes, are available in the model zoo.
To download the quantized models with wget, see the example command in the model zoo.
Checkpoints in the model zoo are saved in floating-point precision. To reduce their size, BitPack can be applied to the weight_integer tensors or directly to the quantized_checkpoint.pth.tar file.
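
To illustrate why packing helps, the sketch below stores two signed 4-bit integers per byte, whereas a 32-bit float checkpoint spends four bytes on each value. This is a generic illustration of bit packing and does not use or reproduce BitPack's API; the function names `pack_int4` and `unpack_int4` are hypothetical.

```python
def pack_int4(values):
    """Pack signed 4-bit integers (-8..7) two per byte, low nibble first."""
    if any(v < -8 or v > 7 for v in values):
        raise ValueError("values must fit in signed 4 bits")
    nibbles = [v & 0xF for v in values]        # two's complement in 4 bits
    if len(nibbles) % 2:
        nibbles.append(0)                      # pad odd-length input
    return bytes(lo | (hi << 4) for lo, hi in zip(nibbles[0::2], nibbles[1::2]))


def unpack_int4(packed, count):
    """Inverse of pack_int4; count is the number of original values."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]


if __name__ == "__main__":
    weights = [-8, 7, 0, -1, 3]
    packed = pack_int4(weights)
    print(len(weights) * 4, "bytes as float32 vs", len(packed), "bytes packed")
    assert unpack_int4(packed, len(weights)) == weights
```

The original value count has to be stored alongside the packed bytes, because an odd-length input is padded with one extra nibble.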

Related Works

License

HAWQ is released under the MIT license.

Owner

Zhen Dong
