Post-training Quantization for Neural Networks with Provable Guarantees

Overview

Post-training Quantization for Neural Networks with Provable Guarantees

Authors: Jinjie Zhang ([email protected]), Yixuan Zhou ([email protected]) and Rayan Saab ([email protected])

Overview

This directory contains code necessary to run a post-training neural-network quantization method GPFQ, that is based on a greedy path-following mechanism. One can also use it to reproduce the experiment results in our paper "Post-training Quantization for Neural Networks with Provable Guarantees". In this paper, we also prove theoretical guarantees for the proposed method, that is, for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights – i.e., level of over-parametrization.

If you make use of this code or our quantization method in your work, please cite the following paper:

 @article{zhang2022posttraining,
     author = {Zhang, Jinjie and Zhou, Yixuan and Saab, Rayan},
     title = {Post-training Quantization for Neural Networks with Provable Guarantees},
     booktitle = {arXiv preprint arXiv:2201.11113},
     year = {2022}
   }

Note: The code is designed to work primarily with the ImageNet dataset. Due to the size of this dataset, it is likely one may need heavier computational resources than a local machine. Nevertheless, the experiments can be run, for example, using a cloud computation center, e.g. AWS. When we run this experiment, we use the m5.8xlarge EC2 instance with a disk space of 300GB.

Installing Dependencies

We assume a python version that is greater than 3.8.0 is installed in the user's machine. In the root directory of this repo, we provide a requirements.txt file for installing the python libraries that will be used in our code.

To install the necessary dependency, one can first start a virtual environment by doing the following:

python3 -m venv .venv
source .venv/bin/activate

The code above should activate a new python virtual environments.

Then one can make use of the requirements.txt by

pip3 install -r requirement.txt

This should install all the required dependencies of this project.

Obtaining ImageNet Dataset

In this project, we make use of the Imagenet dataset, in particular, we use the ILSVRC-2012 version.

To obtain the Imagenet dataset, one can submit a request through this link.

Once the dataset is obtained, place the .tar files for training set and validation set both under the data/ILSVRC2012 directory of this repo.

Then use the following procedure to unzip Imagenet dataset:

tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
cd ..
# Extract the validation data and move images to subfolders:
tar -xvf ILSVRC2012_img_val.tar

Running Experiments

The implementation of the modified GPFQ in our paper is contained in quantization_scripts. Additionally, adhoc_quantization_scripts and retraining_scripts provide extra experiments and both of them are variants of the framework in quantization_scripts. adhoc_quantization_scripts contains heuristic modifications used to further improve the performance of GPFQ, such as bias correction, mixed precision, and unquantizing the last layer. retraining_scripts shows a quantization-aware training strategy that is designed to retrain the neural network after each layer is quantized.

In this section, we will give a guidance on running our code contained in quantization_scripts and the implementation of other two counterparts adhoc_quantization_scripts and retraining_scripts are very similar to quantization_scripts.

  1. Before getting started, run in the root directory of the repo and run mkdir modelsto create a directory in which we will store the quantized model.

  2. The entry point of the project starts with quantization_scripts/quantize.py. Once the file is opened, there is a section to set hyperparameters, for example, the model_name parameter, the number of bits/batch size used for quantization, the scalar of alphabets, the probability for subsampling in CNNs etc. Note that the model_name mentioned above should be the same as the model that you will quantize. After you selected a model_name and assuming you are still in the root directory of this repo, run mkdir models/{model_name}, where the {model_name} should be the python string that you provided for the model_name parameter in the quantize.py file. If the directory already exists, you can skip this step.

  3. Then navigate to the logs directory and run python3 init_logs.py. This will prepare a log file which is used to store the results of the experiment.

  4. Finally, open the quantization_scripts directory and run python3 quantize.py to start the experiment.

Owner
Yixuan Zhou
3rd Year UCSD CS double Math undergrad.
Yixuan Zhou
Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]

EagerMOT: 3D Multi-Object Tracking via Sensor Fusion Read our ICRA 2021 paper here. Check out the 3 minute video for the quick intro or the full prese

Aleksandr Kim 276 Dec 30, 2022
This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

212 Dec 25, 2022
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

English | 简体中文 Easy Parallel Library Overview Easy Parallel Library (EPL) is a general and efficient library for distributed model training. Usability

Alibaba 185 Dec 21, 2022
MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project.

page_type languages products description sample python azure azure-machine-learning-service azure-devops Code which demonstrates how to set up and ope

1 Nov 01, 2021
Revisting Open World Object Detection

Revisting Open World Object Detection Installation See INSTALL.md. Dataset Our n

58 Dec 23, 2022
Complete U-net Implementation with keras

U Net Lowered with Keras Complete U-net Implementation with keras Original Paper Link : https://arxiv.org/abs/1505.04597 Special Implementations : The

Sagnik Roy 14 Oct 10, 2022
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)

Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive

Stanford Future Data Systems 22 Dec 05, 2022
Visual Tracking by TridenAlign and Context Embedding

Visual Tracking by TridentAlign and Context Embedding (TACT) Test code for "Visual Tracking by TridentAlign and Context Embedding" Janghoon Choi, Juns

Janghoon Choi 32 Aug 25, 2021
DLL: Direct Lidar Localization

DLL: Direct Lidar Localization Summary This package presents DLL, a direct map-based localization technique using 3D LIDAR for its application to aeri

Service Robotics Lab 127 Dec 16, 2022
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

AST: Audio Spectrogram Transformer Introduction Citing Getting Started ESC-50 Recipe Speechcommands Recipe AudioSet Recipe Pretrained Models Contact I

Yuan Gong 603 Jan 07, 2023
A generalized framework for prototyping full-stack cooperative driving automation applications under CARLA+SUMO.

OpenCDA OpenCDA is a SIMULATION tool integrated with a prototype cooperative driving automation (CDA; see SAE J3216) pipeline as well as regular autom

UCLA Mobility Lab 726 Dec 29, 2022
New AidForBlind - Various Libraries used like OpenCV and other mentioned in Requirements.txt

AidForBlind Recommended PyCharm IDE Various Libraries used like OpenCV and other

Aalhad Chandewar 1 Jan 13, 2022
MMGeneration is a powerful toolkit for generative models, based on PyTorch and MMCV.

Documentation: https://mmgeneration.readthedocs.io/ Introduction English | 简体中文 MMGeneration is a powerful toolkit for generative models, especially f

OpenMMLab 1.3k Dec 29, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI 2022)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
LAMDA: Label Matching Deep Domain Adaptation

LAMDA: Label Matching Deep Domain Adaptation This is the implementation of the paper LAMDA: Label Matching Deep Domain Adaptation which has been accep

Tuan Nguyen 9 Sep 06, 2022
Random Forests for Regression with Missing Entries

Random Forests for Regression with Missing Entries These are specific codes used in the article: On the Consistency of a Random Forest Algorithm in th

Irving Gómez-Méndez 1 Nov 15, 2021
This is 2nd term discrete maths project done by UCU students that uses backtracking to solve various problems.

Backtracking Project Sponsors This is a project made by UCU students: Olha Liuba - crossword solver implementation Hanna Yershova - sudoku solver impl

Dasha 4 Oct 17, 2021
This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong Poisons

Adversarial poison generation and evaluation. This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong

31 Nov 01, 2022
Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

El Bruno 3 Mar 30, 2022
Official repository for the NeurIPS 2021 paper Get Fooled for the Right Reason: Improving Adversarial Robustness through a Teacher-guided curriculum Learning Approach

Get Fooled for the Right Reason Official repository for the NeurIPS 2021 paper Get Fooled for the Right Reason: Improving Adversarial Robustness throu

Sowrya Gali 1 Apr 25, 2022