Post-Training Quantization for Vision Transformers.

PTQ4ViT

PTQ4ViT is a post-training quantization framework for vision transformers. It uses twin uniform quantization to reduce the quantization error on the activation values after softmax and GELU, and a Hessian-guided metric to evaluate candidate scaling factors, which improves calibration accuracy at a small cost. The quantized vision transformers (ViT, DeiT, and Swin) achieve near-lossless prediction accuracy (less than 0.5% drop at 8-bit quantization) on the ImageNet classification task. See the paper (arXiv:2111.12293) for details.
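The twin uniform quantization idea can be illustrated with a minimal NumPy sketch for post-softmax values. It assumes that one bit selects between two uniform ranges and the remaining bits index a level: a fine range with step `delta_small` for the many near-zero values, and a coarse range covering [0, 1] for the few large ones. The function name, its parameters, and this exact range layout are assumptions made for illustration, not the repository's implementation; the paper gives the authoritative definition.

```python
import numpy as np


def twin_uniform_quantize_softmax(x, delta_small, n_bits=8):
    """Fake-quantize values in [0, 1] using two uniform ranges (illustrative).

    One bit is assumed to select the range; the other n_bits - 1 bits
    index a level inside that range.
    """
    levels = 2 ** (n_bits - 1)      # number of levels available per range
    delta_large = 1.0 / levels      # coarse step: the large range spans [0, 1]
    x = np.asarray(x, dtype=np.float64)

    # Fine range: small step for values close to zero.
    q_small = np.clip(np.round(x / delta_small), 0, levels - 1) * delta_small
    # Coarse range: large step for the remaining values.
    q_large = np.clip(np.round(x / delta_large), 0, levels - 1) * delta_large

    # Values below the upper end of the fine range use the fine step.
    in_small_range = x < levels * delta_small
    return np.where(in_small_range, q_small, q_large)


if __name__ == "__main__":
    probs = np.array([0.001, 0.004, 0.9])
    print(twin_uniform_quantize_softmax(probs, delta_small=2.0 ** -12))
```

With a single 8-bit uniform range over [0, 1], a value such as 0.001 would round to zero; the fine range keeps it distinguishable while the coarse range still represents large probabilities.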

Install

Requirements

  • python>=3.5
  • pytorch>=1.5
  • matplotlib
  • pandas
  • timm

Datasets

To run the example tests, place your ImageNet2012 dataset at /datasets/imagenet.

We use ViTImageNetLoaderGenerator in utils/datasets.py to initialize our DataLoader. If your ImageNet dataset is stored elsewhere, you'll need to pass its root manually as an argument when instantiating a ViTImageNetLoaderGenerator.
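One way to keep the dataset location configurable is to resolve the root once and fail early if it is missing. The sketch below is a hypothetical helper, not part of the repository: the name `resolve_imagenet_root` and the `IMAGENET_ROOT` environment variable are assumptions. The returned path is what you would then pass as the root when instantiating ViTImageNetLoaderGenerator; check utils/datasets.py for its actual signature.

```python
import os
from pathlib import Path


def resolve_imagenet_root(root=None, default="/datasets/imagenet"):
    """Return an existing ImageNet root directory as a string.

    Priority: explicit argument, then the (hypothetical) IMAGENET_ROOT
    environment variable, then the default path named in this README.
    """
    candidate = Path(root or os.environ.get("IMAGENET_ROOT", default))
    if not candidate.is_dir():
        raise FileNotFoundError(f"ImageNet root not found: {candidate}")
    return str(candidate)
```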

Usage

1. Run example quantization

To test on all models with BasePTQ/PTQ4ViT, run

python example/test_all.py

To run ablation testing, run

python example/test_ablation.py

You can run the testing scripts with multiple GPUs. For example, calling

python example/test_all.py --multigpu --n_gpu 6

will use 6 GPUs to run the test.
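How the scripts distribute work across GPUs is not described here. As a rough illustration only, the sketch below parses the same two flags with argparse and assigns test jobs to GPUs round-robin. The function names and the round-robin policy are assumptions, not the repository's scheduling logic.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the command above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--multigpu", action="store_true")
    parser.add_argument("--n_gpu", type=int, default=1)
    return parser.parse_args(argv)


def assign_jobs(jobs, n_gpu):
    """Round-robin assignment: job i goes to GPU index i % n_gpu."""
    buckets = {gpu: [] for gpu in range(n_gpu)}
    for i, job in enumerate(jobs):
        buckets[i % n_gpu].append(job)
    return buckets


if __name__ == "__main__":
    args = parse_args()
    n_gpu = args.n_gpu if args.multigpu else 1
    # Placeholder job names; a real script would list model configurations.
    print(assign_jobs([f"job{i}" for i in range(8)], n_gpu))
```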

2. Download quantized model checkpoints

(Coming soon)

Results

Results of BasePTQ

Model         Original  W8A8    W6A6
ViT-S/224/32  75.99     73.61   60.144
ViT-S/224     81.39     80.468  70.244
ViT-B/224     84.54     83.896  75.668
ViT-B/384     86.00     85.352  46.886
DeiT-S/224    79.80     77.654  72.268
DeiT-B/224    81.80     80.946  78.786
DeiT-B/384    83.11     82.33   68.442
Swin-T/224    81.39     80.962  78.456
Swin-S/224    83.23     82.758  81.742
Swin-B/224    85.27     84.792  83.354
Swin-B/384    86.44     86.168  85.226

Results of PTQ4ViT

Model         Original  W8A8    W6A6
ViT-S/224/32  75.99     75.582  71.908
ViT-S/224     81.39     81.002  78.63
ViT-B/224     84.54     84.25   81.65
ViT-B/384     86.00     85.828  83.348
DeiT-S/224    79.80     79.474  76.282
DeiT-B/224    81.80     81.482  80.25
DeiT-B/384    83.11     82.974  81.55
Swin-T/224    81.39     81.246  80.47
Swin-S/224    83.23     83.106  82.38
Swin-B/224    85.27     85.146  84.012
Swin-B/384    86.44     86.394  85.388
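The near-lossless claim for 8-bit quantization can be checked directly against the PTQ4ViT table. The short script below copies the table values and computes the accuracy drop per model; the variable and function names are illustrative.

```python
# Top-1 accuracy (%) copied from the PTQ4ViT table: (original, w8a8, w6a6).
PTQ4VIT_RESULTS = {
    "ViT-S/224/32": (75.99, 75.582, 71.908),
    "ViT-S/224": (81.39, 81.002, 78.63),
    "ViT-B/224": (84.54, 84.25, 81.65),
    "ViT-B/384": (86.00, 85.828, 83.348),
    "DeiT-S/224": (79.80, 79.474, 76.282),
    "DeiT-B/224": (81.80, 81.482, 80.25),
    "DeiT-B/384": (83.11, 82.974, 81.55),
    "Swin-T/224": (81.39, 81.246, 80.47),
    "Swin-S/224": (83.23, 83.106, 82.38),
    "Swin-B/224": (85.27, 85.146, 84.012),
    "Swin-B/384": (86.44, 86.394, 85.388),
}


def accuracy_drops(results, column):
    """Return {model: original - quantized}; column 1 is w8a8, 2 is w6a6."""
    return {name: row[0] - row[column] for name, row in results.items()}


w8a8_drops = accuracy_drops(PTQ4VIT_RESULTS, 1)
worst_model = max(w8a8_drops, key=w8a8_drops.get)
print(f"largest W8A8 drop: {worst_model} ({w8a8_drops[worst_model]:.3f} points)")
```

The largest W8A8 drop in the table is 0.408 points (ViT-S/224/32), which is consistent with the stated bound of less than 0.5%.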

Results of Ablation

  • ViT-S/224 (original top-1 accuracy 81.39%)
W8A8 W6A6 (one row per combination of Hessian Guided, Softmax Twin, and GELU Twin)
80.47 70.24
80.93 77.20
81.11 78.57
80.84 76.93
79.25 74.07
81.00 78.63
  • ViT-B/224 (original top-1 accuracy 84.54%)
W8A8 W6A6 (one row per combination of Hessian Guided, Softmax Twin, and GELU Twin)
83.90 75.67
83.97 79.90
84.07 80.76
84.10 80.82
83.40 78.86
84.25 81.65
  • ViT-B/384 (original top-1 accuracy 86.00%)
W8A8 W6A6 (one row per combination of Hessian Guided, Softmax Twin, and GELU Twin)
85.35 46.89
85.42 79.99
85.67 82.01
85.60 82.21
84.35 80.86
85.89 83.19

Citation

@article{PTQ4ViT_cvpr2022,
    title={PTQ4ViT: Post-Training Quantization Framework for Vision Transformers},
    author={Zhihang Yuan and Chenhao Xue and Yiqi Chen and Qiang Wu and Guangyu Sun},
    journal={arXiv preprint arXiv:2111.12293},
    year={2022},
}
Owner
Zhihang Yuan