Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Last update: Jan 01, 2023

Vision Transformer with Progressive Sampling

Installation Instructions

Clone this repo:

git clone git@github.com:yuexy/PS-ViT.git
cd PS-ViT

Create a conda virtual environment and activate it:

conda create -n ps_vit python=3.7 -y
conda activate ps_vit

Install CUDA==10.1 with cuDNN 7, following the official installation instructions.

Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.4, einops, pyyaml:

pip3 install timm==0.3.4 einops pyyaml

Install Apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Install PS-ViT (run this from the PS-ViT repository root, not from the apex directory):

python setup.py build_ext --inplace
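
After installation it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of the repository: the helper names (`base_version`, `find_mismatches`, `installed_versions`) are hypothetical, and it assumes each package exposes a `__version__` attribute. The pins themselves are taken from the steps above.

```python
import importlib

# Pins taken from the installation steps above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.4"}


def base_version(version):
    """Drop a local build suffix such as '+cu101' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (expected, found)} for every pin that is not met.

    `installed` maps a package name to its version string, or to None
    when the package is missing.
    """
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Import each package and read its __version__ (None if unavailable)."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # missing package or a broken CUDA/driver setup
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", None)
    return versions


if __name__ == "__main__":
    problems = find_mismatches(installed_versions(EXPECTED))
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
    if not problems:
        print("All pinned packages match.")
```

The comparison ignores local build suffixes, so a CUDA build reported as `1.7.1+cu101` still counts as matching the `1.7.1` pin.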

Results and Models

All models listed below are evaluated at an input size of 224x224.

Model	Top-1 Acc (%)	#Params	FLOPs	Download
PS-ViT-Ti/14	75.6	4.8M	1.6G	Coming Soon
PS-ViT-B/10	80.6	21.3M	3.1G	Coming Soon
PS-ViT-B/14	81.7	21.3M	5.4G	Google Drive
PS-ViT-B/18	82.3	21.3M	8.8G	Google Drive
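
For readers unfamiliar with the metric, Top-1 accuracy is the percentage of validation images whose highest-scoring class equals the ground-truth label. The snippet below is a minimal pure-Python sketch of that definition; it is not the repository's evaluation code, and the function name and toy scores are made up for illustration.

```python
def top1_accuracy(scores, labels):
    """Percentage of samples whose highest-scoring class equals the label.

    `scores` is a list of per-class score lists, one per sample;
    `labels` is a list of integer class indices of the same length.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 3 classes; the last sample is misclassified.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 0, 2, 1]
print(top1_accuracy(toy_scores, toy_labels))  # 75.0
```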

Evaluation

To evaluate a pre-trained PS-ViT on the ImageNet validation set, run:

python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path-to-checkpoint>
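
When evaluating several checkpoints, the command above can be assembled programmatically. This is a minimal sketch under assumptions: `build_eval_command` is a hypothetical helper, and the data root, model name, batch size, and checkpoint path in the example are placeholders rather than values from the repository. Only the flags shown in the command above are used.

```python
import shlex


def build_eval_command(data_root, model, batch_size, checkpoint):
    """Assemble the evaluation command as an argument list.

    The flags mirror the command shown above; an argument list can be
    passed to subprocess.run without shell quoting issues.
    """
    return [
        "python3", "main.py", data_root,
        "--model", model,
        "-b", str(batch_size),
        "--eval_checkpoint", checkpoint,
    ]


# Placeholder values; substitute your own paths and model name.
command = build_eval_command(
    data_root="/path/to/imagenet",
    model="<model-name>",
    batch_size=128,
    checkpoint="/path/to/checkpoint.pth",
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

To actually launch the evaluation, the list could be handed to `subprocess.run(command, check=True)` from the repository root.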

Training from scratch

To train a PS-ViT on ImageNet from scratch, run:

bash ./scripts/train_distributed.sh <job-name> <config-path> <num-gpus>

Citing PS-ViT

@article{psvit,
  title={Vision Transformer with Progressive Sampling},
  author={Yue, Xiaoyu and Sun, Shuyang and Kuang, Zhanghui and Wei, Meng and Torr, Philip and Zhang, Wayne and Lin, Dahua},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Contact

If you have any questions, don't hesitate to contact Xiaoyu Yue. You can easily reach him by sending an email to [email protected].
