SPACH

Overview

This repository contains PyTorch evaluation code, training code and pretrained models for the following projects:

- SPACH (A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP)
- sMLPNet (Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?)
- ShiftViT (the Shift-T / Shift-S models in the table below)

Main Results on ImageNet with Pretrained Models

| name | acc@1 | #params | FLOPs | url |
| --- | --- | --- | --- | --- |
| SPACH-Conv-MS-S | 81.6 | 44M | 7.2G | github |
| SPACH-Trans-MS-S | 82.9 | 40M | 7.6G | github |
| SPACH-MLP-MS-S | 82.1 | 46M | 8.2G | github |
| SPACH-Hybrid-MS-S | 83.7 | 63M | 11.2G | github |
| SPACH-Hybrid-MS-S+ | 83.9 | 63M | 12.3G | github |
| sMLPNet-T | 81.9 | 24M | 5.0G | |
| sMLPNet-S | 83.1 | 49M | 10.3G | github |
| sMLPNet-B | 83.4 | 66M | 14.0G | github |
| Shift-T / light | 79.4 | 20M | 3.0G | github |
| Shift-T | 81.7 | 29M | 4.5G | github |
| Shift-S / light | 81.6 | 34M | 5.7G | github |
| Shift-S | 82.8 | 50M | 8.8G | github |

Usage

Install

First, clone the repo and install requirements:

git clone https://github.com/microsoft/Spach
cd Spach
pip install -r requirements.txt

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure should follow the standard layout expected by torchvision's datasets.ImageFolder, with the training and validation data in the train/ and val/ folders respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
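
Once the data is laid out as above, it can be loaded with torchvision. The snippet below is a minimal sketch, not part of this repo; the transforms are the common ImageNet defaults rather than necessarily the ones used by main.py.

```python
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

# Standard ImageNet normalization and 224x224 center-crop evaluation transform.
normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
val_transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(), normalize])

val_dataset = ImageFolder("/path/to/imagenet/val", transform=val_transform)
print(len(val_dataset), "validation images across", len(val_dataset.classes), "classes")
```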

Evaluation

To evaluate a pre-trained model on the ImageNet val set with a single GPU, run:

python main.py --eval --resume <checkpoint> --model <model-name> --data-path <imagenet-path>

For example, to evaluate the SPACH-Hybrid-MS-S model, run

python main.py --eval --resume spach_ms_hybrid_s.pth --model spach_ms_s_patch4_224_hybrid --data-path <imagenet-path>

giving

* Acc@1 83.658 Acc@5 96.762 loss 0.688

You can find all supported models in models/registry.py.
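
If you need to load a checkpoint outside of main.py, the sketch below shows one possible way. Note that the timm registration and the checkpoint layout assumed here are not guaranteed by this README.

```python
# A rough sketch, not taken from this repository: it assumes the models are
# registered with timm (which models/registry.py suggests) and that the
# checkpoint stores its weights either directly or under a "model" key.
import torch
import timm

import models  # assumed import path: the package containing registry.py

# Model name taken from the evaluation example above; the checkpoint path is a placeholder.
model = timm.create_model("spach_ms_s_patch4_224_hybrid")
ckpt = torch.load("spach_ms_hybrid_s.pth", map_location="cpu")
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
model.load_state_dict(state_dict)
model.eval()
```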

Training

One can simply call the following script to run the training process. Distributed training is recommended even on a single-GPU node.

python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --use_env main.py \
--model <model-name> \
--data-path <imagenet-path> \
--output_dir <output-path> \
--dist-eval
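
For example, to train SPACH-Hybrid-MS-S on a node with 8 GPUs (the GPU count and paths below are placeholders, not values prescribed by this repo):

python -m torch.distributed.launch --nproc_per_node 8 --use_env main.py \
--model spach_ms_s_patch4_224_hybrid \
--data-path /path/to/imagenet \
--output_dir ./output \
--dist-eval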

Citation

@article{zhao2021battle,
  title={A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP},
  author={Zhao, Yucheng and Wang, Guangting and Tang, Chuanxin and Luo, Chong and Zeng, Wenjun and Zha, Zheng-Jun},
  journal={arXiv preprint arXiv:2108.13002},
  year={2021}
}

@article{tang2021sparse,
  title={Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?},
  author={Tang, Chuanxin and Zhao, Yucheng and Wang, Guangting and Luo, Chong and Xie, Wenxuan and Zeng, Wenjun},
  journal={arXiv preprint arXiv:2109.05422},
  year={2021}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Acknowledgement

Our code is built on top of DeiT. We measure throughput following Swin Transformer.

Comments
  • Shift features implementation

    Hi, very interesting research. I wonder why you implemented the shift_feature as a memory copy (https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/shiftvit.py#L107) instead of using the Tensor.roll operation. It would make your block much faster. Another benefit is that pixels from one side would leak to the other, letting the network pass information from one boundary to the other, which seems a better option than duplication of the last row during each shift.

    opened by bonlime 3
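
(For reference, a minimal sketch of a torch.roll-based shift, not code from this repository; the four-direction channel grouping and the group fraction are assumptions made for illustration.)

```python
import torch

def shift_feat_roll(x: torch.Tensor, gamma: float = 1.0 / 12) -> torch.Tensor:
    """Shift four groups of channels by one pixel in four directions using
    torch.roll (circular padding) instead of copying into a padded buffer."""
    _, c, _, _ = x.shape
    g = int(c * gamma)
    out = x.clone()
    out[:, 0 * g:1 * g] = torch.roll(x[:, 0 * g:1 * g], shifts=1, dims=3)   # shift right
    out[:, 1 * g:2 * g] = torch.roll(x[:, 1 * g:2 * g], shifts=-1, dims=3)  # shift left
    out[:, 2 * g:3 * g] = torch.roll(x[:, 2 * g:3 * g], shifts=1, dims=2)   # shift down
    out[:, 3 * g:4 * g] = torch.roll(x[:, 3 * g:4 * g], shifts=-1, dims=2)  # shift up
    return out
```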
  • Add: unofficial implementation

    Hey folks,

    It would be great if this repository could also hold links to other unofficial implementations. I am proposing a Keras tutorial on ShiftViT.

    opened by ariG23498 0
  • The configuration of the architecture variants is inconsistent with the papers and weights files.

    @tangchuanxin

    https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/registry.py#L224

    The code is inconsistent with the content of the paper:

    [image: table of architecture variant configurations from the paper]

    and the weight file. The content of this pth file is the same as the architecture variant -S in the figure above, i.e., depths=(6, 8, 18, 6).

    https://github.com/microsoft/SPACH/releases/download/v1.0/shiftvit_tiny_r2.pth

    opened by lartpang 1