Official PyTorch Implementation of: "ImageNet-21K Pretraining for the Masses" (2021) paper

Last update: Jan 02, 2023

Overview

ImageNet-21K Pretraining for the Masses

Official PyTorch Implementation

Tal Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelnik-Manor
DAMO Academy, Alibaba Group

Abstract

ImageNet-1K serves as the primary dataset for pretraining deep learning models for computer vision tasks. ImageNet-21K dataset, which contains more pictures and classes, is used less frequently for pretraining, mainly due to its complexity, and underestimation of its added value compared to standard ImageNet-1K pretraining. This paper aims to close this gap, and make high-quality efficient pretraining on ImageNet-21K available for everyone. Via a dedicated preprocessing stage, utilizing WordNet hierarchies, and a novel training scheme called semantic softmax, we show that different models, including small mobile-oriented models, significantly benefit from ImageNet-21K pretraining on numerous datasets and tasks. We also show that we outperform previous ImageNet-21K pretraining schemes for prominent new models like ViT. Our proposed pretraining pipeline is efficient, accessible, and leads to SoTA reproducible results, from a publicly available dataset.

Getting Started

Note - repo under construction, more content will be added.

(1) Pretrained Models on ImageNet-21K-P Dataset

Backbone	ImageNet-21K-P semantic top-1 Accuracy [%]	ImageNet-1K top-1 Accuracy [%]	Maximal batch size	Maximal training speed (img/sec)	Maximal inference speed (img/sec)
MobilenetV3_large_100	73.1	78.0	488	1210	5980
Ofa_flops_595m_s	75.0	81.0	288	500	3240
ResNet50	75.6	82.0	320	720	2760
TResNet-M	76.4	83.1	520	670	2970
TResNet-L (V2)	76.7	83.9	240	300	1460
ViT_base_patch16_224	77.6	84.4	160	340	1140

See this link for more details.
We highly recommend starting with ImageNet-21K by testing these weights against standard ImageNet-1K pretraining and comparing results on your relevant downstream tasks. Once you see a significant improvement (you will), proceed to pretraining new models.

(2) Obtaining and Processing the Dataset

See instructions for obtaining and processing the dataset here.

(3) Training Code

To use the training code, first download the ImageNet-21K-P semantic tree to your local ./resources/ folder. Example of semantic softmax training:

python train_semantic_softmax.py \
--batch_size=4 \
--data_path=/mnt/datasets/21k \
--model_name=mobilenetv3_large_100 \
--model_path=/mnt/models/mobilenetv3_large_100.pth \
--epochs=80

To shorten training, we initialize the weights from standard ImageNet-1K pretraining. We recommend using the ImageNet-1K weights from this excellent repo.
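For intuition about what semantic softmax training computes, here is a minimal pure-Python sketch. It is not the code in train_semantic_softmax.py. It assumes the classes are grouped into consecutive blocks by WordNet hierarchy level, that a softmax cross-entropy is computed separately inside each block, and that levels where a sample has no label are skipped. The function names, the `level_slices` layout, and the equal weighting of levels are all illustrative assumptions; the actual implementation may normalize or weight the levels differently.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def semantic_softmax_loss(logits, level_slices, targets_per_level):
    """Average per-level cross-entropy (hypothetical helper, not the repo API).

    logits:            flat list of class scores, ordered by hierarchy level
    level_slices:      list of (start, end) index pairs, one per level
    targets_per_level: ground-truth index inside each level, or None when
                       the sample has no label at that level
    """
    total, used = 0.0, 0
    for (start, end), target in zip(level_slices, targets_per_level):
        if target is None:
            continue  # assumption: levels without a label do not contribute
        total += -log_softmax(logits[start:end])[target]
        used += 1
    return total / used if used else 0.0


# Toy example: 5 classes split into two levels of sizes 2 and 3.
toy_logits = [0.0, 0.0, 0.0, 0.0, 0.0]
toy_levels = [(0, 2), (2, 5)]
toy_loss = semantic_softmax_loss(toy_logits, toy_levels, [1, 0])
print(toy_loss)
```

Applying the softmax per level means a prediction only competes with classes at the same depth of the hierarchy, so a sample labeled with a coarse class is not penalized for having no fine-grained label.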

To be added soon

KD training code
Inference code
Model weights after transferred to ImageNet-1K
More...

Citation

@misc{ridnik2021imagenet21k,
      title={ImageNet-21K Pretraining for the Masses}, 
      author={Tal Ridnik and Emanuel Ben-Baruch and Asaf Noy and Lihi Zelnik-Manor},
      year={2021},
      eprint={2104.10972},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
