Dynamic Token Normalization Improves Vision Transformers

Last update: Oct 09, 2022

This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transformers. Code and models will be available soon.

Dynamic Token Normalization

We design a novel normalization method, termed Dynamic Token Normalization (DTN), which inherits the advantages of LayerNorm and InstanceNorm. DTN can be seamlessly plugged into various transformer models, consistently improving performance.
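To make the two baselines concrete, here is a minimal framework-free sketch of the statistics each one uses on a sequence of tokens. It is not the DTN implementation: this README does not describe how DTN combines the two, so the sketch only contrasts LayerNorm (each token normalized over its own channels) with InstanceNorm (each channel normalized across all tokens). The function names, the toy input, and the EPS value are assumptions made for illustration, and learnable scale and shift parameters are omitted.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _normalize(values):
    """Shift values to zero mean and scale them to (roughly) unit variance."""
    mean = fmean(values)
    std = (pvariance(values) + EPS) ** 0.5
    return [(v - mean) / std for v in values]


def layer_norm(tokens):
    """Normalize each token (row) across its channels: intra-token statistics."""
    return [_normalize(token) for token in tokens]


def instance_norm(tokens):
    """Normalize each channel (column) across all tokens: inter-token statistics."""
    columns = [_normalize(list(column)) for column in zip(*tokens)]
    return [list(row) for row in zip(*columns)]


# Toy input: 3 tokens with 2 channels each.
tokens = [[1.0, 3.0], [2.0, 6.0], [3.0, 9.0]]
ln_out = layer_norm(tokens)
in_out = instance_norm(tokens)
```

After LayerNorm every row of ln_out has zero mean, so differences in magnitude between tokens are removed. After InstanceNorm every column of in_out has zero mean, so those differences between tokens are kept within each channel.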

Top-1 and top-5 accuracies (%) on the ImageNet validation set for ViT trained with LN and DTN.

Model	Top-1	Top-5
ViT-T*-LN	72.3	91.4
ViT-T*-DTN	73.2	91.7
ViT-S*-LN	80.6	95.2
ViT-S*-DTN	81.7	95.8
ViT-B*-LN	81.7	95.8
ViT-B*-DTN	82.5	96.1
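
As a quick check on the size of the improvement, the snippet below computes the top-1 gain of DTN over LN for each model size. The numbers are copied from the table above; the dictionary layout is only an illustration.

```python
# Top-1 accuracies (%) copied from the table above.
top1 = {
    "ViT-T*": {"LN": 72.3, "DTN": 73.2},
    "ViT-S*": {"LN": 80.6, "DTN": 81.7},
    "ViT-B*": {"LN": 81.7, "DTN": 82.5},
}

# Gain of DTN over LN in percentage points, rounded to one decimal place.
gains = {model: round(acc["DTN"] - acc["LN"], 1) for model, acc in top1.items()}

for model, gain in gains.items():
    print(f"{model}: +{gain} top-1")
```

The gains are 0.9, 1.1 and 0.8 percentage points for ViT-T*, ViT-S* and ViT-B* respectively.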

Getting Started

Install PyTorch

Clone the repo:

git clone https://github.com/dtn-anonymous/DTN.git

Requirements

Install CUDA==10.1 with cudnn7 following the official installation instructions
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.2:

pip install timm==0.3.2
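
Once the packages are installed, a short script like the following can confirm that the environment matches the pinned versions. This is a sketch using the standard-library importlib.metadata module (Python 3.8+); the helper name check_versions is hypothetical, and it assumes the conda pytorch package registers itself under the distribution name "torch".

```python
from importlib import metadata

# Pins taken from the install commands above.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.2"}


def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for every mismatched or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        # Some builds carry a local suffix such as "+cu101"; compare the public part only.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means all three packages match the versions listed in this section.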

Data Preparation

Download the ImageNet dataset, which should contain the train and val directories and the txt file that maps images to labels.
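
The exact layout of the label file is not specified here. As an illustration only, the sketch below assumes each line has the form "<relative image path> <integer label>", a common convention for ImageNet list files; the function name and the example paths are hypothetical and should be adapted to the actual file.

```python
def parse_label_file(lines):
    """Parse lines of the ASSUMED form '<relative image path> <integer label>'."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so paths containing spaces still work.
        path, label = line.rsplit(maxsplit=1)
        samples.append((path, int(label)))
    return samples


# Hypothetical file contents, for illustration only.
example = [
    "train/n01440764/img_0001.JPEG 0",
    "train/n01443537/img_0002.JPEG 1",
]
samples = parse_label_file(example)
```

With a real file, the same function can be called on an open file handle, for example parse_label_file(open(path_to_txt)), where path_to_txt is wherever the label file was saved.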

Training a model from scratch

An example of training with DTN is given in DTN/scripts/train.sh. To train ViT-S* with DTN, run:

cd DTN/scripts
sh train.sh layer vit_norm_s_star configs/ViT/vit.yaml

The number of GPUs and the configuration file to use can be modified in train.sh.

Owner

Wenqi Shao
