Implementation of Convolutional enhanced image Transformer

Last update: Dec 13, 2022

Overview

CeiT : Convolutional enhanced image Transformer

This is an unofficial PyTorch implementation of the paper "Incorporating Convolution Designs into Visual Transformers" (arXiv:2103.11816).

Training :

python train.py -c configs/default.yaml --name "name_of_exp"

Usage :

import torch
from ceit import CeiT

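# Dummy batch: one 3-channel 224x224 image, shape [B, C, H, W]
img = torch.ones([1, 3, 224, 224])

# Model built without the with_lca flag
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

# Same configuration with layer-wise class-token attention (LCA) enabled
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100, with_lca = True)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]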
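
A patch_size of 4 on a 224 x 224 input can look unusually small. In the paper, the Image-to-Tokens (I2T) stem downsamples the image by a total stride of 4 before patches are cut from the feature map, so the token count matches a plain ViT with 16 x 16 patches. The sketch below works through that arithmetic. It assumes this repo follows the paper's stem; num_patch_tokens and i2t_stride are hypothetical names used only for illustration, not part of the ceit module.

```python
def num_patch_tokens(image_size: int, patch_size: int, i2t_stride: int = 4) -> int:
    """Count patch tokens, assuming the I2T stem downsamples by i2t_stride.

    Hypothetical helper for illustration; not part of the ceit module.
    """
    # Spatial size of the feature map produced by the convolutional stem.
    feature_size = image_size // i2t_stride
    if feature_size % patch_size != 0:
        raise ValueError("feature map size must be divisible by patch_size")
    # Patches are cut from the feature map, not from the raw image.
    patches_per_side = feature_size // patch_size
    return patches_per_side ** 2


print(num_patch_tokens(224, 4))  # 196, i.e. a 14 x 14 grid
```

Under the same assumption, the class token is extra, so the encoder would see 197 tokens for the call above.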

Note :

The layer-wise class-token attention (LCA) module, enabled with with_lca = True, might not be implemented correctly.
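
For reference when checking the module, here is a minimal sketch of layer-wise class-token attention (LCA) as the paper describes it: the class token from every encoder layer is collected, and only the last layer's token acts as the query over all of them, followed by a feed-forward network. This is not the code in this repo. The class name, the pre-norm residual layout and the default sizes are assumptions made for illustration.

```python
import torch
from torch import nn


class LayerwiseClassTokenAttention(nn.Module):
    """Hypothetical LCA sketch; layout details are assumptions, not this repo's code."""

    def __init__(self, dim: int, heads: int = 3, mlp_ratio: int = 4):
        super().__init__()
        self.norm_attn = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_ffn = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, cls_tokens: torch.Tensor) -> torch.Tensor:
        # cls_tokens: [B, L, dim], one class token per encoder layer.
        tokens = self.norm_attn(cls_tokens)
        # Only the last layer's class token is used as the query.
        query = tokens[:, -1:, :]
        attended, _ = self.attn(query, tokens, tokens)
        # Residual connections around attention and the feed-forward network.
        x = cls_tokens[:, -1:, :] + attended
        x = x + self.ffn(self.norm_ffn(x))
        return x.squeeze(1)  # [B, dim]


lca = LayerwiseClassTokenAttention(dim=192, heads=3)
cls_per_layer = torch.randn(2, 12, 192)  # batch of 2, 12 layers, width 192
fused = lca(cls_per_layer)
print(fused.shape)
```

Because a single query attends over L class tokens, the attention cost grows linearly with the number of layers rather than quadratically.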

Citation :

@misc{yuan2021incorporating,
      title={Incorporating Convolution Designs into Visual Transformers}, 
      author={Kun Yuan and Shaopeng Guo and Ziwei Liu and Aojun Zhou and Fengwei Yu and Wei Wu},
      year={2021},
      eprint={2103.11816},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

The base ViT code is borrowed from @lucidrains' repo : https://github.com/lucidrains/vit-pytorch
The training and dataloader code is borrowed from @jeonsworld's repo : https://github.com/jeonsworld/ViT-pytorch

Owner

Rishikesh (ऋषिकेश)
