This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Last update: Jan 03, 2023

Related tags

Deep Learning fine-grained-recognition

Overview

TransFG: A Transformer Architecture for Fine-grained Recognition

Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-grained Recognition

Implementation based on DeiT pretrained on ImageNet-1K with distillation fine-tuning will be released soon.

Framework

Dependencies:

Python 3.7.3
PyTorch 1.5.1
torchvision 0.6.1
ml_collections

Usage

1. Download Google pre-trained ViT models

Get models in this link: ViT-B_16, ViT-B_32...

wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz

2. Prepare data

In the paper, we use data from 5 publicly available datasets:

Please download them from the official websites and put them in the corresponding folders.

3. Install required packages

Install dependencies with the following command:

pip3 install -r requirements.txt

4. Train

To train TransFG on CUB-200-2011 dataset with 4 gpus in FP-16 mode for 10000 steps run:

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run

Citation

If you find our work helpful in your research, please cite it as:

@article{he2021transfg,
  title={TransFG: A Transformer Architecture for Fine-grained Recognition},
  author={He, Ju and Chen, Jieneng and Liu, Shuai and Kortylewski, Adam and Yang, Cheng and Bai, Yutong and Wang, Changhu and Yuille, Alan},
  journal={arXiv preprint arXiv:2103.07976},
  year={2021}
}

Acknowledgement

Many thanks to ViT-pytorch for the PyTorch reimplementation of An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Related tags

Overview

TransFG: A Transformer Architecture for Fine-grained Recognition

Framework

Dependencies:

Usage

1. Download Google pre-trained ViT models

2. Prepare data

3. Install required packages

4. Train

Citation

Acknowledgement

Owner

Ju He

This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression problems

Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Mosaic of Object-centric Images as Scene-centric Images (MosaicOS) for long-tailed object detection and instance segmentation.

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

TensorFlow (Python) implementation of DeepTCN model for multivariate time series forecasting.

Modelisation on galaxy evolution using PEGASE-HR

Simple Linear 2nd ODE Solver GUI - A 2nd constant coefficient linear ODE solver with simple GUI using euler's method

PyTorch implementation of 'Gen-LaneNet: a generalized and scalable approach for 3D lane detection'

Referring Video Object Segmentation

Rendering Point Clouds with Compute Shaders

Learning embeddings for classification, retrieval and ranking.

GAT - Graph Attention Network (PyTorch) 💻 + graphs + 📣 = ❤️

Extremely simple and fast extreme multi-class and multi-label classifiers.

Embracing Single Stride 3D Object Detector with Sparse Transformer

OpenMMLab Image Classification Toolbox and Benchmark

Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm,Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP(Traveling salesman)

Official Implementation for the "An Empirical Investigation of 3D Anomaly Detection and Segmentation" paper.

Implementation of Invariant Point Attention, used for coordinate refinement in the structure module of Alphafold2, as a standalone Pytorch module

ML models implementation practice

GLM (General Language Model)