Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

This repository is the official implementation of CARE.

Updates

  • (09/10/2021) Our paper was accepted to NeurIPS 2021.

Requirements

To install requirements:

conda create -n care python=3.6
conda activate care
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
pip install tensorboard
pip install ipdb
pip install einops
pip install loguru
pip install pyarrow==3.0.0
pip install tqdm

📋 PyTorch>=1.6 is needed for running the code.
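
The version requirement above can be checked before launching any script. The following is a minimal sketch and not part of the CARE code base: the helper names `parse_major_minor` and `meets_minimum` are hypothetical, and the code assumes the version string starts with `major.minor` (as in `1.7.1` or `1.7.1+cu101`).

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.7.1+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=(1, 6)):
    """Return True if `version` is at least `minimum` (default: 1.6)."""
    # Tuples compare lexicographically, so (1, 10) >= (1, 6) is True.
    return parse_major_minor(version) >= tuple(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "too old"
        print("PyTorch {}: {}".format(torch.__version__, status))
```

Comparing integer tuples rather than raw strings avoids the pitfall where `"1.10"` sorts before `"1.6"` as text.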

Data Preparation

Prepare the ImageNet data in {data_path}/train.lmdb and {data_path}/val.lmdb.

Replace the original data path in care/data/dataset_lmdb (lines 7 and 40) with your new {data_path}.

📋 Note that we use LMDB files to speed up data processing.
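
Before editing the data path, it can help to confirm that the expected layout is in place. The sketch below is an illustration under stated assumptions, not code from the repository: the function name `missing_lmdb_entries` is hypothetical, and it only checks that `train.lmdb` and `val.lmdb` exist under the given folder, without opening them.

```python
import os

# File names taken from the data preparation step above.
EXPECTED_LMDB_NAMES = ("train.lmdb", "val.lmdb")


def missing_lmdb_entries(data_path):
    """Return the expected LMDB paths that do not exist under `data_path`."""
    missing = []
    for name in EXPECTED_LMDB_NAMES:
        candidate = os.path.join(data_path, name)
        # An LMDB database may be stored as a single file or as a directory,
        # so only existence is checked here.
        if not os.path.exists(candidate):
            missing.append(candidate)
    return missing
```

An empty list from `missing_lmdb_entries("/path/to/data")` means both entries are present; otherwise the returned paths show what still needs to be prepared.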

Training

Before training the ResNet-50 (100 epochs) model from the paper, first add the code directories to your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:{your_code_path}/care/
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care/care/
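
When the scripts are launched from Python rather than from an interactive shell, the same two directories can be added to the environment of the child process. This is a minimal sketch assuming the directory layout implied by the export lines above; `care_pythonpath` and `care_env` are hypothetical helper names, not part of the repository.

```python
import os


def care_pythonpath(code_path, existing=""):
    """Build a PYTHONPATH value containing the two CARE code directories."""
    entries = [p for p in existing.split(os.pathsep) if p]
    for sub in ("care", os.path.join("care", "care")):
        candidate = os.path.join(code_path, sub)
        # Unlike a plain `export`, skip directories that are already listed.
        if candidate not in entries:
            entries.append(candidate)
    return os.pathsep.join(entries)


def care_env(code_path, base_env=None):
    """Return a copy of the environment with PYTHONPATH extended."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONPATH"] = care_pythonpath(code_path, env.get("PYTHONPATH", ""))
    return env
```

The result can be passed to a launcher, for example `subprocess.run(["bash", "run_train.sh"], env=care_env("/path/to/code"), check=True)`, where `/path/to/code` stands in for {your_code_path}.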

Then run the training code via:

bash run_train.sh           # trains CARE with 8 GPUs
bash single_gpu_train.sh    # trains CARE with only one GPU

📋 The training script performs unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine:

  1. Use -b to specify the batch size, e.g., -b 128.
  2. Use -d to specify the GPU IDs for training, e.g., -d 0-7.
  3. Use --log_path to specify the main folder for saving experimental results.
  4. Use --experiment-name to specify the folder for saving training outputs.
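
To make the flags listed above concrete, here is a minimal sketch of how such options could be parsed with `argparse`. It is not the repository's actual parser: the long option names `--batch-size` and `--devices`, the default values, and the helper `parse_gpu_ids` are assumptions made for illustration. Only `-b`, `-d`, `--log_path`, and `--experiment-name` come from the text.

```python
import argparse


def parse_gpu_ids(spec):
    """Expand a GPU spec such as '0-7' or '3' into a list of integer IDs."""
    if "-" in spec:
        first, last = spec.split("-", 1)
        start, end = int(first), int(last)
        if end < start:
            raise argparse.ArgumentTypeError(
                "invalid GPU range: {!r}".format(spec)
            )
        return list(range(start, end + 1))
    return [int(spec)]


def build_parser():
    """Build a parser mirroring the four flags described above."""
    parser = argparse.ArgumentParser(description="Hypothetical CARE-style flags")
    # Defaults below are placeholders, not values taken from the repository.
    parser.add_argument("-b", "--batch-size", type=int, default=128)
    parser.add_argument("-d", "--devices", type=parse_gpu_ids, default=[0])
    parser.add_argument("--log_path", default="./logs")
    parser.add_argument("--experiment-name", default="care_experiment")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-b", "128", "-d", "0-7", "--log_path", "./logs",
         "--experiment-name", "r50_100ep"]
    )
    print(args.batch_size, args.devices, args.log_path, args.experiment_name)
```

Under these assumptions, `-d 0-7` expands to the eight GPU IDs 0 through 7, which matches the 8-GPU setup described for the training script.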

The code base also supports training other backbones (e.g., ResNet101 and ResNet152) with different training schedules (e.g., 200, 400, and 800 epochs).

Evaluation

Before starting the evaluation, first add the code directories to your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:{your_code_path}/care/
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care/care/

Then, to evaluate the pre-trained model (e.g., ResNet50-100epoch) on ImageNet, run:

bash run_val.sh      # evaluates CARE with 8 GPUs
bash debug_val.sh    # evaluates CARE with only one GPU

📋 The evaluation script performs supervised linear evaluation of a ResNet-50 model on ImageNet on an 8-GPU machine:

  1. Use -b to specify the batch size, e.g., -b 128.
  2. Use -d to specify the GPU IDs for evaluation, e.g., -d 0-7.
  3. Modify --log_path according to your own configuration.
  4. Modify --experiment-name according to your own configuration.

Pre-trained Models

We provide some pre-trained models in the [shared folder]. Here are some examples:

  • [ResNet-50 100epoch] trained on ImageNet using ResNet-50 with 100 epochs.
  • [ResNet-50 200epoch] trained on ImageNet using ResNet-50 with 200 epochs.
  • [ResNet-50 400epoch] trained on ImageNet using ResNet-50 with 400 epochs.

More models are listed in the Model Zoo section below.

📋 We will provide more pre-trained models in the future.

Model Zoo

Our model achieves the following performance:

Self-supervised learning on image classification.

| Method | Backbone | Epochs | Top-1 | Top-5 | Pretrained model | Linear evaluation model |
|---|---|---|---|---|---|---|
| CARE | ResNet50 | 100 | 72.02% | 90.02% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50 | 200 | 73.78% | 91.50% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50 | 400 | 74.68% | 91.97% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50 | 800 | 75.56% | 92.32% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50(2x) | 100 | 73.51% | 91.66% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50(2x) | 200 | 75.00% | 92.22% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50(2x) | 400 | 76.48% | 92.99% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet50(2x) | 800 | 77.04% | 93.22% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet101 | 100 | 73.54% | 91.63% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet101 | 200 | 75.89% | 92.70% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet101 | 400 | 76.85% | 93.31% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet101 | 800 | 77.23% | 93.52% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet152 | 100 | 74.59% | 92.09% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet152 | 200 | 76.58% | 93.63% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet152 | 400 | 77.40% | 93.63% | [pretrained] (wip) | [linear_model] (wip) |
| CARE | ResNet152 | 800 | 78.11% | 93.81% | [pretrained] (wip) | [linear_model] (wip) |
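
For readers unfamiliar with the Top-1 and Top-5 columns, the sketch below shows the standard definition of top-k accuracy in plain Python. It is an illustration only and assumes the usual convention (a sample counts as correct when its true label is among the k highest-scoring classes); the function name `topk_accuracy` is hypothetical and the repository's own evaluation code may differ.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    `scores` is a list of per-class score lists, `labels` a list of class indices.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score for this sample.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    toy_labels = [1, 1, 0]
    print(topk_accuracy(toy_scores, toy_labels, k=1))
    print(topk_accuracy(toy_scores, toy_labels, k=2))
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the table entries.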

Transfer learning to object detection and semantic segmentation.

COCO det

| Method | Backbone | Epochs | AP_bb | AP_50 | AP_75 | Pretrained model | Det/seg model |
|---|---|---|---|---|---|---|---|
| CARE | ResNet50 | 200 | 39.4 | 59.2 | 42.6 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50 | 400 | 39.6 | 59.4 | 42.9 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50-FPN | 200 | 39.5 | 60.2 | 43.1 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50-FPN | 400 | 39.8 | 60.5 | 43.5 | [pretrained] (wip) | [model] (wip) |
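
Assuming the table follows the usual COCO convention, AP_50 and AP_75 count a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5 or 0.75, respectively. The sketch below shows only that overlap test, not the full average-precision computation; `box_iou` and `is_match` are hypothetical names, and boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """True if the predicted box overlaps the ground truth by at least `threshold`."""
    return box_iou(pred_box, gt_box) >= threshold
```

A prediction covering half of a ground-truth box, such as `(0, 0, 4, 2)` against `(0, 0, 4, 4)`, passes the 0.5 threshold but not the 0.75 one, which is why AP_75 is the stricter of the two columns.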

COCO instance seg

| Method | Backbone | Epochs | AP_mk | AP_50 | AP_75 | Pretrained model | Det/seg model |
|---|---|---|---|---|---|---|---|
| CARE | ResNet50 | 200 | 34.6 | 56.1 | 36.8 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50 | 400 | 34.7 | 56.1 | 36.9 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50-FPN | 200 | 35.9 | 57.2 | 38.5 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50-FPN | 400 | 36.2 | 57.4 | 38.8 | [pretrained] (wip) | [model] (wip) |

VOC07+12 det

| Method | Backbone | Epochs | AP_bb | AP_50 | AP_75 | Pretrained model | Det/seg model |
|---|---|---|---|---|---|---|---|
| CARE | ResNet50 | 200 | 57.7 | 83.0 | 64.5 | [pretrained] (wip) | [model] (wip) |
| CARE | ResNet50 | 400 | 57.9 | 83.0 | 64.7 | [pretrained] (wip) | [model] (wip) |

📋 More results are provided in the paper.

Contributing

📋 WIP

Owner
ChongjianGE