Official implementation of CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

Last update: Jan 04, 2023

Overview

CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

For more information, check out the paper on [arXiv].

Training with different backbones, and the corresponding evaluations, will be added soon.

Check out our new paper! [arXiv]

Network

Our model CATs is illustrated below:

Environment Settings

git clone https://github.com/SunghwanHong/CATs
cd CATs

conda create -n CATs python=3.6
conda activate CATs

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -U scikit-image
pip install git+https://github.com/albumentations-team/albumentations
pip install tensorboardX termcolor timm tqdm requests pandas
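After installing, a quick sanity check can confirm that the packages listed above are importable. The following is only a sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (for example `skimage` for scikit-image) are assumed from the packages' usual conventions.

```python
import importlib.util

# Import names assumed for the packages installed above
# (scikit-image is imported as "skimage").
REQUIRED = [
    "torch", "torchvision", "torchaudio", "skimage", "albumentations",
    "tensorboardX", "termcolor", "timm", "tqdm", "requests", "pandas",
]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules are importable.")
```

Run it inside the activated CATs conda environment; any name it reports can be installed with the matching pip command above.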

Evaluation

Download the pre-trained weights from Link.
All datasets are downloaded automatically into the directory specified by the datapath argument.

Result on SPair-71k: (PCK 49.9%)

  python test.py --pretrained "/path_to_pretrained_model/spair" --benchmark spair

Result on SPair-71k, feature backbone frozen: (PCK 42.4%)

  python test.py --pretrained "/path_to_pretrained_model/spair_frozen" --benchmark spair

Results on PF-PASCAL: (PCK 75.4%, 92.6%, 96.4%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal" --benchmark pfpascal

Results on PF-PASCAL, feature backbone frozen: (PCK 67.5%, 89.1%, 94.9%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal_frozen" --benchmark pfpascal
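
The results above are reported as PCK (percentage of correct keypoints): a transferred keypoint counts as correct when it lies within a threshold distance of its ground-truth location. The following is a minimal sketch of that metric, assuming the threshold is `alpha` times a reference size (such as the longer side of the image or object bounding box). The function name `pck` and the toy coordinates are hypothetical and are not taken from test.py.

```python
import math


def pck(pred, gt, ref_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * ref_size of ground truth.

    pred, gt: equal-length lists of (x, y) tuples.
    ref_size: reference length in pixels (an assumption of this sketch).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of keypoints")
    if not gt:
        return 0.0
    threshold = alpha * ref_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Toy example: three keypoints, reference size of 200 pixels.
pred = [(10.0, 10.0), (50.0, 52.0), (90.0, 120.0)]
gt = [(12.0, 11.0), (50.0, 50.0), (90.0, 90.0)]
score = pck(pred, gt, ref_size=200, alpha=0.1)
print("PCK@0.1 = {:.1f}%".format(100 * score))
```

With a threshold of 20 pixels, the first two keypoints are within range and the third (30 pixels off) is not, so two of three keypoints count as correct.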

Acknowledgement

We borrow code from public projects, mainly DHPF and GLU-Net (huge thanks to all of them).

BibTeX

If you find this research useful, please consider citing:

@inproceedings{cho2021cats,
  title={CATs: Cost Aggregation Transformers for Visual Correspondence},
  author={Cho, Seokju and Hong, Sunghwan and Jeon, Sangryul and Lee, Yunsung and Sohn, Kwanghoon and Kim, Seungryong},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}

Owner

Sunghwan Hong
