Code release for "COTR: Correspondence Transformer for Matching Across Images"

Last update: Dec 24, 2022

Overview

COTR: Correspondence Transformer for Matching Across Images

This repository contains the inference code for COTR. We plan to release the training code in the future. COTR establishes correspondences in a functional, end-to-end fashion. It solves both dense and sparse correspondence problems within the same framework.

Demos

Check out our demo video here.

1. Install environment

Our implementation is based on PyTorch. Create the conda environment with:

conda env create -f environment.yml

Then activate it with:

conda activate cotr_env

Note that we use scipy=1.2.1.
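After activating the environment, a quick sanity check can confirm that the expected packages import and report their versions. The snippet below is a minimal sketch, not part of the repository: the helper name `installed_version` is hypothetical, and it only reads each module's `__version__` attribute.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported.

    Hypothetical helper for illustration; it is not part of the COTR code.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


# scipy is expected to report 1.2.1 inside the activated environment.
print("scipy:", installed_version("scipy"))
```

If this prints `None`, the environment is most likely not activated.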

2. Download the pretrained weights

Download the pretrained weights here. Extract them into ./out, so that the weights file is at ./out/default/checkpoint.pth.tar.
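Before running a demo, it can help to confirm that the archive was extracted to the expected place. The following is a small sketch, assuming the path is relative to the repository root and that the name passed to `--load_weights` matches the subfolder under `out`. The helper names `checkpoint_path` and `weights_available` are hypothetical and not part of the repository.

```python
from pathlib import Path


def checkpoint_path(out_dir="./out", weights_name="default"):
    """Build the expected checkpoint location (assumed layout, see text above)."""
    return Path(out_dir) / weights_name / "checkpoint.pth.tar"


def weights_available(out_dir="./out", weights_name="default"):
    """True if the checkpoint file exists at the expected location."""
    return checkpoint_path(out_dir, weights_name).is_file()


path = checkpoint_path()
print(path, "found" if weights_available() else "missing")
```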

3. Single image pair demo

python demo_single_pair.py --load_weights="default"

Example sparse output:

Example dense output with triangulation:

Note: This example uses 10K valid sparse correspondences to densify.
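To illustrate the idea of densifying sparse matches by triangulation, here is a minimal sketch that linearly interpolates target coordinates over a Delaunay triangulation of the source points. It uses SciPy's `LinearNDInterpolator` and is an assumption about how such a step could look, not the code used by `demo_single_pair.py`. The function name `densify` and the toy data are hypothetical.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator


def densify(src_pts, dst_pts, height, width):
    """Interpolate a dense correspondence map from sparse matches.

    src_pts, dst_pts: (N, 2) arrays of (x, y) positions in image A and image B.
    Returns an array of shape (height, width, 2) holding, for every pixel of
    image A, its interpolated (x, y) position in image B. Pixels outside the
    convex hull of src_pts are NaN.
    """
    src_pts = np.asarray(src_pts, dtype=float)
    dst_pts = np.asarray(dst_pts, dtype=float)
    # Builds a Delaunay triangulation of src_pts and interpolates linearly
    # inside each triangle.
    interp = LinearNDInterpolator(src_pts, dst_pts)
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
    return interp(grid).reshape(height, width, 2)


# Toy data: five sparse matches related by an affine map.
src = np.array([[0, 0], [4, 0], [0, 4], [4, 4], [2, 2]], dtype=float)
A = np.array([[2.0, 0.5], [-1.0, 3.0]])
t = np.array([1.0, -2.0])
dst = src @ A.T + t

dense = densify(src, dst, height=6, width=6)
print(dense[1, 3])  # interpolated match for pixel (x=3, y=1)
```

Linear interpolation reproduces an affine map exactly, which makes the toy case easy to verify. Real image pairs are not affine, so quality depends on how densely the sparse matches cover the image.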

4. Facial landmarks demo

python demo_face.py --load_weights="default"

Example:

5. Homography demo

python demo_homography.py --load_weights="default"
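For readers unfamiliar with the underlying geometry: a homography can be estimated from four or more point correspondences. The sketch below uses the standard direct linear transform (DLT) in NumPy. It is an illustration under stated assumptions, not a description of what `demo_homography.py` does internally. The function names and the synthetic matrix `H_true` are hypothetical.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate a 3x3 homography H with dst ~ H @ src via the basic DLT.

    No coordinate normalization or outlier rejection is performed, so this
    sketch is only suitable for clean, well-scaled correspondences.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need two (N, 2) arrays with N >= 4")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is not zero


def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Synthetic check: recover a known homography from exact correspondences.
H_true = np.array([[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [3, 7]], dtype=float)
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
print(np.round(H_est, 4))
```

With noisy or partially wrong matches, a robust estimator such as RANSAC wrapped around this step is the usual choice.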

Citation

If you use this code in your research, cite the paper:

@article{jiang2021cotr,
  title={{COTR: Correspondence Transformer for Matching Across Images}},
  author={Wei Jiang and Eduard Trulls and Jan Hosang and Andrea Tagliasacchi and Kwang Moo Yi},
  booktitle={arXiv preprint},
  publisher_page={https://arxiv.org/abs/2103.14167},
  year={2021}
}

Owner

UBC Computer Vision Group