
Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li

Accepted by CVPR 2021 (Oral). [Paper Link]

This repository includes a Python (PyTorch) implementation of the TrDiMP and TrSiam trackers, to appear in CVPR 2021.

Abstract

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them via a transformer architecture for robust object tracking. Different from classic usage of the transformer in natural language processing tasks, we separate its encoder and decoder into two parallel branches and carefully design them within the Siamese-like tracking pipelines. The transformer encoder promotes the target templates via attention-based feature reinforcement, which benefits the high-quality tracking model generation. The transformer decoder propagates the tracking cues from previous templates to the current frame, which facilitates the object searching process. Our transformer-assisted tracking framework is neat and trained in an end-to-end manner. With the proposed transformer, a simple Siamese matching approach is able to outperform the current top-performing trackers. By combining our transformer with the recent discriminative tracking pipeline, our method sets several new state-of-the-art records on prevalent tracking benchmarks.
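The attention step that the abstract relies on can be sketched in a few lines. The block below is a generic single-head scaled dot-product attention in plain Python, not the repository's PyTorch code. The function names and the toy vectors are hypothetical, and the actual implementation may normalize features or project keys and queries differently.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def single_head_attention(queries, keys, values):
    """Generic scaled dot-product attention (illustrative only).

    queries: n vectors of length d; keys: m vectors of length d;
    values: m vectors of a common length. Returns n attended vectors.
    """
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended = [sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))]
        outputs.append(attended)
    return outputs


# Hypothetical toy data: two template positions and one search position.
template_keys = [[1.0, 0.0], [0.0, 1.0]]
template_values = [[1.0, 0.0], [0.0, 1.0]]
search_queries = [[1.0, 0.0]]
attended = single_head_attention(search_queries, template_keys, template_values)
```

In this toy setting the search query is more similar to the first template key, so the first value vector receives the larger weight. This is the sense in which attention lets template information flow to the current frame.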

Tracking Results and Pretrained Model

Tracking results: the raw results of TrDiMP/TrSiam on 7 benchmarks including OTB, UAV, NFS, VOT2018, GOT-10k, TrackingNet, and LaSOT can be found here.

Pretrained model: please download the TrDiMP model and put it in the pytracking/networks folder.
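A quick way to confirm that the download landed in the right place is to list the checkpoint-like files in that folder. This is a minimal sketch: the helper name find_checkpoints and the accepted file extensions are assumptions, not part of the repository.

```python
from pathlib import Path

# Assumption: checkpoint files end with one of these extensions.
CHECKPOINT_SUFFIXES = (".pth", ".pth.tar")


def find_checkpoints(network_dir):
    """Return sorted names of checkpoint-like files in network_dir.

    Returns an empty list if the folder does not exist.
    """
    folder = Path(network_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir()
                  if p.is_file() and p.name.endswith(CHECKPOINT_SUFFIXES))


if __name__ == "__main__":
    # Run from the repository root.
    found = find_checkpoints("pytracking/networks")
    print(found if found else "No checkpoint files found in pytracking/networks")
```

An empty result means either the folder is missing or the model file was saved somewhere else.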

TrDiMP and TrSiam share the same model. The main difference between TrDiMP and TrSiam lies in the tracking model generation. TrSiam does not utilize the background information and simply crops the target/foreground area to generate the tracking model, which can be regarded as the initialization step of TrDiMP.

Environment Setup

Clone the Git repository.

git clone https://github.com/594422814/TransformerTrack.git

Clone the submodules.

In the repository directory, run the command:

git submodule update --init  

Install dependencies

Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here pytracking).

bash install.sh conda_install_path pytracking

This script will also download the default networks and set up the environment.

Note: The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the detailed installation instructions.

Our code is based on the PyTracking framework. For more details, please refer to PyTracking.

Training the TrDiMP/TrSiam Model

Please refer to the README in the ltr folder.

Testing the TrDiMP/TrSiam Tracker

Please refer to the README in the pytracking folder. As shown in pytracking/README.md, you can use either the PyTracking toolkit or the GOT-10k toolkit to reproduce the tracking results.

Citation

If you find this work useful for your research, please consider citing our work:

@inproceedings{Wang_2021_Transformer,
    title={Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking},
    author={Wang, Ning and Zhou, Wengang and Wang, Jie and Li, Houqiang},
    booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Acknowledgment

Our transformer-assisted tracker is based on PyTracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing this great framework.

Contact

If you have any questions, please feel free to contact [email protected]

Comments
  • Some questions on testing VOT18

    When I use GOT10K_VOT.py to test VOT18, errors occur in ./pytracking/tracker/trdimp.py; some parameters do not seem to match the VOT18 test. When I use GOT10K_GOT.py to test GOT-10k, these errors do not occur, because ./pytracking/tracker/trdimp_for_GOT.py works. Do you have a specialized trdimp_for_VOT.py for testing VOT18? Your prompt attention to my question would be highly appreciated!

    opened by yuyudiandian 4
  • Difference between TrSiam and TrDiMP

    Hi,

    Thanks for your good work!

    While reading your paper and code, I found that there is no difference between TrSiam and TrDiMP during the training phase. Does the difference between these two algorithms occur only during the tracking phase? For TrSiam, you also used the DCF during the training phase. Is this true?

    Thanks

    opened by YanWanquan 3
  • Training

    In your code, I found that you import Lasot, Got10k, TrackingNet and MSCOCOSeq. Did you use all of these datasets for training?

    In your paper, you say the batch size is 36 image pairs and there are 1500 iterations per epoch. However, 1500 × 36 = 54,000, which is much smaller than the number of images in the COCO dataset. I thought batch_size × iterations = dataset_size. I am new to the tracking field and a little confused about this. Thanks.

    opened by YanWanquan 3
  • What is your hardware for training?

    What hardware did you use for training? Super_dimp uses one TITAN X for training, and your training setting is slightly different. I wonder what the reason is.

    opened by Lightning980729 3
  • No matching checkpoint file found

    I ran: python run_tracker.py trdimp trdimp --sequence Biker

    The reported error is: "No matching checkpoint file found"

    How should I fix this, or how should I set the checkpoint file?

    Thanks a lot.

    opened by gyc9709 3
  • Some questions on testing

    I made some changes to the structure of the code, but when testing GOT or OTB, a "cuda memory is not enough" error occurs. The same problem does not happen when testing VOT18. Do you have any suggestions, or is there a way to reduce CUDA memory usage when testing a dataset? Thank you.

    opened by yuyudiandian 2
  • No matching checkpoint file found

    Hi, when I run python run_training dimp transformer_dimp, it always shows "No matching checkpoint file found". Does this model need a checkpoint file before training? If so, where can I download this file?

    When execution reaches line 52 of ltr_trainer.py (for i, data in enumerate(loader, 1)), it shows an out-of-memory error. My computer has 16 GB of memory and my GPU has 32 GB, so I think that is large enough to run the model. Can you give any suggestions about this problem? Thanks.

    opened by YanWanquan 2
  • Question about FFN

    Thanks for your work! I noticed that in the paper you state, "To achieve a good balance of speed and performance, we slim the classic transformer by omitting the fully-connected feed-forward layers and maintaining the lightweight single-head attention", and it seems that you also do not use the FFN layer in your code. Regardless of speed, how does the classic transformer including the FFN perform?

    opened by 3bobo 2
  • Error when using the GOT-10k test toolkit

    segmentation_dir: /trdimp/trdimp_vot
    Files already downloaded.
    Running tracker GOT_Tracker on VOT...
    Running supervised experiment...
    --Sequence 1/60: ants1
     Repetition: 1
    Traceback (most recent call last):
      File "GOT10k_VOT.py", line 45, in
        experiment.run(tracker, visualize=False)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 71, in run
        self.run_supervised(tracker, visualize)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 126, in run_supervised
        tracker.init(frame, anno_rects[0])
      File "GOT10k_VOT.py", line 32, in init
        self.tracker.initialize(image, box)
      File "../pytracking/tracker/trdimp/trdimp.py", line 55, in initialize
        state = info['init_bbox']
    IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

    This looks like an error caused by a version mismatch. Has the author run into this, or do you have any ideas on how to fix it?

    opened by 1145284121 1
  • Training and testing run without errors, but the program stops

    Hello, I am using your code with my own data for training and testing. During training or testing, the following output appears. No error is reported, but the program stops running.

    /home/miniconda3/envs/pytracking/bin/python /home/TransformerTrack-main/pytracking/run_tracker.py trdimp trsiam --dataset eotb --sequence val
    Evaluating    1 trackers on    31 sequences
    Tracker: trdimp trsiam None ,  Sequence: test_seq_474
    /home/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/functional.py:3060: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      warnings.warn("Default upsampling behavior when mode={} is changed "
    Using /home/.cache/torch_extensions as PyTorch extensions root...

    Because of how the DiMP code works, pressing Ctrl+C during training makes the code restart, after which it runs normally. This does not happen during testing. What could be the reason?

    opened by Jee-King 1
  • Some questions about the training settings

    Some questions about your baseline: it seems that you use the same parameters as SuperDiMP rather than DiMP. Built on SuperDiMP, the method does achieve the performance described in your paper, but the performance built on DiMP is doubtful. That said, your paper is developed from the perspective of temporal features, which is novel and great.

    opened by yxxxqqq 1
  • Question about source code of Transformer part

    As illustrated in the paper, the encoder is used only to enhance the template feature (train_feat in dimp), and the decoder is used to produce the decoded search feature. But in the transformer's forward function, the decoder is also applied to train_feat, which is not described in the paper. Could you please explain why?

    opened by ChuzzZz 0
  • Linear transformation for key and query

    I noticed that the linear transformations used to reduce the dimension of the key and query share the same weights. Is this for computational reasons, or does using different weights degrade performance?

    opened by ChuzzZz 0
  • Problems in Training process

    Hi, when I run python run_training dimp transformer_dimp, it shows "No matching checkpoint file found" and "ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])". The batch_size is set to 6. I don't know why it changes to 1 during training. Do you have any solutions to this problem? Thanks very much!

    The error message reported during the training process is shown below.

    Training: dimp transformer_dimp
    No matching checkpoint file found
    Using /tmp/torch_extensions as PyTorch extensions root...
    Using /tmp/torch_extensions as PyTorch extensions root...
    Detected CUDA files, patching ldflags
    Emitting ninja build file /tmp/torch_extensions/_prroi_pooling/build.ninja...
    Building extension module _prroi_pooling...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    Using /tmp/torch_extensions as PyTorch extensions root...
    No modifications detected for re-loaded extension module _prroi_pooling, skipping build step...
    Loading extension module _prroi_pooling...
    ninja: no work to do.
    Loading extension module _prroi_pooling...
    Loading extension module _prroi_pooling...
    Training crashed at epoch 1
    Traceback for the error!
    Traceback (most recent call last):
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/base_trainer.py", line 70, in train
        self.train_epoch()
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
        self.cycle_dataset(loader)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset
        loss, stats = self.actor(data)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/actors/tracking.py", line 97, in __call__
        test_proposals=data['test_proposals'])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
        output.reraise()
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in replica 0 on device 0.
    Original Traceback (most recent call last):
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
        output = module(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/tracking/dimpnet.py", line 75, in forward
        iou_pred = self.bb_regressor(train_feat_iou, test_feat_iou, train_bb, test_proposals)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 86, in forward
        modulation = self.get_modulation(feat1, bb1)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 162, in get_modulation
        fc3_r = self.fc3_1r(roi3r)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
        input = module(input)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward
        exponential_average_factor, self.eps)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1919, in batch_norm
        _verify_batch_size(input.size())
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1902, in _verify_batch_size
        raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
    ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

    opened by Chenlulu1993 2
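The ValueError in the last report above is raised by batch normalization in training mode when a replica receives only one sample. The report does not confirm the cause. Two possibilities are that DataParallel splits the batch of 6 across several GPUs, or that the final batch of an epoch is incomplete. The sketch below shows the arithmetic for both, assuming the split follows torch.chunk semantics along the batch dimension. The helper names are hypothetical.

```python
import math


def replica_batch_sizes(batch_size, num_gpus):
    """Per-replica batch sizes, assuming torch.chunk-style splitting."""
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= chunk
    return sizes


def last_batch_size(num_samples, batch_size):
    """Size of the final batch in an epoch when incomplete batches are kept."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


# Batch size 6, as in the report, on a varying number of GPUs.
for gpus in (1, 2, 4, 6):
    print(gpus, "GPU(s):", replica_batch_sizes(6, gpus))
```

If either calculation yields a size of 1, options include using a larger batch size, restricting the visible GPUs, or dropping the incomplete last batch in the data loader.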
Owner
NingWang
PhD student in University of Science and Technology of China (USTC)