[ICCV2021] Learning to Track Objects from Unlabeled Videos

Related tags

Deep LearningUSOT
Overview

Unsupervised Single Object Tracking (USOT)

🌿 Learning to Track Objects from Unlabeled Videos

Jilai Zheng, Chao Ma, Houwen Peng and Xiaokang Yang

2021 IEEE/CVF International Conference on Computer Vision (ICCV)

Introduction

This repository implements unsupervised deep tracker USOT, which learns to track objects from unlabeled videos.

Main ideas of USOT are listed as follows.

  • Coarsely discovering moving objects from videos, with pseudo boxes precise enough for bbox regression.
  • Training a naive Siamese tracker from single-frame pairs, then gradually extending it to longer temporal spans.
  • Following cycle memory training paradigm, enabling unsupervised tracker to update online.

Results

Results of USOT and USOT* on recent tracking benchmarks.

Model VOT2016
EAO
VOT2018
EAO
VOT2020
EAO
LaSOT
AUC (%)
TrackingNet
AUC (%)
OTB100
AUC (%)
USOT 0.351 0.290 0.222 33.7 59.9 58.9
USOT* 0.402 0.344 0.219 35.8 61.5 57.4

Raw result files can be found in folder result from Google Drive.

Tutorial

Environments

The environment we utilize is listed as follows.

  • Preprocessing: Pytorch 1.1.0 + CUDA-9.0 / 10.0 (following ARFlow)
  • Train / Test / Eval: Pytorch 1.7.1 + CUDA-10.0 / 10.2 / 11.1

If you have problems for preprocessing, you can actually skip it by downloading off-the-shelf preprocessed materials.

Preparations

Assume the project root path is $USOT_PATH. You can build an environment for development with the provided script, where $CONDA_PATH denotes your anaconda path.

cd $USOT_PATH
bash ./preprocessing/install_model.sh $CONDA_PATH USOT
source activate USOT && export PYTHONPATH=$(pwd)

You can revise the CUDA toolkit version for pytorch in install_model.sh (by default 10.0).

Test and Eval

First, we provide both models utilized in our paper (USOT.pth and USOT_star.pth). You can download them in folder snapshot from Google Drive, and place them in $USOT_PATH/var/snapshot.

Next, you can link your wanted benchmark dataset (e.g. VOT2018) to $USOT_PATH/datasets_test as follows. The ground truth json files for some benchmarks (e.g VOT2018.json) can be downloaded in folder test from Google Drive, and placed also in $USOT_PATH/datasets_test.

cd $USOT_PATH && mkdir datasets_test
ln -s $your_benchmark_path ./datasets_test/VOT2018

After that, you can test the tracker on these benchmarks (e.g. VOT2018) as follows. The raw results will be placed in $USOT_PATH/var/result/VOT2018/USOT.

cd $USOT_PATH
python -u ./scripts/test_usot.py --dataset VOT2018 --resume ./var/snapshot/USOT_star.pth

The inference result can be evaluated with pysot-toolkit. Install pysot-toolkit before evaluation.

cd $USOT_PATH/lib/eval_toolkit/pysot/utils
python setup.py build_ext --inplace

Then the evaluation can be conducted as follows.

cd $USOT_PATH
python ./lib/eval_toolkit/bin/eval.py --dataset_dir datasets_test \
        --dataset VOT2018 --tracker_result_dir var/result/VOT2018 --trackers USOT

Train

First, download the pretrained backbone in folder pretrain from Google Drive into $USOT_PATH/pretrain. Note that USOT* and USOT are respectively trained from imagenet_pretrain.model and moco_v2_800.model.

Second, preprocess the raw datasets with the paradigm of DP + Flow. Refer to $USOT_PATH/preprocessing/datasets_train for details.

In fact, we have provided two shortcuts for skipping this preprocessing procedure.

  • You can directly download the generated pseudo box files (e.g. got10k_flow.json) in folder train/box_sample_result from Google Drive, and place them into the corresponding dataset preprocessing path (e.g. $USOT_PATH/preprocessing/datasets_train/got10k), in order to skip the box generation procedure.
  • You can directly download the whole cropped training dataset (e.g. got10k_flow.tar) in dataset folder from Google Drive (Coming soon) (e.g. train/GOT-10k), which enables you to skip all procedures in preprocessing.

Third, revise the config file for training as $USOT_PATH/experiments/train/USOT.yaml. Very important options are listed as follows.

  • GPUS: the gpus for training, e.g. '0,1,2,3'
  • TRAIN/PRETRAIN: the pretrained backbone, e.g. 'imagenet_pretrain.model'
  • DATASET: the folder for your cropped training instances and their pseudo annotation files, e.g. PATH: '/data/got10k_flow/crop511/', ANNOTATION: '/data/got10k_flow/train.json'

Finally, you can start the training phase with the following script. The training checkpoints will also be placed automatically in $USOT_PATH/var/snapshot.

cd $USOT_PATH
python -u ./scripts/train_usot.py --cfg experiments/train/USOT.yaml --gpus 0,1,2,3 --workers 32

We also provide a onekey script for train, test and eval.

cd $USOT_PATH
python ./scripts/onekey_usot.py --cfg experiments/train/USOT.yaml

Citation

If any parts of our paper and codes are helpful to your work, please generously citing:

@inproceedings{zheng-iccv2021-usot,
   title={Learning to Track Objects from Unlabeled Videos},
   author={Jilai Zheng and Chao Ma and Houwen Peng and Xiaokang Yang},
   booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
   year={2021}
}

Reference

We refer to the following repositories when implementing our unsupervised tracker. Thanks for their great work.

Contact

Feel free to contact me if you have any questions.

Spam your friends and famly and when you do your famly will disown you and you will have no friends.

SpamBot9000 Spam your friends and family and when you do your family will disown you and you will have no friends. Terms of Use Disclaimer: Please onl

DJ15 0 Jun 09, 2022
style mixing for animation face

An implementation of StyleGAN on Animation dataset. Install git clone https://github.com/MorvanZhou/anime-StyleGAN cd anime-StyleGAN pip install -r re

Morvan 46 Nov 30, 2022
face2comics by Sxela (Alex Spirin) - face2comics datasets

This is a paired face to comics dataset, which can be used to train pix2pix or similar networks.

Alex 164 Nov 13, 2022
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data

A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque

AI-Biomed @NSCC-gz 3 May 08, 2022
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A Self-Supervised Descriptor for Image Copy Detection (SSCD) This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detecti

Meta Research 68 Jan 04, 2023
A machine learning malware analysis framework for Android apps.

🕵️ A machine learning malware analysis framework for Android apps. ☢️ DroidDetective is a Python tool for analysing Android applications (APKs) for p

James Stevenson 77 Dec 27, 2022
Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HA

Sherzod Hakimov 3 Aug 04, 2022
clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

README clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation CVPR 2021 Authors: Suprosanna Shit and Johannes C. Paetzo

110 Dec 29, 2022
Easy-to-use library to boost AI inference leveraging state-of-the-art optimization techniques.

NEW RELEASE How Nebullvm Works • Tutorials • Benchmarks • Installation • Get Started • Optimization Examples Discord | Website | LinkedIn | Twitter Ne

Nebuly 1.7k Dec 31, 2022
Introduction to AI assignment 1 HCM University of Technology, term 211

Sokoban Bot Introduction to AI assignment 1 HCM University of Technology, term 211 Abstract This is basically a solver for Sokoban game using Breadth-

Quang Minh 4 Dec 12, 2022
Image Captioning on google cloud platform based on iot

Image-Captioning-on-google-cloud-platform-based-on-iot - Image Captioning on google cloud platform based on iot

Shweta_kumawat 1 Jan 20, 2022
Explaining neural decisions contrastively to alternative decisions.

Contrastive Explanations for Model Interpretability This is the repository for the paper "Contrastive Explanations for Model Interpretability", about

AI2 16 Oct 16, 2022
Code for the paper "Reinforcement Learning as One Big Sequence Modeling Problem"

Trajectory Transformer Code release for Reinforcement Learning as One Big Sequence Modeling Problem. Installation All python dependencies are in envir

Michael Janner 269 Jan 05, 2023
Official implementation for "Style Transformer for Image Inversion and Editing" (CVPR 2022)

Style Transformer for Image Inversion and Editing (CVPR2022) https://arxiv.org/abs/2203.07932 Existing GAN inversion methods fail to provide latent co

Xueqi Hu 153 Dec 02, 2022
A tool to visualise the results of AlphaFold2 and inspect the quality of structural predictions

AlphaFold Analyser This program produces high quality visualisations of predicted structures produced by AlphaFold. These visualisations allow the use

Oliver Powell 3 Nov 13, 2022
TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios

TPH-YOLOv5 This repo is the implementation of "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured

cv516Buaa 439 Dec 22, 2022
Source code for the paper "SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text" PACLIC 2021

Adversarial text generator Refer to "adversarial_text_generator"[https://github.com/quocnsh/SEPP_generator] project for generating adversarial texts A

0 Oct 05, 2021
An open source python library for automated feature engineering

"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to

alteryx 6.4k Jan 03, 2023
PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short

77 Dec 16, 2022
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementray code for paper accepted in ICDAR 2021 We h

NCSOFT 30 Dec 21, 2022