Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

Related tags

Deep LearningPTSNet
Overview

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

By Qiang Zhou*, Zilong Huang*, Lichao Huang, Han Shen, Yongchao Gong, Chang Huang, Wenyu Liu, Xinggang Wang.(* means equal contribution)

This code is the implementation mainly for DAVIS 2017 dataset. For more detail, please refer to our paper.

Architecture


Overview of our proposed PTSNet for video object segmentation. OPN is designed for generating proposals of the interested objects and OTN aims to distinguish which one of the proposals is the best. Finally, DRSN does the final pixel level tracking(segmentation) task. Note in our implementation we couple OPN and OTN as a whole network, and spearate DRSN out under engineering consideration.

Usage

Preparation

  1. Install PyTorch 1.0 and necessary libraries like opencv, PIL etc.

  2. There are some native CUDA implementations, InPlace-ABN and MaskRCNN Operators, which must be compiled at the very start.

    # Before you compile, you need to figure out several things:
    # - The CUDA kernels supported by your GPU, here we use `sm_52`, `sm_61` and `sm_70` for NVIDIA Titan V.
    # - `cuda` and `nvcc` paths in your operating system, which exist usually in `/usr/local/cuda` and `/usr/local/cuda/bin/nvcc` respectively.
    # InPlace-ABN_0.4   (PyTorch 0.4)
    cd model/inplace_ABN_0.4
    bash build.sh
    # OR you could choose the 1.0 version of inplace ABN.
    # InPlace-ABN_1.0   (PyTorch 1.0)
    cd model/inplace_ABN    # It is dynamically compiled when running (gcc > 4.9)
    
    # MaskRCNN Operators (PyTorch 0.4)
    cd coupled_otn_opn/tracking/maskrcnn/lib
    bash make.sh
  3. You can train PTSNet from scratch or just evaluate our pretrained model.

    • Train it from scratch, you need to download:

       # DRSN: wget "https://download.pytorch.org/models/resnet50-19c8e357.pth" -O drsn/init_models/resnet50-19c8e357.pth
       # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
       # If you want to use our pretrained OTN:
       #   wget https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf, put it into `coupled_otn_opn/models/mdnet_davis_50cyche.pth`
       # Else please modify from py-MDNet(https://github.com/HyeonseobNam/py-MDNet) to train OTN on DAVIS by yourself.
    • If you want to use our pretrained model to do the evaluation, you need to download:

       # DRSN: https://drive.google.com/open?id=116yXnqX43BZ7kEgdzUhIeTSn1dbvcE2F, put it into `drsn/snapshots/drsn_yvos_10w_davis_3p5w.pth`
       # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
       # OTN: https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf, put it into `coupled_otn_opn/models/mdnet_davis_50cycle.pth`
  4. Dataset

    • YouTube-VOS: Download from YouTube-VOS, note we only need the training part(train_all_frames.zip), totally about 41G. Unzip, move and rename it to drsn/dataset/yvos.
    • DAVIS: Download from DAVIS, note we only need the 480p version(DAVIS-2017-trainval-480p.zip). Unzip, move and rename it to drsn/dataset/DAVIS/trainval and coupled_otn_opn/DAVIS/trainval. Here you need to make a subdirectory of trainval directory to store the dataset.

    And make sure to put the files as the following structure:

    .
    ├── drsn
    │   ├── dataset
    │   │   ├── DAVIS
    │   │   │   └── trainval
    │   │   │       ├── Annotations
    │   │   │       ├── ImageSets
    │   │   │       └── JPEGImages
    │   │   └── yvos
    │   │       └── train_all_frames
    │   ├── init_model
    │   │   └── resnet50-19c8e357.pth
    │   └── snapshots
    │       └── drsn_yvos_10w_davis_3p5w.pth
    └── coupled_otn_opn
        ├── DAVIS
        │   └── trainval
        ├── models
        │   └── mdnet_davis_50cycle.pth
        └── tracking
            └── maskrcnn
                └── data
                    └── X-152-32x8d-FPN-IN5k.pkl
    

Train and Evaluate

  • Firstly, check the directory of coupled_otn_opn and follow the README.md inside to generate our proposals. You can also skip this step for we have provided generated proposals in drsn/dataset/result_davis directory.
  • Secondly, enter drsn and check do_train_eval.sh to train and evaluate.
  • Finally, we also provide result masks by our PTSNet in result-masks-GoogleDrive. The quantitative results are measured by DAVIS official matlab toolbox.
J Mean F Mean G Mean
Avg 71.6 77.7 74.7

Acknowledgment

The work was mainly done during an internship at Horizon Robotics.

Citing PTSNet

If you find PTSNet useful in your research, please consider citing:

@article{ptsnet2019,
        title={Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation},
        author={Zhou, Qiang and Huang, Zilong and Huang, Lichao and Han, Shen and Gong, Yongchao and Huang, Chang and Liu, Wenyu and Wang, Xinggang},
        journal = {arXiv preprint arXiv:1907.01203v2},
        year={2019}
        }

Thanks to the Third Party Libs

Owner
Forest
If a bullet's going to get you, it has already been fired.
Forest
Code for testing convergence rates of Lipschitz learning on graphs

📈 LipschitzLearningRates The code in this repository reproduces the experimental results on convergence rates for k-nearest neighbor graph infinity L

2 Dec 20, 2021
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

Holy Wu 35 Jan 01, 2023
Tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos"

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos This repository is the official tensorflow python implementation

Yasamin Jafarian 287 Jan 06, 2023
PolyTrack: Tracking with Bounding Polygons

PolyTrack: Tracking with Bounding Polygons Abstract In this paper, we present a novel method called PolyTrack for fast multi-object tracking and segme

Gaspar Faure 13 Sep 15, 2022
Implementation of Sequence Generative Adversarial Nets with Policy Gradient

SeqGAN Requirements: Tensorflow r1.0.1 Python 2.7 CUDA 7.5+ (For GPU) Introduction Apply Generative Adversarial Nets to generating sequences of discre

Lantao Yu 2k Dec 29, 2022
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard J Koch 151 Dec 31, 2022
tensorflow code for inverse face rendering

InverseFaceRender This is tensorflow code for our project: Learning Inverse Rendering of Faces from Real-world Videos. (https://arxiv.org/abs/2003.120

Yuda Qiu 18 Nov 16, 2022
[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

mmTransformer Introduction This repo is official implementation for mmTransformer in pytorch. Currently, the core code of mmTransformer is implemented

DeciForce: Crossroads of Machine Perception and Autonomy 232 Dec 31, 2022
Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Space-Time Correspondence as a Contrastive Random Walk This is the repository for Space-Time Correspondence as a Contrastive Random Walk, published at

A. Jabri 239 Dec 27, 2022
Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network

ild-cnn This is supplementary material for the manuscript: "Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neur

22 Nov 05, 2022
Using deep learning model to detect breast cancer.

Breast-Cancer-Detection Breast cancer is the most frequent cancer among women, with around one in every 19 women at risk. The number of cases of breas

1 Feb 13, 2022
Position detection system of mobile robot in the warehouse enviroment

Autonomous-Forklift-System About | GUI | Tests | Starting | License | Author | 🎯 About An application that run the autonomous forklift paletization a

Kamil Goś 1 Nov 24, 2021
A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

Donny You 2.2k Jan 09, 2023
Dynamic Head: Unifying Object Detection Heads with Attentions

Dynamic Head: Unifying Object Detection Heads with Attentions dyhead_video.mp4 This is the official implementation of CVPR 2021 paper "Dynamic Head: U

Microsoft 550 Dec 21, 2022
Tool for live presentations using manim

manim-presentation Tool for live presentations using manim Install pip install manim-presentation opencv-python Usage Use the class Slide as your sce

Federico Galatolo 146 Jan 06, 2023
Streamlit App For Product Analysis - Streamlit App For Product Analysis

Streamlit_App_For_Product_Analysis Здравствуйте! Перед вами дашборд, позволяющий

Grigory Sirotkin 1 Jan 10, 2022
Code for our paper A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization,

FSRA This repository contains the dataset link and the code for our paper A Transformer-Based Feature Segmentation and Region Alignment Method For UAV

Dmmm 32 Dec 18, 2022
A PyTorch version of You Only Look at One-level Feature object detector

PyTorch_YOLOF A PyTorch version of You Only Look at One-level Feature object detector. The input image must be resized to have their shorter side bein

Jianhua Yang 25 Dec 30, 2022
Trafffic prediction analysis using hybrid models - Machine Learning

Hybrid Machine learning Model Clone the Repository Create a new Directory as assests and download the model from the below link Model Link To Start th

1 Feb 08, 2022
RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection

RODD Official Implementation of 2022 CVPRW Paper RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection Introduction: Recent studie

Umar Khalid 17 Oct 11, 2022