Official implementation of TMANet.

Related tags

Deep LearningTMANet
Overview

Temporal Memory Attention for Video Semantic Segmentation, arxiv

PWC PWC

Introduction

We propose a Temporal Memory Attention Network (TMANet) to adaptively integrate the long-range temporal relations over the video sequence based on the self-attention mechanism without exhaustive optical flow prediction. Our method achieves new state-of-the-art performances on two challenging video semantic segmentation datasets, particularly 80.3% mIoU on Cityscapes and 76.5% mIoU on CamVid with ResNet-50. (Accepted by ICIP2021)

If this codebase is helpful for you, please consider give me a star โญ ๐Ÿ˜Š .

image

Updates

2021/1: TMANet training and evaluation code released.

2021/6: Update README.md:

  • adding some Camvid dataset download links;
  • update 'camvid_video_process.py' script.

Usage

  • Install mmseg

    • Please refer to mmsegmentation to get installation guide.
    • This repository is based on mmseg-0.7.0 and pytorch 1.6.0.
  • Clone the repository

    git clone https://github.com/wanghao9610/TMANet.git
    cd TMANet
    pip install -e .
  • Prepare the datasets

    • Download Cityscapes dataset and Camvid dataset.

    • For Camvid dataset, we need to extract frames from downloaded videos according to the following steps:

      • Download the raw video from here, in which I provide a google drive link to download.
      • Put the downloaded raw video(e.g. 0016E5.MXF, 0006R0.MXF, 0005VD.MXF, 01TP_extract.avi) to ./data/camvid/raw .
      • Download the extracted images and labels from here and split.txt file from here, untar the tar.gz file to ./data/camvid , and we will get two subdirs "./data/camvid/images" (stores the images with annotations), and "./data/camvid/labels" (stores the ground truth for semantic segmentation). Reference the following shell command:
        cd TMANet
        cd ./data/camvid
        wget https://drive.google.com/file/d/1FcVdteDSx0iJfQYX2bxov0w_j-6J7plz/view?usp=sharing
        # or first download on your PC then upload to your server.
        tar -xf camvid.tar.gz 
      • Generate image_sequence dir frame by frame from the raw videos. Reference the following shell command:
        cd TMANet
        python tools/convert_datasets/camvid_video_process.py
    • For Cityscapes dataset, we need to request the download link of 'leftImg8bit_sequence_trainvaltest.zip' from Cityscapes dataset official webpage.

    • The converted/downloaded datasets store on ./data/camvid and ./data/cityscapes path.

      File structure of video semantic segmentation dataset is as followed.

      โ”œโ”€โ”€ data                                              โ”œโ”€โ”€ data                              
      โ”‚   โ”œโ”€โ”€ cityscapes                                    โ”‚   โ”œโ”€โ”€ camvid                        
      โ”‚   โ”‚   โ”œโ”€โ”€ gtFine                                    โ”‚   โ”‚   โ”œโ”€โ”€ images                    
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ train                                 โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{img_suffix}       
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{img_suffix}                   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{img_suffix}       
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{img_suffix}                   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{img_suffix}       
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{img_suffix}                   โ”‚   โ”‚   โ”œโ”€โ”€ annotations               
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ val                                   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ train.txt             
      โ”‚   โ”‚   โ”œโ”€โ”€ leftImg8bit                               โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ val.txt               
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ train                                 โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ test.txt              
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{seg_map_suffix}               โ”‚   โ”‚   โ”œโ”€โ”€ labels                    
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{seg_map_suffix}               โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{seg_map_suffix}   
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{seg_map_suffix}               โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{seg_map_suffix}   
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ val                                   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{seg_map_suffix}   
      โ”‚   โ”‚   โ”œโ”€โ”€ leftImg8bit_sequence                      โ”‚   โ”‚   โ”œโ”€โ”€ image_sequence            
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ train                                 โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{sequence_suffix}  
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ xxx{sequence_suffix}              โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{sequence_suffix}  
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ yyy{sequence_suffix}              โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{sequence_suffix}  
      โ”‚   โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ zzz{sequence_suffix}              
      โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ val                                   
      
  • Evaluation

    • Download the trained models for Cityscapes and Camvid. And put them on ./work_dirs/{config_file}
    • Run the following command(on Cityscapes):
    sh eval.sh configs/video/cityscapes/tmanet_r50-d8_769x769_80k_cityscapes_video.py
  • Training

    • Please download the pretrained ResNet-50 model, and put it on ./init_models .
    • Run the following command(on Cityscapes):
    sh train.sh configs/video/cityscapes/tmanet_r50-d8_769x769_80k_cityscapes_video.py

    Note: the above evaluation and training shell commands execute on Cityscapes, if you want to execute evaluation or training on Camvid, please replace the config file on the shell command with the config file of Camvid.

Citation

If you find TMANet is useful in your research, please consider citing:

@misc{wang2021temporal,
    title={Temporal Memory Attention for Video Semantic Segmentation}, 
    author={Hao Wang and Weining Wang and Jing Liu},
    year={2021},
    eprint={2102.08643},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Acknowledgement

Thanks mmsegmentation contribution to the community!

Owner
wanghao
wanghao
Point detection through multi-instance deep heatmap regression for sutures in endoscopy

Suture detection PyTorch This repo contains the reference implementation of suture detection model in PyTorch for the paper Point detection through mu

artificial intelligence in the area of cardiovascular healthcare 3 Jul 16, 2022
Image Recognition using Pytorch

PyTorch Project Template A simple and well designed structure is essential for any Deep Learning project, so after a lot practice and contributing in

Sarat Chinni 1 Nov 02, 2021
The implementation code for "DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruction"

DAGAN This is the official implementation code for DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruct

TensorLayer Community 159 Nov 22, 2022
Physics-informed Neural Operator for Learning Partial Differential Equation

PINO Physics-informed Neural Operator for Learning Partial Differential Equation Abstract: Machine learning methods have recently shown promise in sol

107 Jan 02, 2023
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

47 Dec 28, 2022
Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

The official code for the NeurIPS 2021 paper Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

13 Dec 22, 2022
PyTorch implementation of Self-supervised Contrastive Regularization for DG (SelfReg)

SelfReg PyTorch official implementation of Self-supervised Contrastive Regularization for Domain Generalization (SelfReg, https://arxiv.org/abs/2104.0

64 Dec 16, 2022
Pytorch implementation of MalConv

MalConv-Pytorch A Pytorch implementation of MalConv Desciprtion This is the implementation of MalConv proposed in Malware Detection by Eating a Whole

Alexander H. Liu 58 Oct 26, 2022
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

Auto-Seg-Loss By Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai This is the official implementation of the ICLR 2021 paper Auto

61 Dec 21, 2022
Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

Kaizhi Qian 193 Dec 30, 2022
A Pytorch implementation of CVPR 2021 paper "RSG: A Simple but Effective Module for Learning Imbalanced Datasets"

RSG: A Simple but Effective Module for Learning Imbalanced Datasets (CVPR 2021) A Pytorch implementation of our CVPR 2021 paper "RSG: A Simple but Eff

120 Dec 12, 2022
Unofficial Implementation of MLP-Mixer, Image Classification Model

MLP-Mixer Unoffical Implementation of MLP-Mixer, easy to use with terminal. Train and test easly. https://arxiv.org/abs/2105.01601 MLP-Mixer is an arc

OฤŸuzhan Ercan 6 Dec 05, 2022
Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5)

YOLOv5-GUI ๐ŸŽ‰ YOLOv5็ฎ—ๆณ•(ver.6ๅŠver.5)็š„Qt-GUIๅฎž็Žฐ ๐ŸŽ‰ Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5). ๅŸบไบŽYOLOv5็š„v5็‰ˆๆœฌๅ’Œv6็‰ˆๆœฌๅŠJavacrๅคงไฝฌ็š„UI้€ป่พ‘่ฟ›่กŒ็ผ–ๅ†™

EricFang 12 Dec 28, 2022
Object Detection with YOLOv3

Object Detection with YOLOv3 Bu projede YOLOv3-608 modeli kullanฤฑlmฤฑลŸtฤฑr. Requirements Python 3.8 OpenCV Numpy Documentation Yolo ile ilgili detaylฤฑ b

AyลŸe KonuลŸ 0 Mar 27, 2022
Performant, differentiable reinforcement learning

deluca Performant, differentiable reinforcement learning Notes This is pre-alpha software and is undergoing a number of core changes. Updates to follo

Google 114 Dec 27, 2022
A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat

Hendrik Schrรถter 292 Dec 25, 2022
Official Pytorch Code for the paper TransWeather

TransWeather Official Code for the paper TransWeather, Arxiv Tech Report 2021 Paper | Website About this repo: This repo hosts the implentation code,

Jeya Maria Jose 81 Dec 30, 2022
Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)

LUPerson-NL Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL) The repository is for our CVPR2022 paper Large-Scale

43 Dec 26, 2022
An open source implementation of CLIP.

OpenCLIP Welcome to an open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training). The goal of this repository is to enable

2.7k Dec 31, 2022
The materials used in the SaxonJS tutorial presented at Declarative Amsterdam, 2021

SaxonJS-Tutorial-2021, version 1.0.4 Last updated on 4 November, 2021. Table of contents Background Prerequisites Starting a web server Running a Java

Saxonica 11 Oct 23, 2022