Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Related tags

Deep LearningDSRL
Overview

Dual super-resolution learning for semantic segmentation

2021-01-02 Subpixel Update

Happy new year! The 2020-12-29 update of SISR with subpixel conv performs bad in my experiment so I did some changes to it.

The former subpixel version is depreciated now. Click here to learn more. If you are using the main branch then you can just ignore this message.

2020-12-29 New branch: subpixel

  • In this new branch, SISR path changes to follow the design of Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, CVPR 2016. The main branch still uses Deconv so if you prefer the older version you can simply ignore this update.
  • I haven't run a full test on this new framework yet so I'm still not sure about it's performance on validation set. Please let me know if you find this new framework performs better. Thank you. :)

2020-12-15 Pretrained Weights Uploaded (Only for the main branch)

  • See Google Drive (Please note that you don't have to unzip this file.)
  • Use the pretrained weights by train.py --resume 'path/to/weights'

2020-10-31 Good News! I achieved an mIoU of 0.6787 in the newest experiment(the experiment is still running and the final mIoU may be even higher)!

  • So the FA module should be places after each path's final output.
  • The FTM should be 19 channel -> 3 channel
  • Hyper-Parameter fine-tuning

It's amazing that the final model converges at a extremely fast speed. Now the codes are all set, just clone this repo and run train.py!

And thanks for the reminder of @XinruiYuan, currently this repo also differs from the original paper in the architecture of SISR path. I will be working on it after finishing my homework.

2020-10-22 First commit

I implemented the framework proposed in this paper since the authors' code is still under legal scan and i just can't wait to see the results. This repo is based on Deeplab v3+ and Cityscapes, and i still have problems about the FA module.

  • so the code is runnable? yes. just run train.py directly and you can see DSRL starts training.(of course change the dataset path. See insturctions in the Deeplab v3+ part below.)

  • any difference from the paper's proposed method? Actually yes. It's mainly about the FA module. I tried several mothods such as:

    • 19 channel SSSR output -> feature transform module -> 3 channel output -> calculate FAloss with 3 channel SISR output. Result is like a disaster
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature. still disaster.
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature, but no more normalization in the FA module. Seems not bad, but still cannot surpass simple original Deeplab v3+
    • Besides, this project use a square input(default 512*512) which is cropped from the original image.
  • so my results? mIoU about 0.6669 when use the original Deeplab v3+. 0.6638 when i add the SISR path but no FA module. and about 0.62 after i added the FA module.

The result doesn't look good, but this may because of the differences of the FA module.(but why the mIoU decreased after i added the SISR path)

Currently the code doesn't use normalization in FA module. If you want to try using them, please cancel the comment of line 16,18,23,25 in 'utils/fa_loss.py'

Please imform me if you have any questions about the code.

below are discriptions about Deeplab v3+(from the original repo).


pytorch-deeplab-xception

Update on 2018/12/06. Provide model trained on VOC and SBD datasets.

Update on 2018/11/24. Release newest version code, which fix some previous issues and also add support for new backbones and multi-gpu training. For previous code, please see in previous branch

TODO

  • Support different backbones
  • Support VOC, SBD, Cityscapes and COCO datasets
  • Multi-GPU training
Backbone train/eval os mIoU in val Pretrained Model
ResNet 16/16 78.43% google drive
MobileNet 16/16 70.81% google drive
DRN 16/16 78.87% google drive

Introduction

This is a PyTorch(0.4.1) implementation of DeepLab-V3-Plus. It can use Modified Aligned Xception and ResNet as backbone. Currently, we train DeepLab V3 Plus using Pascal VOC 2012, SBD and Cityscapes datasets.

Results

Installation

The code was tested with Anaconda and Python 3.6. After installing the Anaconda environment:

  1. Clone the repo:

    git clone https://github.com/jfzhang95/pytorch-deeplab-xception.git
    cd pytorch-deeplab-xception
  2. Install dependencies:

    For PyTorch dependency, see pytorch.org for more details.

    For custom dependencies:

    pip install matplotlib pillow tensorboardX tqdm

Training

Follow steps below to train your model:

  1. Configure your dataset path in mypath.py.

  2. Input arguments: (see full input arguments via python train.py --help):

    usage: train.py [-h] [--backbone {resnet,xception,drn,mobilenet}]
                [--out-stride OUT_STRIDE] [--dataset {pascal,coco,cityscapes}]
                [--use-sbd] [--workers N] [--base-size BASE_SIZE]
                [--crop-size CROP_SIZE] [--sync-bn SYNC_BN]
                [--freeze-bn FREEZE_BN] [--loss-type {ce,focal}] [--epochs N]
                [--start_epoch N] [--batch-size N] [--test-batch-size N]
                [--use-balanced-weights] [--lr LR]
                [--lr-scheduler {poly,step,cos}] [--momentum M]
                [--weight-decay M] [--nesterov] [--no-cuda]
                [--gpu-ids GPU_IDS] [--seed S] [--resume RESUME]
                [--checkname CHECKNAME] [--ft] [--eval-interval EVAL_INTERVAL]
                [--no-val]
    
  3. To train deeplabv3+ using Pascal VOC dataset and ResNet as backbone:

    bash train_voc.sh
  4. To train deeplabv3+ using COCO dataset and ResNet as backbone:

    bash train_coco.sh

Acknowledgement

PyTorch-Encoding

Synchronized-BatchNorm-PyTorch

drn

Owner
Sam
Get yourself a cup of tea. ˊ_>ˋ旦
Sam
PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

PyG 16.5k Jan 08, 2023
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation models. It contains 17 different amateur subjects performing 30

Aiden Nibali 25 Jun 20, 2021
Advbox is a toolbox to generate adversarial examples that fool neural networks in PaddlePaddle、PyTorch、Caffe2、MxNet、Keras、TensorFlow and Advbox can benchmark the robustness of machine learning models.

Advbox is a toolbox to generate adversarial examples that fool neural networks in PaddlePaddle、PyTorch、Caffe2、MxNet、Keras、TensorFlow and Advbox can benchmark the robustness of machine learning models

AdvBox 1.3k Dec 25, 2022
Code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Biomedical Entity Linking This repo provides the code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Res

Tuan Manh Lai 24 Oct 24, 2022
The dataset of tweets pulling from Twitters with keyword: Hydroxychloroquine, location: US, Time: 2020

HCQ_Tweet_Dataset: FREE to Download. Keywords: HCQ, hydroxychloroquine, tweet, twitter, COVID-19 This dataset is associated with the paper "Understand

2 Mar 16, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Implementation of the paper "Shapley Explanation Networks"

Shapley Explanation Networks Implementation of the paper "Shapley Explanation Networks" at ICLR 2021. Note that this repo heavily uses the experimenta

68 Dec 27, 2022
(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry Official implementation of the paper Multi-View Depth Est

Bae, Gwangbin 138 Dec 28, 2022
Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19 (Oral).

Pose-Transfer Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19(Oral). The paper is available here. Video generation

Tengteng Huang 679 Jan 04, 2023
1st ranked 'driver careless behavior detection' for AI Online Competition 2021, hosted by MSIT Korea.

2021AICompetition-03 본 repo 는 mAy-I Inc. 팀으로 참가한 2021 인공지능 온라인 경진대회 중 [이미지] 운전 사고 예방을 위한 운전자 부주의 행동 검출 모델] 태스크 수행을 위한 레포지토리입니다. mAy-I 는 과학기술정보통신부가 주최하

Junhyuk Park 9 Dec 01, 2022
CLADE - Efficient Semantic Image Synthesis via Class-Adaptive Normalization (TPAMI 2021)

Efficient Semantic Image Synthesis via Class-Adaptive Normalization (Accepted by TPAMI)

tzt 49 Nov 17, 2022
The original weights of some Caffe models, ported to PyTorch.

pytorch-caffe-models This repo contains the original weights of some Caffe models, ported to PyTorch. Currently there are: GoogLeNet (Going Deeper wit

Katherine Crowson 9 Nov 04, 2022
Code for CVPR2021 paper 'Where and What? Examining Interpretable Disentangled Representations'.

PS-SC GAN This repository contains the main code for training a PS-SC GAN (a GAN implemented with the Perceptual Simplicity and Spatial Constriction c

Xinqi/Steven Zhu 40 Dec 16, 2022
DFFNet: An IoT-perceptive Dual Feature Fusion Network for General Real-time Semantic Segmentation

DFFNet Paper DFFNet: An IoT-perceptive Dual Feature Fusion Network for General Real-time Semantic Segmentation. Xiangyan Tang, Wenxuan Tu, Keqiu Li, J

4 Sep 23, 2022
PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations.

HPNet This repository contains the PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations. Installation The

Siming Yan 42 Dec 07, 2022
A system for quickly generating training data with weak supervision

Programmatically Build and Manage Training Data Announcement The Snorkel team is now focusing their efforts on Snorkel Flow, an end-to-end AI applicat

Snorkel Team 5.4k Jan 02, 2023
This repository contains the official code of the paper Equivariant Subgraph Aggregation Networks (ICLR 2022)

Equivariant Subgraph Aggregation Networks (ESAN) This repository contains the official code of the paper Equivariant Subgraph Aggregation Networks (IC

Beatrice Bevilacqua 59 Dec 13, 2022
Reporting and Visualization for Hazardous Events

Reporting and Visualization for Hazardous Events

Jv Kyle Eclarin 2 Oct 03, 2021
Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings

Text2Music Emotion Embedding Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings Reference Emotion Embedding Spaces for Matching

Minz Won 50 Dec 05, 2022