PSANet: Point-wise Spatial Attention Network for Scene Parsing, ECCV2018.

Related tags

Deep LearningPSANet
Overview

PSANet: Point-wise Spatial Attention Network for Scene Parsing (in construction)

by Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia, details are in project page.

Introduction

This repository is build for PSANet, which contains source code for PSA module and related evaluation code. For installation, please merge the related layers and follow the description in PSPNet repository (test with CUDA 7.0/7.5 + cuDNN v4).

PyTorch Version

Highly optimized PyTorch codebases available for semantic segmentation in repo: semseg, including full training and testing codes for PSPNet and PSANet.

Usage

  1. Clone the repository recursively:

    git clone --recursive https://github.com/hszhao/PSANet.git
  2. Merge the caffe layers into PSPNet repository:

    Point-wise spatial attention: pointwise_spatial_attention_layer.hpp/cpp/cu and caffe.proto.

  3. Build Caffe and matcaffe:

    cd $PSANET_ROOT/PSPNet
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
    cd ..
  4. Evaluation:

    • Evaluation code is in folder 'evaluation'.

    • Download trained models and put them in related dataset folder under 'evaluation/model', refer 'README.md'.

    • Modify the related paths in 'eval_all.m':

      Mainly variables 'data_root' and 'eval_list', and your image list for evaluation should be similarity to that in folder 'evaluation/samplelist' if you use this evaluation code structure.

    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  5. Results:

    Predictions will show in folder 'evaluation/mc_result' and the expected scores are listed as below:

    (mIoU/pAcc. stands for mean IoU and pixel accuracy, 'ss' and 'ms' denote single scale and multiple scale testing.)

    ADE20K:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 train val 41.92/80.17 42.97/80.92 a8e884
    PSANet101 train val 42.75/80.71 43.77/81.51 ab5e56

    VOC2012:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 train_aug val 77.24/94.88 78.14/95.12 d5fc37
    PSANet101 train_aug val 78.51/95.18 79.77/95.43 5d8c0f
    PSANet101 COCO + train_aug + val test -/- 85.7/- 3c6a69

    Cityscapes:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 fine_train fine_val 76.65/95.99 77.79/96.24 25c06a
    PSANet101 fine_train fine_val 77.94/96.10 79.05/96.30 3ac1bf
    PSANet101 fine_train fine_test -/- 78.6/- 3ac1bf
    PSANet101 fine_train + fine_val fine_test -/- 80.1/- 1dfc91
  6. Demo video:

    • Video processed by PSANet (with PSPNet) on BDD dataset for drivable area segmentation: Video.

Citation

If PSANet is useful for your research, please consider citing:

@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Questions

Please contact '[email protected]' or '[email protected]'

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementray code for paper accepted in ICDAR 2021 We h

NCSOFT 30 Dec 21, 2022
GNEE - GAT Neural Event Embeddings

GNEE - GAT Neural Event Embeddings This repository contains source code for the GNEE (GAT Neural Event Embeddings) method introduced in the paper: "Se

João Pedro Rodrigues Mattos 0 Sep 15, 2021
Implementation of the state of the art beat-detection, downbeat-detection and tempo-estimation model

The ISMIR 2020 Beat Detection, Downbeat Detection and Tempo Estimation Model Implementation. This is an implementation in TensorFlow to implement the

Koen van den Brink 1 Nov 12, 2021
Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language

Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language This repository contains the code, model, and deployment config

16 Oct 23, 2022
Object tracking implemented with YOLOv4, DeepSort, and TensorFlow.

Object tracking implemented with YOLOv4, DeepSort, and TensorFlow. YOLOv4 is a state of the art algorithm that uses deep convolutional neural networks to perform object detections. We can take the ou

The AI Guy 1.1k Dec 29, 2022
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for

Zekeriyya Demirci 1 Jan 09, 2022
Simple and understandable swin-transformer OCR project

swin-transformer-ocr ocr with swin-transformer Overview Simple and understandable swin-transformer OCR project. The model in this repository heavily r

Ha YongWook 67 Dec 31, 2022
[ICCV 2021 Oral] Deep Evidential Action Recognition

DEAR (Deep Evidential Action Recognition) Project | Paper & Supp Wentao Bao, Qi Yu, Yu Kong International Conference on Computer Vision (ICCV Oral), 2

Wentao Bao 80 Jan 03, 2023
Find the Heart simple Python Game

This is a simple Python game for finding a heart emoji. There is a 3 x 3 matrix in which a heart emoji resides. The location of the heart is randomized and is not revealed. The player must guess the

p.katekomol 1 Jan 24, 2022
Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination

Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination Pratul P. Srinivasan, Ben Mildenhall, Matthew Tancik, Jonathan T. Barron,

Pratul Srinivasan 65 Dec 14, 2022
Pytorch Implementation of "Desigining Network Design Spaces", Radosavovic et al. CVPR 2020.

RegNet Pytorch Implementation of "Desigining Network Design Spaces", Radosavovic et al. CVPR 2020. Paper | Official Implementation RegNet offer a very

Vishal R 2 Feb 11, 2022
Can we do Customers Segmentation using PHP and Unsupervized Machine Learning ? Yes we can ! 🤡

Customers Segmentation using PHP and Rubix ML PHP Library Can we do Customers Segmentation using PHP and Unsupervized Machine Learning ? Yes we can !

Mickaël Andrieu 11 Oct 08, 2022
Source code for Zalo AI 2021 submission

zalo_ltr_2021 Source code for Zalo AI 2021 submission Solution: Pipeline We use the pipepline in the picture below: Our pipeline is combination of BM2

128 Dec 27, 2022
Implementation of RegretNet with Pytorch

Dependencies are Python 3, a recent PyTorch, numpy/scipy, tqdm, future and tensorboard. Plotting with Matplotlib. Implementation of the neural network

Horris zhGu 1 Nov 05, 2021
Official implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis https://arxiv.org/abs/2011.13775

CIPS -- Official Pytorch Implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis Requirements pip install -r requi

Multimodal Lab @ Samsung AI Center Moscow 201 Dec 21, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.2k Jan 09, 2023
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

FastPitchFormant - PyTorch Implementation PyTorch Implementation of FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis. Qu

Keon Lee 63 Jan 02, 2023
Fewshot-face-translation-GAN - Generative adversarial networks integrating modules from FUNIT and SPADE for face-swapping.

Few-shot face translation A GAN based approach for one model to swap them all. The table below shows our priliminary face-swapping results requiring o

768 Dec 24, 2022
Python with OpenCV - MediaPip Framework Hand Detection

Python HandDetection Python with OpenCV - MediaPip Framework Hand Detection Explore the docs » Contact Me About The Project It is a Computer vision pa

2 Jan 07, 2022