RTSeg: Real-time Semantic Segmentation Comparative Study

Overview

Real-time Semantic Segmentation Comparative Study

The repository contains the official TensorFlow code used in our papers:

Description

Semantic segmentation benefits robotics related applications especially autonomous driving. Most of the research on semantic segmentation is only on increasing the accuracy of segmentation models with little attention to computationally efficient solutions. The few work conducted in this direction does not provide principled methods to evaluate the     different design choices for segmentation. In RTSeg, we address this gap by presenting a real-time semantic segmentation benchmarking framework with a decoupled design for feature extraction and decoding methods. The code and the experimental results are presented on the CityScapes dataset for urban scenes.



Models

Encoder Skip U-Net DilationV1 DilationV2
VGG-16 Yes Yes Yes No
ResNet-18 Yes Yes Yes No
MobileNet Yes Yes Yes Yes
ShuffleNet Yes Yes Yes Yes

NOTE: The rest of the pretrained weights for all the implemented models will be released soon. Stay in touch for the updates.

Reported Results

Test Set

Model GFLOPs Class IoU Class iIoU Category IoU Category iIoU
SegNet 286.03 56.1 34.2 79.8 66.4
ENet 3.83 58.3 24.4 80.4 64.0
DeepLab - 70.4 42.6 86.4 67.7
SkipNet-VGG16 - 65.3 41.7 85.7 70.1
ShuffleSeg 2.0 58.3 32.4 80.2 62.2
SkipNet-MobileNet 6.2 61.5 35.2 82.0 63.0

Validation Set

Encoder Decoder Coarse mIoU
MobileNet SkipNet No 61.3
ShuffleNet SkipNet No 55.5
ResNet-18 UNet No 57.9
MobileNet UNet No 61.0
ShuffleNet UNet No 57.0
MobileNet Dilation No 57.8
ShuffleNet Dilation No 53.9
MobileNet SkipNet Yes 62.4
ShuffleNet SkipNet Yes 59.3

** GFLOPs is computed on image resolution 360x640. However, the mIOU(s) are computed on the official image resolution required by CityScapes evaluation script 1024x2048.**

** Regarding Inference time, issue is reported here. We were not able to outperform the reported inference time from ENet architecture it could be due to discrepencies in the optimization we perform. People are welcome to improve on the optimization method we're using.

Usage

  1. Download the weights, processed data, and trained meta graphs from here
  2. Extract pretrained_weights.zip
  3. Extract full_cityscapes_res.zip under data/
  4. Extract unet_resnet18.zip under experiments/

Run

The file named run.sh provide a good example for running different architectures. Have a look at this file.

Examples to the running command in run.sh file:

python3 main.py --load_config=[config_file_name].yaml [train/test] [Trainer Class Name] [Model Class Name]
  • Remove comment from run.sh for running fcn8s_mobilenet on the validation set of cityscapes to get its mIoU. Our framework evaluation will produce results lower than the cityscapes evaluation script by small difference, for the final evaluation we use the cityscapes evaluation script. UNet ResNet18 should have 56% on validation set, but with cityscapes script we got 57.9%. The results on the test set for SkipNet-MobileNet and SkipNet-ShuffleNet are publicly available on the Cityscapes Benchmark.
python3 main.py --load_config=unet_resnet18_test.yaml test Train LinkNET
  • To measure running time, run in inference mode.
python3 main.py --load_config=unet_resnet18_test.yaml inference Train LinkNET
  • To run on different dataset or model, take one of the configuration files such as: config/experiments_config/unet_resnet18_test.yaml and modify it or create another .yaml configuration file depending on your needs.

NOTE: The current code does not contain the optimized code for measuring inference time, the final code will be released soon.

Main Dependencies

Python 3 and above
tensorflow 1.3.0/1.4.0
numpy 1.13.1
tqdm 4.15.0
matplotlib 2.0.2
pillow 4.2.1
PyYAML 3.12

All Dependencies

pip install -r [requirements_gpu.txt] or [requirements.txt]

Citation

If you find RTSeg useful in your research, please consider citing our work:

@ARTICLE{2018arXiv180302758S,
   author = {{Siam}, M. and {Gamal}, M. and {Abdel-Razek}, M. and {Yogamani}, S. and
    {Jagersand}, M.},
    title = "{RTSeg: Real-time Semantic Segmentation Comparative Study}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1803.02758},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2018,
    month = mar,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180302758S},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

If you find ShuffleSeg useful in your research, please consider citing it as well:

@ARTICLE{2018arXiv180303816G,
   author = {{Gamal}, M. and {Siam}, M. and {Abdel-Razek}, M.},
    title = "{ShuffleSeg: Real-time Semantic Segmentation Network}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1803.03816},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2018,
    month = mar,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180303816G},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Related Project

Real-time Motion Segmentation using 2-stream shuffleseg Code

Owner
Mennatullah Siam
PhD Student
Mennatullah Siam
Automatic Calibration for Non-repetitive Scanning Solid-State LiDAR and Camera Systems

ACSC Automatic extrinsic calibration for non-repetitive scanning solid-state LiDAR and camera systems. System Architecture 1. Dependency Tested with U

KINO 192 Dec 13, 2022
Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations

Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations Trevor Ablett, Daniel (Yifan) Zhai, Jonatha

STARS Laboratory 3 Feb 01, 2022
[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

Yi Wei 369 Dec 24, 2022
Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

Implementation for Iso-Points (CVPR 2021) Official code for paper Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations paper |

Yifan Wang 66 Nov 08, 2022
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

HAWQ: Hessian AWare Quantization HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform

Zhen Dong 293 Dec 30, 2022
This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

DB-AIAT: A Dual-branch attention-in-attention transformer for single-channel SE

Guochen Yu 68 Dec 16, 2022
Cmsc11 arcade - Final Project for CMSC11

cmsc11_arcade Final Project for CMSC11 Developers: Limson, Mark Vincent Peñafiel

Gregory 1 Jan 18, 2022
Blender Python - Node-based multi-line text and image flowchart

MindMapper v0.8 Node-based text and image flowchart for Blender Mindmap with shortcuts visible: Mindmap with shortcuts hidden: Notes This was requeste

SpectralVectors 58 Oct 08, 2022
Betafold - AlphaFold with tunings

BetaFold We (hegelab.org) craeted this standalone AlphaFold (AlphaFold-Multimer,

2 Aug 11, 2022
[NeurIPS 2021] Galerkin Transformer: a linear attention without softmax

[NeurIPS 2021] Galerkin Transformer: linear attention without softmax Summary A non-numerical analyst oriented explanation on Toward Data Science abou

Shuhao Cao 159 Dec 20, 2022
Anomaly Localization in Model Gradients Under Backdoor Attacks Against Federated Learning

Federated_Learning This repo provides a federated learning framework that allows to carry out backdoor attacks under varying conditions. This is a ker

Arçelik ARGE Açık Kaynak Yazılım Organizasyonu 0 Nov 30, 2021
Learning-Augmented Dynamic Power Management

Learning-Augmented Dynamic Power Management This repository contains source code accompanying paper Learning-Augmented Dynamic Power Management with M

Adam 0 Feb 22, 2022
Neural Contours: Learning to Draw Lines from 3D Shapes (CVPR2020)

Neural Contours: Learning to Draw Lines from 3D Shapes This repository contains the PyTorch implementation for CVPR 2020 Paper "Neural Contours: Learn

93 Dec 16, 2022
A curated list of awesome neural radiance fields papers

Awesome Neural Radiance Fields A curated list of awesome neural radiance fields papers, inspired by awesome-computer-vision. How to submit a pull requ

Yen-Chen Lin 3.9k Dec 27, 2022
The code uses SegFormer for Semantic Segmentation on Drone Dataset.

SegFormer_Segmentation The code uses SegFormer for Semantic Segmentation on Drone Dataset. The details for the SegFormer can be obtained from the foll

Dr. Sander Ali Khowaja 1 May 08, 2022
Convert Python 3 code to CUDA code.

Py2CUDA Convert python code to CUDA. Usage To convert a python file say named py_file.py to CUDA, run python generate_cuda.py --file py_file.py --arch

Yuval Rosen 3 Jul 14, 2021
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Kento Nishi 22 Jul 07, 2022
Laplacian Score-regularized Concrete Autoencoders

Laplacian Score-regularized Concrete Autoencoders Requirements: torch = 1.9 scikit-learn = 0.24 omegaconf = 2.0.6 scipy = 1.6.0 matplotlib How to

JS 6 Dec 07, 2022
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Octavio Arriaga 5.3k Dec 30, 2022
FluxTraining.jl gives you an endlessly extensible training loop for deep learning

A flexible neural net training library inspired by fast.ai

86 Dec 31, 2022