LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

Overview

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

python-image pytorch-image

Table of Contents:

Introduction

This project contains the code (Note: The code is test in the environment with python=3.6, cuda=9.0, PyTorch-0.4.1, also support Pytorch-0.4.1+) for: LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation by Yu Wang.

The extensive computational burden limits the usage of CNNs in mobile devices for dense estimation tasks, a.k.a semantic segmentation. In this paper, we present a lightweight network to address this problem, namely **LEDNet**, which employs an asymmetric encoder-decoder architecture for the task of real-time semantic segmentation.More specifically, the encoder adopts a ResNet as backbone network, where two new operations, channel split and shuffle, are utilized in each residual block to greatly reduce computation cost while maintaining higher segmentation accuracy. On the other hand, an attention pyramid network (APN) is employed in the decoder to further lighten the entire network complexity. Our model has less than 1M parameters, and is able to run at over 71 FPS on a single GTX 1080Ti GPU card. The comprehensive experiments demonstrate that our approach achieves state-of-the-art results in terms of speed and accuracy trade-off on Cityscapes dataset. and becomes an effective method for real-time semantic segmentation tasks.

Project-Structure

├── datasets  # contains all datasets for the project
|  └── cityscapes #  cityscapes dataset
|  |  └── gtCoarse #  Coarse cityscapes annotation
|  |  └── gtFine #  Fine cityscapes annotation
|  |  └── leftImg8bit #  cityscapes training image
|  └── cityscapesscripts #  cityscapes dataset label convert scripts!
├── utils
|  └── dataset.py # dataloader for cityscapes dataset
|  └── iouEval.py # for test 'iou mean' and 'iou per class'
|  └── transform.py # data preprocessing
|  └── visualize.py # Visualize with visdom 
|  └── loss.py # loss function 
├── checkpoint
|  └── xxx.pth # pretrained models encoder form ImageNet
├── save
|  └── xxx.pth # trained models form scratch 
├── imagenet-pretrain
|  └── lednet_imagenet.py # 
|  └── main.py # 
├── train
|  └── lednet.py  # model definition for semantic segmentation
|  └── main.py # train model scripts
├── test
|  |  └── dataset.py 
|  |  └── lednet.py # model definition
|  |  └── lednet_no_bn.py # Remove the BN layer in model definition
|  |  └── eval_cityscapes_color.py # Test the results to generate RGB images
|  |  └── eval_cityscapes_server.py # generate result uploaded official server
|  |  └── eval_forward_time.py # Test model inference time
|  |  └── eval_iou.py 
|  |  └── iouEval.py 
|  |  └── transform.py 

Installation

  • Python 3.6.x. Recommended using Anaconda3
  • Set up python environment
pip3 install -r requirements.txt
  • Env: PyTorch_0.4.1; cuda_9.0; cudnn_7.1; python_3.6,

  • Clone this repository.

git clone https://github.com/xiaoyufenfei/LEDNet.git
cd LEDNet-master

Datasets

├── leftImg8bit
│   ├── train
│   ├──  val
│   └── test
├── gtFine
│   ├── train
│   ├──  val
│   └── test
├── gtCoarse
│   ├── train
│   ├── train_extra
│   └── val

Training-LEDNet

  • For help on the optional arguments you can run: python main.py -h

  • By default, we assume you have downloaded the cityscapes dataset in the ./data/cityscapes dir.

  • To train LEDNet using the train/main.py script the parameters listed in main.py as a flag or manually change them.

python main.py --savedir logs --model lednet --datadir path/root_directory/  --num-epochs xx --batch-size xx ...

Resuming-training-if-decoder-part-broken

  • for help on the optional arguments you can run: python main.py -h
python main.py --savedir logs --name lednet --datadir path/root_directory/  --num-epochs xx --batch-size xx --decoder --state "../save/logs/model_best_enc.pth.tar"...

Testing

  • the trained models of training process can be found at here. This may not be the best one, you can train one from scratch by yourself or Fine-tuning the training decoder with model encoder pre-trained on ImageNet, For instance
more details refer ./test/README.md

Results

  • Please refer to our article for more details.
Method Dataset Fine Coarse IoU_cla IoU_cat FPS
LEDNet cityscapes yes yes 70.6​% 87.1​%​ 70​+​

qualitative segmentation result examples:

Citation

If you find this code useful for your research, please use the following BibTeX entry.

 @article{wang2019lednet,
  title={LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation},
  author={Wang, Yu and Zhou, Quan and Liu, Jia and Xiong,Jian and Gao, Guangwei and Wu, Xiaofu, and Latecki Jan Longin},
  journal={arXiv preprint arXiv:1905.02423},
  year={2019}
}

Tips

  • Limited by GPU resources, the project results need to be further improved...
  • It is recommended to pre-train Encoder on ImageNet and then Fine-turning Decoder part. The result will be better.

Reference

  1. Deep residual learning for image recognition
  2. Enet: A deep neural network architecture for real-time semantic segmentation
  3. Erfnet: Efficient residual factorized convnet for real-time semantic segmentation
  4. Shufflenet: An extremely efficient convolutional neural network for mobile devices
Owner
Yu Wang
I am a graduate student in CV, my research areas center around computer vision and deep learning.
Yu Wang
improvement of CLIP features over the traditional resnet features on the visual question answering, image captioning, navigation and visual entailment tasks.

CLIP-ViL In our paper "How Much Can CLIP Benefit Vision-and-Language Tasks?", we show the improvement of CLIP features over the traditional resnet fea

310 Dec 28, 2022
Understanding and Overcoming the Challenges of Efficient Transformer Quantization

Transformer Quantization This repository contains the implementation and experiments for the paper presented in Yelysei Bondarenko1, Markus Nagel1, Ti

83 Dec 30, 2022
Malmo Collaborative AI Challenge - Team Pig Catcher

The Malmo Collaborative AI Challenge - Team Pig Catcher Approach The challenge involves 2 agents who can either cooperate or defect. The optimal polic

Kai Arulkumaran 66 Jun 29, 2022
[NeurIPS 2021] The PyTorch implementation of paper "Self-Supervised Learning Disentangled Group Representation as Feature"

IP-IRM [NeurIPS 2021] The PyTorch implementation of paper "Self-Supervised Learning Disentangled Group Representation as Feature". Codes will be relea

Wang Tan 67 Dec 24, 2022
Little tool in python to watch anime from the terminal (the better way to watch anime)

ani-cli Script working again :), thanks to the fork by Dink4n for the alternative approach to by pass the captcha on gogoanime A cli to browse and wat

Harshith 4.5k Dec 31, 2022
A simple editor for captions in .SRT file extension

WaySRT A simple editor for captions in .SRT file extension The program doesn't use any external dependecies, just run: python way_srt.py {file_name.sr

Gustavo Lopes 3 Nov 16, 2022
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion. NÜWA is a unified multimodal p

Microsoft 2.6k Jan 06, 2023
VID-Fusion: Robust Visual-Inertial-Dynamics Odometry for Accurate External Force Estimation

VID-Fusion VID-Fusion: Robust Visual-Inertial-Dynamics Odometry for Accurate External Force Estimation Authors: Ziming Ding , Tiankai Yang, Kunyi Zhan

ZJU FAST Lab 86 Nov 18, 2022
Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Meng Liu 2 Jul 19, 2022
This is the official implementation for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents" in NeurIPS 2021.

Observe then Incentivize Experiments This is the code used for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents",

Cong Shen Research Group 0 Mar 08, 2022
A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon.

PokeGAN A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon. Dataset The model has been trained on dataset that includes 8

19 Jul 26, 2022
Multi-Glimpse Network With Python

Multi-Glimpse Network Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention arXiv Require

9 May 10, 2022
Official implementation of "SinIR: Efficient General Image Manipulation with Single Image Reconstruction" (ICML 2021)

SinIR (Official Implementation) Requirements To install requirements: pip install -r requirements.txt We used Python 3.7.4 and f-strings which are in

47 Oct 11, 2022
Lazy, a tool for running things in idle time

Lazy, a tool for running things in idle time Mostly used to stop CUDA ML model training from making my desktop unusable. Simply monitors keyboard/mous

N Shepperd 46 Nov 06, 2022
Centroid-UNet is deep neural network model to detect centroids from satellite images.

Centroid UNet - Locating Object Centroids in Aerial/Serial Images Introduction Centroid-UNet is deep neural network model to detect centroids from Aer

GIC-AIT 19 Dec 08, 2022
⚾🤖⚾ Automatic baseball pitching overlay in realtime

⚾ Automatically overlaying pitch motion and trajectory with machine learning! This project takes your baseball pitching clips and automatically genera

Tony Chou 240 Dec 05, 2022
Control-Robot-Arm-using-PS4-Controller - A Robotic Arm based on Raspberry Pi and Arduino that controlled by PS4 Controller

Control-Robot-Arm-using-PS4-Controller You can see all details about this Robot

MohammadReza Sharifi 5 Jan 01, 2022
Python Auto-ML Package for Tabular Datasets

Tabular-AutoML AutoML Package for tabular datasets Tabular dataset tuning is now hassle free! Run one liner command and get best tuning and processed

Sagnik Roy 18 Nov 20, 2022
Source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated Recurrent Memory Network

KaGRMN-DSG_ABSA This repository contains the PyTorch source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated

XingBowen 4 May 20, 2022
Img-process-manual - Utilize Python Numpy and Matplotlib to realize OpenCV baisc image processing function

Img-process-manual - Opencv Library basic graphic processing algorithm coding reproduction based on Numpy and Matplotlib library

Jack_Shaw 2 Dec 12, 2022