A PyTorch-based real-time segmentation model for autonomous driving

Last update: Dec 22, 2022

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet (paper: arXiv:2103.12212).

Real-time semantic segmentation plays an increasingly important role in computer vision, driven by growing demand from mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB of memory. The inference speed can reach 30 FPS on a single RTX 2080Ti GPU (GPU usage 60%) with a 1024×2048-pixel image.
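To make the role of dilation concrete, the sketch below computes how far a single dilated convolution kernel reaches along one axis, using the standard relation k + (k - 1)(d - 1) for kernel size k and dilation rate d. The dilation rates in the loop are illustrative assumptions, not the rates used inside the CFP module.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in pixels, covered by one dilated convolution kernel along one axis."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Illustrative dilation rates (assumed for this example only).
for rate in (1, 2, 4, 8):
    print(rate, effective_kernel_size(3, rate))
```

A larger dilation rate widens the area a 3×3 kernel covers without adding weights, which is why dilated convolutions are attractive when the parameter budget is small.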

Installation

Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
Install the required packages:

pip install opencv-python pillow numpy matplotlib

Clone this repository:

git clone https://github.com/AngeLouCN/CFPNet

One GPU with 11 GB of memory is needed.
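Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library; the import names (cv2 for opencv-python, PIL for pillow, torch for PyTorch) are the usual ones for those packages, and the helper name missing_modules is made up for this example.

```python
import importlib.util
import sys

# Import names for the packages listed above; "torch" is PyTorch itself.
REQUIRED_MODULES = ("torch", "cv2", "PIL", "numpy", "matplotlib")


def missing_modules(names=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```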

Dataset

Download the two datasets, CamVid and Cityscapes, and put the files in the dataset folder with the following structure.

├── camvid
|    ├── train
|    ├── test
|    ├── val
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt
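A quick way to check the layout before training is to compare the folder against the tree above. This is a small sketch, assuming the top-level folder is named dataset and sits in the current directory; the helper name missing_entries is made up for this example and is not part of the repository.

```python
from pathlib import Path

# Expected entries, copied from the tree above.
EXPECTED = {
    "camvid": [
        "train", "test", "val", "trainannot", "testannot", "valannot",
        "camvid_trainval_list.txt", "camvid_train_list.txt",
        "camvid_test_list.txt", "camvid_val_list.txt",
    ],
    "cityscapes": [
        "gtCoarse", "gtFine", "leftImg8bit",
        "cityscapes_trainval_list.txt", "cityscapes_train_list.txt",
        "cityscapes_test_list.txt", "cityscapes_val_list.txt",
    ],
}


def missing_entries(root, dataset):
    """List expected files or folders that are absent under root/dataset."""
    base = Path(root) / dataset
    return [name for name in EXPECTED[dataset] if not (base / name).exists()]


if __name__ == "__main__":
    for name in EXPECTED:
        print(name, "missing:", missing_entries("dataset", name))
```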

Training

You can run python train.py -h to check the details of the optional arguments. In train.py, you can set the dataset, train type, number of epochs, batch size, etc.
Training on the Cityscapes train set:

python train.py --dataset cityscapes

Training on the CamVid train and val sets:

python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16
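For readers unfamiliar with how such flags are usually declared, here is a hypothetical argparse sketch that mirrors the flag names used in the commands above. It is not the parser in train.py: the default values are placeholders chosen for this example, and only the flag names and the two dataset names come from this README.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("--dataset", choices=["cityscapes", "camvid"], required=True)
    # Defaults below are placeholders for this sketch, not the values in train.py.
    parser.add_argument("--train_type", default="train")
    parser.add_argument("--max_epochs", type=int, default=100)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=8)
    return parser


args = build_parser().parse_args(
    "--dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16".split()
)
print(args.dataset, args.train_type, args.max_epochs, args.lr, args.batch_size)
```

Running python train.py -h remains the authoritative way to see the real options and their defaults.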

During training, the mean IoU on the train and validation sets and the training loss are recorded every 50 epochs and plotted, so you can check whether training is proceeding normally.

[Figures: Val mIoU vs Epochs; Train loss vs Epochs]
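The mean IoU tracked during training can be computed from a confusion matrix. The sketch below shows the standard definition (per-class intersection over union, averaged over classes) in plain Python; it is an illustration of the metric rather than the repository's own implementation, and the toy pixel counts are made up.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: ground truth, columns: prediction).

    Classes that appear in neither the ground truth nor the prediction are skipped.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        gt_total = sum(confusion[c])
        pred_total = sum(row[c] for row in confusion)
        union = gt_total + pred_total - true_pos
        if union > 0:
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2-class example with made-up pixel counts.
toy = [[8, 2],
       [1, 9]]
print(round(mean_iou(toy), 4))
```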

Testing

After training, the checkpoint is saved in the checkpoint folder. You can use test.py to predict the results.

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Evaluation

For datasets that do not provide labels for the test set (e.g. Cityscapes), you can use predict.py to save all the output images, then submit them to the official webpage for evaluation.

python predict.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}
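The Cityscapes benchmark server scores predictions encoded with the dataset's original label IDs rather than the 19 training IDs. Whether predict.py already performs this conversion is not stated here, so the following is only a sketch of the mapping, following the label table in the official cityscapesScripts helpers. The ignore value of 255 and the choice to write ignored pixels as 0 are assumptions made for this example.

```python
# Training ID (list index) -> Cityscapes label ID, following the label table
# in the official cityscapesScripts helpers.
TRAIN_ID_TO_LABEL_ID = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22,
                        23, 24, 25, 26, 27, 28, 31, 32, 33]


def to_label_ids(train_id_map, ignore_label=255):
    """Convert a 2-D grid of training IDs to label IDs; ignored pixels become 0."""
    return [
        [0 if t == ignore_label else TRAIN_ID_TO_LABEL_ID[t] for t in row]
        for row in train_id_map
    ]


print(to_label_ids([[0, 1, 18], [13, 255, 10]]))
```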

Inference Speed

You can run eval_fps.py to test the model's inference speed. Pass the input image size as an argument, for example 1024,2048.

python eval_fps.py 1024,2048
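The general shape of such a speed test is: parse the size argument, run a few warm-up iterations, then time a fixed number of forward passes. The sketch below illustrates that pattern with a stand-in workload and is not the code in eval_fps.py; the helper names are made up, and reading the argument as height,width is an assumption. When timing a real model on a GPU, call torch.cuda.synchronize() before reading the clock, because CUDA kernels are launched asynchronously.

```python
import time


def parse_size(text):
    """Parse an 'H,W' string such as '1024,2048' into a (height, width) tuple."""
    height, width = (int(part) for part in text.split(","))
    return height, width


def measure_fps(run_once, warmup=3, iterations=20):
    """Average calls per second of run_once, after a few untimed warm-up calls."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


# Stand-in workload; a real measurement would call the model's forward pass.
height, width = parse_size("1024,2048")
print(measure_fps(lambda: sum(range(10_000))))
```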

Results

Results for CFPNet-V1, CFPNet-V2 and CFPNet-V3:

Dataset	Model	mIoU
Cityscapes	CFPNet-V1	60.4%
Cityscapes	CFPNet-V2	66.5%
Cityscapes	CFPNet-V3	70.1%

Sample results (from top to bottom: Original, CFPNet-V1, CFPNet-V2 and CFPNet-V3):

[Figures: Category_acc vs size; Class_acc vs size; Class_acc vs parameter; Class_acc vs speed]

Comparison

Results of Cityscapes

Results of CamVid

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}
