Instance Semantic Segmentation List

This repository lists state-of-the-art instance semantic segmentation works. Papers and resources are grouped below by method type.

Paper list
- detection based
- segmentation based

Brief introduction

Instance semantic segmentation is an area closely related to object detection and semantic segmentation. In particular, it can be seen as detection plus a foreground mask for each detected object. In most cases, however, it does not segment non-object pixels such as sky or land (those are handled by scene parsing, a task under semantic segmentation). For a quick review of related topics, see these survey papers:

Speed/accuracy trade-offs for modern convolutional object detectors, CVPR 2017

Survey of recent progress in semantic image segmentation with CNNs, covering work up to June 2017
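
To make the "detection plus foreground mask" view from the introduction concrete, here is a minimal sketch in plain Python. It is not taken from any of the listed papers. The helper names (`box_from_mask`, `semantic_map`, `rect_mask`), the toy 6x6 image, and the class id 1 are all assumptions chosen for illustration. The sketch shows that each instance mask implies a detection box, and that collapsing the masks into a per-pixel class map (the semantic segmentation view) loses the instance identities.

```python
def box_from_mask(mask):
    """Tight (x_min, y_min, x_max, y_max) box around the True pixels of one mask.

    Returns None for an empty mask (a convention assumed for this sketch).
    """
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def semantic_map(masks, class_ids, height, width, background=0):
    """Collapse instance masks into one per-pixel class map.

    Instances of the same class become indistinguishable in the result.
    """
    out = [[background] * width for _ in range(height)]
    for mask, cid in zip(masks, class_ids):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    out[y][x] = cid
    return out


def rect_mask(height, width, y0, y1, x0, x1):
    """Hypothetical helper: a boolean mask that is True on rows y0..y1-1, cols x0..x1-1."""
    return [[y0 <= y < y1 and x0 <= x < x1 for x in range(width)]
            for y in range(height)]


# Toy example: two instances of the same (hypothetical) class id 1.
H, W = 6, 6
masks = [rect_mask(H, W, 1, 3, 1, 3), rect_mask(H, W, 3, 6, 2, 5)]
class_ids = [1, 1]

boxes = [box_from_mask(m) for m in masks]
sem = semantic_map(masks, class_ids, H, W)

print(boxes)                                     # [(1, 1, 2, 2), (2, 3, 4, 5)]
print(sum(v == 1 for row in sem for v in row))   # 13
```

The two instances keep separate masks and boxes, while the semantic map only records 13 pixels of class 1 with no indication of which object each pixel belongs to.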

Dataset and benchmark here

Dataset	Train	Val	Link	Note
Pascal VOC 12 Aug	10582	1449	SegVOC12	origin train 1464+SDB
Pascal VOC SDB val	5623	5732	SDB	similar to VOC12 Main
COCO	115k	5k	COCO	coco_2014_minival
CityScapes	5000	/	CityScapes	evaluation server

Note: "Pascal VOC" can refer to different splits of the dataset. The original VOC12 segmentation task consists of train/val/test sets of 1464/1449/1456 images respectively, without instance information; it is designed for semantic segmentation. SDB provides instance-aware annotations for images from Pascal VOC12. Its own split (8k/3k) differs from VOC's, so another split, Pascal VOC SDB val, is provided, which is similar to the Pascal VOC Main split (5717/5823).

1. Detection-based methods

PANet: Path Aggregation Network for Instance Segmentation, CVPR 2018
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features, CVPR 2018
Mask R-CNN, ICCV 2017 [Detectron][Code(TF)][Code(MXNET)]
BlitzNet: A Real-Time Deep Network for Scene Understanding, ICCV 2017 [web][code]
FCIS: Fully Convolutional Instance-Aware Semantic Segmentation, CVPR 2017 [code]
FastMask: Segment Multi-scale Object Candidates in One Shot, CVPR 2017 [code]
Instance-sensitive Fully Convolutional Networks, ECCV 2016
MNC: Instance-aware Semantic Segmentation via Multi-task Network Cascades, CVPR 2016 [code]
SharpMask: Learning to Refine Object Segments, ECCV 2016
DeepMask: Learning to Segment Object Candidates, NIPS 2015 [code]

2. Segmentation-based methods

SGN: Sequential Grouping Networks for Instance Segmentation, ICCV 2017
InstanceCut: from Edges to Instances with MultiCut, CVPR 2017
Pixelwise Instance Segmentation with a Dynamically Instantiated Network, CVPR 2017
Deep Watershed Transform for Instance Segmentation, CVPR 2017 [code]
Multi-scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation, CVPR 2016

3. Others

End-to-End Instance Segmentation with Recurrent Attention, CVPR 2017
Recurrent Instance Segmentation, ECCV 2016 [web]
SDS: Simultaneous Detection and Segmentation, ECCV 2014 [web] [code]
Hypercolumns for Object Segmentation and Fine-grained Localization, CVPR 2015
