[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

Last update: Jan 02, 2023

Introduction

This is the official release of the paper K-Net: Towards Unified Image Segmentation. K-Net will also be integrated into a future release of MMDetection and MMSegmentation.

K-Net: Towards Unified Image Segmentation,
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [project page] [BibTeX]

Results

The results of K-Net and the corresponding configs for each segmentation task are shown below. We have released the full model zoo for panoptic segmentation. The complete model checkpoints and logs for instance and semantic segmentation will be released soon.

Semantic Segmentation on ADE20K

Backbone	Method	Crop Size	Lr Schd	mIoU	Config	Download
R-50	K-Net + FCN	512x512	80K	43.3	config	model | log
R-50	K-Net + PSPNet	512x512	80K	43.9	config	model | log
R-50	K-Net + DeepLabv3	512x512	80K	44.6	config	model | log
R-50	K-Net + UPerNet	512x512	80K	43.6	config	model | log
Swin-T	K-Net + UPerNet	512x512	80K	45.4	config	model | log
Swin-L	K-Net + UPerNet	512x512	80K	52.0	config	model | log
Swin-L	K-Net + UPerNet	640x640	80K	52.7	config	model | log

Instance Segmentation on COCO

Backbone	Method	Lr Schd	Mask mAP	Config	Download
R-50	K-Net	1x	34.0	config	model | log
R-50	K-Net	ms-3x	37.8	config	model | log
R-101	K-Net	ms-3x	39.2	config	model | log
R-101-DCN	K-Net	ms-3x	40.5	config	model | log

Panoptic Segmentation on COCO

Backbone	Method	Lr Schd	PQ	Config	Download
R-50	K-Net	1x	44.3	config	model | log
R-50	K-Net	ms-3x	47.1	config	model | log
R-101	K-Net	ms-3x	48.4	config	model | log
R-101-DCN	K-Net	ms-3x	49.6	config	model | log
Swin-L (window size 7)	K-Net	ms-3x	54.6	config	model | log
Above on test-dev			55.2

Installation

K-Net requires the following packages (the first four are from OpenMMLab):

MIM >= 0.1.5
MMCV-full >= v1.3.14
MMDetection >= v2.17.0
MMSegmentation >= v0.18.0
scipy
panopticapi

pip install openmim scipy mmdet mmsegmentation
pip install git+https://github.com/cocodataset/panopticapi.git
mim install mmcv-full
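
After installation, it can help to confirm that the installed versions meet the minimums listed above. The following is a minimal sketch, not part of the K-Net repository: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, the distribution names are assumed to match those used in the pip/mim commands, and the version comparison is deliberately simple (it only looks at leading numeric components).

```python
from importlib import metadata

# Minimum versions copied from the requirements list above. The keys are
# assumed to be the distribution names used in the pip/mim commands.
MINIMUMS = {
    "openmim": "0.1.5",
    "mmcv-full": "1.3.14",
    "mmdet": "2.17.0",
    "mmsegmentation": "0.18.0",
}


def version_tuple(version):
    """Turn '1.3.14' or 'v1.3.14' into (1, 3, 14), ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"found {installed}, need >= {minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in check_environment().items():
        print(f"{name}: {problem}")
```

An empty result from `check_environment()` means all four OpenMMLab packages were found at a sufficient version; scipy and panopticapi have no minimum version stated above and are not checked here.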

License

This project is released under the Apache 2.0 license.

Usage

Data preparation

Prepare the data following the MMDetection and MMSegmentation instructions. The expected directory structure is shown below:

data/
├── ade
│   ├── ADEChallengeData2016
│   │   ├── annotations
│   │   ├── images
├── coco
│   ├── annotations
│   │   ├── panoptic_{train,val}2017.json
│   │   ├── instances_{train,val}2017.json
│   │   ├── panoptic_{train,val}2017/  # panoptic png annotations
│   │   ├── image_info_test-dev2017.json  # for test-dev submissions
│   ├── train2017
│   ├── val2017
│   ├── test2017
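
Before launching a job, a quick check that the layout above is in place can save a failed run. The sketch below is an illustration only: `missing_entries` and the `EXPECTED` list are hypothetical names, and the list covers just a subset of the tree (the directories and the panoptic annotations).

```python
from pathlib import Path

# Subset of the layout shown above, relative to the data/ directory.
EXPECTED = [
    "ade/ADEChallengeData2016/annotations",
    "ade/ADEChallengeData2016/images",
    "coco/annotations/panoptic_train2017.json",
    "coco/annotations/panoptic_val2017.json",
    "coco/annotations/panoptic_train2017",
    "coco/annotations/panoptic_val2017",
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
]


def missing_entries(data_root="data", expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_entries():
        print(f"missing: data/{rel}")
```

If only one dataset is needed (for example, COCO without ADE20K), pass a shorter list as `expected`.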

Training and testing

For training and testing on a slurm cluster, use the mim-based scripts below:

# train instance/panoptic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmdet $CONFIG $WORK_DIR

# test instance segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm

# test panoptic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval pq

# train semantic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmseg $CONFIG $WORK_DIR

# test semantic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmseg $CONFIG $CHECKPOINT --eval mIoU
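
The commands above differ only in the codebase (mmdet or mmseg) and the evaluation metric. As a rough sketch of that mapping, the hypothetical helpers below (`build_train_command`, `build_test_command`, not part of this repository) assemble the same command lines from a task name; the script paths and flags are those shown above.

```python
import shlex

# Mapping taken from the commands above: task -> (codebase, --eval metric).
TASKS = {
    "instance": ("mmdet", "segm"),
    "panoptic": ("mmdet", "pq"),
    "semantic": ("mmseg", "mIoU"),
}


def build_train_command(task, partition, config, work_dir):
    """Assemble the slurm training command for the given task."""
    codebase, _ = TASKS[task]
    return ["sh", "./tools/mim_slurm_train.sh", partition, codebase, config, work_dir]


def build_test_command(task, partition, config, checkpoint):
    """Assemble the slurm testing command, choosing the metric by task."""
    codebase, metric = TASKS[task]
    return [
        "sh", "./tools/mim_slurm_test.sh",
        partition, codebase, config, checkpoint,
        "--eval", metric,
    ]


if __name__ == "__main__":
    # Placeholder arguments; substitute a real partition, config, and checkpoint.
    cmd = build_test_command("panoptic", "PARTITION", "CONFIG", "CHECKPOINT")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` or printed for inspection, as in the example.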

To generate a test-dev submission for panoptic segmentation, use the commands below:

# update the category information in the original test-dev image info file
# for panoptic segmentation
python -u tools/gen_panoptic_test_info.py
# run test-dev submission
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT  --format-only --cfg-options data.test.ann_file=data/coco/annotations/panoptic_image_info_test-dev2017.json data.test.img_prefix=data/coco/test2017 --eval-options jsonfile_prefix=$WORK_DIR

You can also run training and testing without slurm by using mim directly for instance, semantic, and panoptic segmentation, as shown below:

PYTHONPATH='.':$PYTHONPATH mim train mmdet $CONFIG $WORK_DIR
PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR

The variables used in the commands are:

PARTITION: the slurm partition you are using
CHECKPOINT: the path to a checkpoint downloaded from our model zoo or trained by yourself
WORK_DIR: the working directory in which configs, logs, and checkpoints are saved
CONFIG: a config file under the directory configs/
JOB_NAME: the name of the job, which is required by slurm

Citation

@inproceedings{zhang2021knet,
    title={{K-Net: Towards} Unified Image Segmentation},
    author={Wenwei Zhang and Jiangmiao Pang and Kai Chen and Chen Change Loy},
    year={2021},
    booktitle={NeurIPS},
}
