Open-source code for Generic Grouping Network (GGN, CVPR 2022)

Overview

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

Pytorch implementation for "Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity" (CVPR 2022, link TBD) by Weiyao Wang, Matt Feiszli, Heng Wang, Jitendra Malik, and Du Tran. We propose a framework for open-world instance segmentation, Generic Grouping Network (GGN), which exploits pseudo Ground Truth training strategy. On the same backbone, GGN produces impressive AR gains compared to closed-world training on cross-category generalization (+11% VOC to Non-VOC) and cross-dataset generalization (+5.2% COCO to UVO).

What is it? Open-world instance segmentation requires a model to group pixels into object instances without a pre-defined taxonomy, that is, both "seen" categories (those present during training) and "unseen" categories (not seen during training). There is generally a large performance gap between the seen and unseen domains. For example, a baseline Mask R-CNN miss 15 annotated masks in the example below. Without additional training data or annotations, Mask R-CNN trained with GGN framework produces 9 more segments correctly, being much closer to ground truth annotations.

How we do it? Our approach first learns a pairwise affinity predictor that captures correctly if two pixels belong to same instance or not. We demonstrate such pairwise affinity representation generalizes well to unseen domains. We then use a grouping module (e.g. MCG) to extract and rank segments from predicted PA. We can run this on any image dataset without using annotations; we extract highest ranked segments as "pseudo ground truth" candidate masks. This is a large and category-agnostic set; we add it to our (much smaller) datasets of curated annotations to train a detector.


About the code. This repo is built based on mmdetection with the addition of OLN backbone (concurrent work). The repo is tested under Python 3.7, PyTorch 1.7.0, Cuda 11.0, and mmcv==1.2.5. We thank authors of OLN for releasing their work to facilitate research.

Model zoo

Below we release PA predictor models, pseudo-GT generated by PA predictors and GGN trained with both annotated-GT and pseudo-GT. We also release some of the processed annotations from LVIS to conduct cross-category generalization experiments.

Training Eval url Baseline AR GGN AR Top-K Pseudo
Person, COCO Non-Person, COCO PA/Pseudo/GGN 4.9 20.9 3
VOC, COCO Non-VOC, COCO PA/Pseudo/Pseudo-OLN/ GGN/GGN-OLN 19.9 28.7 (33.7 with OLN) 3
COCO, LVIS Non-COCO, LVIS PA/Pseudo/GGN 16.5 20.4 1
Non-COCO, LVIS COCO PA/Pseudo/GGN 21.7 23.6 1
COCO UVO PA/Pseudo/GGN 40.1 43.4 3
COCO, random init ImageNet PA/Pseudo/GGN 10

We remark using large-scale pre-training in the last row as initialization and finetune GGN on COCO with pseudo-GT on COCO gives further improvement (45.3 on UVO), with model.

Installation

This repo is built based on mmdetection.

You can use following commands to create conda env with related dependencies.

conda create -n ggn python=3.7 -y
conda activate ggn
conda install pytorch=1.7.0 torchvision cudatoolkit=11.0 -c pytorch -y
pip install mmcv-full
pip install -r requirements.txt
pip install -v -e .

Please also refer to get_started.md for more details of installation.

Next you will need to build the library for our grouping module:

cd pa_lib/cython_lib
python3 setup.py build_ext --inplace

Data Preparation

Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images

Our work also uses LVIS, UVO and ADE20K. To use ADE20K, please convert them into COCO-style annotations.

Training of pairwise affinity predictor

bash tools/dist_train.sh configs/pairwise_affinity/pa_train.py ${NUM_GPUS} --work-dir ${WORK_DIR}

Test PA

We provide a tool tools/test_pa.py to directly evaluate PA performance (e.g. on PA prediction and on grouped masks).

python tools/test_pa.py configs/pairwise_affinity/pa_train.py ${WORK_DIR}/latest.pth --eval pa --eval-proposals --test-partition nonvoc

Extracting pseudo-GT masks

We first begin by extracting masks. Example config pa_extract.py extracts pseudo-GT masks from PA trained on VOC subsets of COCO. use-gt-masks flag asks the pipeline to compute maximum IoU an extracted masks has with the GT. It is recommended to split the dataset into multiple shards to run extractions. On original image resolution and Nvidia V100 machine, it takes about 4.8s per image to run the full pipeline (compute PA, run grouping, ranking then compute IoU with annotated GT) without globalization and trained ranker or 10s with globalization and trained ranker.

python tools/extract_pa_masks.py configs/pairwise_affinity/pa_extract.py ${PA_MODEL_PATH} --out ${OUT_DIR}/masks.json --use-gt-masks 1

The extracted masks will be stored in JSON with the following format

[
  [segm1, segm2,..., segm20] ## Result of an image
  ...
]

We refer to tools/merge_annotations.py for reference on formatting the extracted masks as a new COCO-style annotation file. We remark that tools/interpolate_extracted_masks.py may be necessary if not running extraction on original image resolution.

Training of GGN

Please specify additional_ann_file with the extracted pseudo-GT in previous step in class_agn_mask_rcnn_pa.py.

bash tools/dist_train.sh configs/mask_rcnn/class_agn_mask_rcnn_pa.py ${NUM_GPUS}

class_agn_mask_rcnn_gn_online.py is used to train ImageNet extracted masks since there are too many annotations and we cannot store everything in a single json file without OOM. We will need to break it into per-image annotations in the format of "{image_id}.json".

Testing

python tools/test.py configs/mask_rcnn/class_agn_mask_rcnn.py ${WORK_DIR}/latest.pth --eval segm

To cite this work

@article{wang2022ggn,
  title={Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity},
  author={Wang, Weiyao and Feiszli, Matt and Wang, Heng and Malik, Jitendra and Tran, Du},
  journal={CVPR},
  year={2022}
}

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Owner
Meta Research
Meta Research
Benchmarking the robustness of Spatial-Temporal Models

Benchmarking the robustness of Spatial-Temporal Models This repositery contains the code for the paper Benchmarking the Robustness of Spatial-Temporal

Yi Chenyu Ian 15 Dec 16, 2022
AFLNet: A Greybox Fuzzer for Network Protocols

AFLNet: A Greybox Fuzzer for Network Protocols AFLNet is a greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, it takes a m

626 Jan 06, 2023
Code for "The Box Size Confidence Bias Harms Your Object Detector"

The Box Size Confidence Bias Harms Your Object Detector - Code Disclaimer: This repository is for research purposes only. It is designed to maintain r

Johannes G. 24 Dec 07, 2022
Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Moustafa Meshry 16 Oct 05, 2022
Time-series-deep-learning - Developing Deep learning LSTM, BiLSTM models, and NeuralProphet for multi-step time-series forecasting of stock price.

Stock Price Prediction Using Deep Learning Univariate Time Series Predicting stock price using historical data of a company using Neural networks for

Abdultawwab Safarji 7 Nov 27, 2022
Multi-Task Deep Neural Networks for Natural Language Understanding

New Release We released Adversarial training for both LM pre-training/finetuning and f-divergence. Large-scale Adversarial training for LMs: ALUM code

Xiaodong 2.1k Dec 30, 2022
Automatically creates genre collections for your Plex media

Plex Auto Genres Plex Auto Genres is a simple script that will add genre collection tags to your media making it much easier to search for genre speci

Shane Israel 63 Dec 31, 2022
[CIKM 2021] Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning

Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning. This repo contains the PyTorch code and implementation for the paper E

Akuchi 18 Dec 22, 2022
MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieva

Introduction This is the source code of our TCSVT 2021 paper "MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieval". Ple

7 Aug 24, 2022
Deep-Learning-Book-Chapter-Summaries - Attempting to make the Deep Learning Book easier to understand.

Deep-Learning-Book-Chapter-Summaries This repository provides a summary for each chapter of the Deep Learning book by Ian Goodfellow, Yoshua Bengio an

Aman Dalmia 1k Dec 27, 2022
PyTorch implementation of Pay Attention to MLPs

gMLP PyTorch implementation of Pay Attention to MLPs. Quickstart Clone this repository. git clone https://github.com/jaketae/g-mlp.git Navigate to th

Jake Tae 34 Dec 13, 2022
PyTorch implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 13.4k Jan 08, 2023
The Face Mask recognition system uses AI technology to detect the person with or without a mask.

Face Mask Detection Face Mask Detection system built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect

Rohan Kasabe 4 Apr 05, 2022
ProjectOxford-ClientSDK - This repo has moved :house: Visit our website for the latest SDKs & Samples

This project has moved 🏠 We heard your feedback! This repo has been deprecated and each project has moved to a new home in a repo scoped by API and p

Microsoft 970 Nov 28, 2022
Statistical-Rethinking-with-Python-and-PyMC3 - Python/PyMC3 port of the examples in " Statistical Rethinking A Bayesian Course with Examples in R and Stan" by Richard McElreath

Statistical Rethinking with Python and PyMC3 This repository has been deprecated in favour of this one, please check that repository for updates, for

Osvaldo Martin 786 Dec 29, 2022
CAUSE: Causality from AttribUtions on Sequence of Events

CAUSE: Causality from AttribUtions on Sequence of Events

Wei Zhang 21 Dec 01, 2022
This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Wide-Networks This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameteri

Karl Hajjar 0 Nov 02, 2021
Official implementation of the paper Do pedestrians pay attention? Eye contact detection for autonomous driving

Do pedestrians pay attention? Eye contact detection for autonomous driving Official implementation of the paper Do pedestrians pay attention? Eye cont

VITA lab at EPFL 26 Nov 02, 2022
Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al.

Learning Associative Inference Using Fast Weight Memory This repository contains the offical code for the paper Learning Associative Inference Using F

Imanol Schlag 18 Oct 12, 2022
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

Pranav 39 Nov 21, 2022