MMSceneGraph

Introduction

MMSceneGraph is an open-source code hub for scene graph generation and for downstream tasks built on scene graphs, implemented in PyTorch. The front-end object detector is provided by open-mmlab/mmdetection.

Major features

Modular design

We decompose the framework into components, so a customized scene graph generation framework can be built by combining different modules.

Support for multiple frameworks out of the box

The toolbox directly supports popular contemporary detection frameworks such as Faster RCNN and Mask RCNN.

Visualization support

Visualization of ground-truth and predicted scene graphs is integrated into the toolbox.
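
To make the modular design point above concrete, here is a minimal sketch of registry-based composition in plain Python. It is loosely modeled on the config-driven style that mmdetection popularized, but every name in it (`Registry`, `DETECTORS`, `RELATION_HEADS`, `ToyDetector`, `ToyRelationHead`, `build_pipeline`) is hypothetical and is not MMSceneGraph's actual API. The toy components return fixed values instead of running a network.

```python
# Hypothetical sketch of registry-based composition.
# None of these names are taken from MMSceneGraph or mmdetection.
from typing import Dict


class Registry:
    """Maps a class name to a component class so a config can refer to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, type] = {}

    def register(self, cls: type) -> type:
        # Used as a decorator: stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: dict):
        # Copy first so the caller's config is not mutated by pop().
        args = dict(cfg)
        cls = self._modules[args.pop("type")]
        return cls(**args)


DETECTORS = Registry("detector")
RELATION_HEADS = Registry("relation_head")


@DETECTORS.register
class ToyDetector:
    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes

    def detect(self, image):
        # Placeholder: fixed object labels instead of real detections.
        return ["person", "horse"]


@RELATION_HEADS.register
class ToyRelationHead:
    def __init__(self, predicate: str) -> None:
        self.predicate = predicate

    def predict(self, objects):
        # Placeholder: link every ordered pair of distinct objects.
        return [(s, self.predicate, o) for s in objects for o in objects if s != o]


def build_pipeline(cfg: dict):
    """Combine a detector and a relation head chosen by the config."""
    detector = DETECTORS.build(cfg["detector"])
    head = RELATION_HEADS.build(cfg["relation_head"])
    return lambda image: head.predict(detector.detect(image))


cfg = {
    "detector": {"type": "ToyDetector", "num_classes": 150},
    "relation_head": {"type": "ToyRelationHead", "predicate": "near"},
}
pipeline = build_pipeline(cfg)
triples = pipeline(image=None)  # (subject, predicate, object) tuples
```

Swapping a component in this sketch only requires changing the `type` string in the config, which is the property the modular design aims for. Please refer to GETTING_STARTED.md for the project's real configuration format.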

License

This project is released under the MIT license.

Changelog

Please refer to CHANGELOG.md for details.

Benchmark and model zoo

The original object detection results and models provided by mmdetection are available in the model zoo. Models for scene graph generation are not yet available.

Supported methods and Datasets

Supported SGG (VRD) methods:

Supported salient object detection methods:

R3Net (IJCAI'2018)
SCRN (ICCV'2019)

Supported image captioning methods:

bottom-up (CVPR'2018)
XLAN (CVPR'2020)

Supported datasets:

Visual Genome: VG150 (CVPR'2017)
VRD (ECCV'2016)
Visual Genome: VG200/VG-KR (ours)
MSCOCO (for object detection and image captioning)
RelCap (from VG and COCO, ours)

Installation

Our project is built on mmdetection 1.x, which differs somewhat from the current mmdetection master version (2.x), so please refer to INSTALL.md for installation. If you want to use mmdetection 2.x, please refer to mmdetection/get_start.md.

Getting Started

Please refer to GETTING_STARTED.md for how to use the project. We will keep it updated.

Acknowledgement

We appreciate the contributors of the mmdetection project and of Scene-Graph-Benchmark.pytorch, which inspired our design.

Citation

If you find this code hub or our work useful in your research, please consider citing:

@inproceedings{wang2021topic,
  title={Topic Scene Graph Generation by Attention Distillation from Caption},
  author={Wang, Wenbin and Wang, Ruiping and Chen, Xilin},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={15900--15910},
  month = {October},
  year={2021}
}


@inproceedings{wang2020sketching,
  title={Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation},
  author={Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
  pages={222--239},
  year={2020},
  volume={12358},
  doi={10.1007/978-3-030-58601-0_14},
  publisher={Springer}
}

@inproceedings{Wang_2019_CVPR,
  title={Exploring Context and Visual Pattern of Relationship for Scene Graph Generation},
  author={Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={8188--8197},
  month={June},
  address={Long Beach, California, USA},
  doi={10.1109/CVPR.2019.00838},
  year={2019}
}

A brand new hub for Scene Graph Generation methods based on MMdetection (2021). The full pipeline, from detection through scene graph generation to downstream tasks (e.g., image captioning), is supported. PyTorch implementations of HetH (ECCV 2020) and TopicSG (ICCV 2021) are included.

Owner

Kenneth-Wong
