Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

Last update: Dec 16, 2021

Overview

Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar [arXiv]

Features

A single architecture for three tasks: panoptic, instance, and semantic segmentation.
Supports common benchmark datasets: ADE20K, Cityscapes, COCO, and Mapillary Vistas.
This small, straightforward project is part of the main project, IST: A TensorFlow 2 compatible instance segmentation toolbox, which aims to bring recent research on segmentation approaches to TensorFlow.

Getting started

The project is still under construction. So far, SwinTransformerV1, SwinTransformerV2, and a few supporting pieces are ready.

License

The majority of MaskFormer is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

However, portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license.

Citation

@article{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  journal={arXiv},
  year={2021}
}

Owner

Phan Nguyen
