TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Related tags

Deep LearningTransFGU
Overview

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin

[Preprint]

Getting Started

Create the environment

# create conda env
conda create -n TransFGU python=3.8
# activate conda env
conda activate TransFGU
# install pytorch
conda install pytorch=1.8 torchvision cudatoolkit=10.1
# install other dependencies
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
pip install -r requirements.txt

Dataset Preparation

the structure of dataset folders should be as follow:

data/
    │── MSCOCO/
    │     ├── images/
    │     │     ├── train2017/
    │     │     └── val2017/
    │     └── annotations/
    │           ├── train2017/
    │           ├── val2017/
    │           ├── instances_train2017.json
    │           └── instances_val2017.json
    │── Cityscapes/
    │     ├── leftImg8bit/
    │     │     ├── train/
    │     │     │       ├── aachen
    │     │     │       └── ...
    │     │     └──── val/
    │     │             ├── frankfurt
    │     │             └── ...
    │     └── gtFine/
    │           ├── train/
    │           │       ├── aachen
    │           │       └── ...
    │           └──── val/
    │                   ├── frankfurt
    │                   └── ...
    │── PascalVOC/
    │     ├── JPEGImages/
    │     ├── SegmentationClass/
    │     └── ImageSets/
    │           └── Segmentation/
    │                   ├── train.txt
    │                   └── val.txt
    └── LIP/
          ├── train_images/
          ├── train_segmentations/
          ├── val_images/
          ├── val_segmentations/
          ├── train_id.txt
          └── val_id.txt

Model download

Name mIoU Pixel Accuracy Model
COCOStuff-27 16.19 44.52 Google Drive
COCOStuff-171 11.93 34.32 Google Drive
COCO-80 12.69 64.31 Google Drive
Cityscapes 16.83 77.92 Google Drive
Pascal-VOC 37.15 83.59 Google Drive
LIP-5 25.16 65.76 Google Drive
LIP-16 15.49 60.08 Google Drive
LIP-19 12.24 42.52 Google Drive

Train and Evaluate Our Method

To train and evaluate our method on different datasets under desired granularity level, please follow the instructions here.

Citation

If you find our work useful in your research, please consider citing:

@article{yin2021transfgu,
  title={TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation},
  author={Zhaoyun, Yin and Pichao, Wang and Fan, Wang and Xianzhe, Xu and Hanling, Zhang and Hao, Li and Rong, Jin},
  journal={arXiv preprint arXiv:2112.01515},
  year={2021}
}

LICENSE

The code is released under the MIT license.

Copyright

Copyright (C) 2010-2021 Alibaba Group Holding Limited.

Owner
DamoCV
CV team of DAMO academy
DamoCV
Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks

Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks Abstract Facial expression recognition in video

Bogireddy Sai Prasanna Teja Reddy 103 Dec 29, 2022
(CVPR 2022) Energy-based Latent Aligner for Incremental Learning

Energy-based Latent Aligner for Incremental Learning Accepted to CVPR 2022 We illustrate an Incremental Learning model trained on a continuum of tasks

Joseph K J 37 Jan 03, 2023
A disassembler for the RP2040 Programmable I/O State-machine!

piodisasm A disassembler for the RP2040 Programmable I/O State-machine! Usage Just run piodisasm.py on a file that contains the PIO code as hex! (Such

Ghidra Ninja 29 Dec 06, 2022
Evaluation suite for large-scale language models.

This repo contains code for running the evaluations and reproducing the results from the Jurassic-1 Technical Paper (see blog post), with current support for running the tasks through both the AI21 S

71 Dec 17, 2022
Bagua is a flexible and performant distributed training algorithm development framework.

Bagua is a flexible and performant distributed training algorithm development framework.

786 Dec 17, 2022
Contrastive Fact Verification

VitaminC This repository contains the dataset and models for the NAACL 2021 paper: Get Your Vitamin C! Robust Fact Verification with Contrastive Evide

47 Dec 19, 2022
PyTorch implementation of Wide Residual Networks with 1-bit weights by McDonnell (ICLR 2018)

1-bit Wide ResNet PyTorch implementation of training 1-bit Wide ResNets from this paper: Training wide residual networks for deployment using a single

Sergey Zagoruyko 122 Dec 07, 2022
Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI)

Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI) Preparation Clone the Synchronized-BatchNorm-P

Fangneng Zhan 12 Aug 10, 2022
Udacity Suse Cloud Native Foundations Scholarship Course Walkthrough

SUSE Cloud Native Foundations Scholarship Udacity is collaborating with SUSE, a global leader in true open source solutions, to empower developers and

Shivansh Srivastava 34 Oct 18, 2022
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp

Ryan Spring 114 Nov 04, 2022
AdamW optimizer and cosine learning rate annealing with restarts

AdamW optimizer and cosine learning rate annealing with restarts This repository contains an implementation of AdamW optimization algorithm and cosine

Maksym Pyrozhok 133 Dec 20, 2022
Gans-in-action - Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks

GANs in Action by Jakub Langr and Vladimir Bok List of available code: Chapter 2: Colab, Notebook Chapter 3: Notebook Chapter 4: Notebook Chapter 6: C

GANs in Action 914 Dec 21, 2022
PromptDet: Expand Your Detector Vocabulary with Uncurated Images

PromptDet: Expand Your Detector Vocabulary with Uncurated Images Paper Website Introduction The goal of this work is to establish a scalable pipeline

103 Dec 20, 2022
Instant neural graphics primitives: lightning fast NeRF and more

Instant Neural Graphics Primitives Ever wanted to train a NeRF model of a fox in under 5 seconds? Or fly around a scene captured from photos of a fact

NVIDIA Research Projects 10.6k Jan 01, 2023
Learned image compression

Overview Pytorch code of our recent work A Unified End-to-End Framework for Efficient Deep Image Compression. We first release the code for Variationa

Jiaheng Liu 163 Dec 04, 2022
PyTorch code accompanying our paper on Maximum Entropy Generators for Energy-Based Models

Maximum Entropy Generators for Energy-Based Models All experiments have tensorboard visualizations for samples / density / train curves etc. To run th

Rithesh Kumar 135 Oct 27, 2022
An updated version of virtual model making

Model-Swap-Face v2   这个项目是基于stylegan2 pSp制作的,比v1版本Model-Swap-Face在推理速度和图像质量上有一定提升。主要的功能是将虚拟模特进行环球不同区域的风格转换,目前转换器提供西欧模特、东亚模特和北非模特三种主流的风格样式,可帮我们实现生产资料零成

seeprettyface.com 62 Dec 09, 2022
Denoising Normalizing Flow

Denoising Normalizing Flow Christian Horvat and Jean-Pascal Pfister 2021 We combine Normalizing Flows (NFs) and Denoising Auto Encoder (DAE) by introd

CHrvt 17 Oct 15, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

Nguyễn Trường Lâu 4 Nov 24, 2022