[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

Related tags

Deep Learninginsgen
Overview

InsGen - Data-Efficient Instance Generation from Instance Discrimination

image

Data-Efficient Instance Generation from Instance Discrimination
Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou
arXiv preprint arXiv: 2106.04566

[Paper] [Project Page]

In this work, we develop a novel data-efficient Instance Generation (InsGen) method for training GANs with limited data. With the instance discrimination as an auxiliary task, our method makes the best use of both real and fake images to train the discriminator. The discriminator in turn guides the generator to synthesize as many diverse images as possible. Experiments under different data regimes show that InsGen brings a substantial improvement over the baseline in terms of both image quality and image diversity, and outperforms previous data augmentation algorithms by a large margin.

Qualitative results

Here we provide some synthesized samples with different numbers of training images and correspoding FID. Full codebase and weights are coming soon. image

Inference

Here, all pretrained models can be downloaded from Google Drive:

Model FID Link
AFHQ512-CAT 2.60 link
AFHQ512-DOG 5.44 link
AFHQ512-WILD 1.77 link
Model FID Link
FFHQ256-2K 11.92 link
FFHQ256-10K 4.90 link
FFHQ256-140K 3.31 link

You can download one of them and put it under MODEL_ZOO directory, then synthesize images via

# Generate AFHQ512-CAT with truncation.
python generate.py --network=${MODEL_ZOO}/afhqcat.pkl \
                   --outdir=${TARGET_DIR} \
                   --trunc=0.7 \
                   --seeds=0-10

Training

This repository is built based on styleGAN2-ada-pytorch. Therefore, please prepare datasets first use dataset_tool.py. On top of Generative Adversarial Networks (GANs), we introduce contrastive loss into the training of discriminator, following MoCo. Concretely, the discriminator is used to extract features from images (either real or synthesized) and then trained with an auxiliary task by distinguishing every individual image.

As described in training/contrastive_head.py, we add two addition heads on top of the original discriminator. These two heads are used to project features extracted from real and fake data onto a unit ball respectively. More details can be found in paper. Note that InsGen can be easily applied to any GAN model by merely introducing two contrastive heads. According to MoCo, the feature extractor should be updated in a momentum manner. Here, in InsGen, the contrastive heads are updated in the forward() function, while the discriminator is updated in training/training_loop.py (see D_ema).

Please use the following command to start your own training:

python train.py --gpus=8 \
                --data=${DATA_PATH} \
                --cfg=paper256 \
                --outdir=training_example

In this example, the results are saved to a created director training_example. --cfg specifies the training configuration, which can be further customized with additional options:

  • --no_insgen disables InsGen, back to original StyleGAN2-ADA.
  • --rqs overrides the number of real image queue size. (default: 5% of the total number of training samples)
  • --fqs overrides the number of fake image queue size. More samples are beneficial, especially when the training samples are limited. (default: 5% of the total number of training samples)
  • --gamma overrides the R1 gamma (i.e., gradient penalty). As described in styleGAN2-ada-pytorch, training can be sensitive to this hyper-parameter. It would be better to try some different values. Here, we recommend using a smaller one than that in original StyleGAN2-ADA.

More functions would be supported after this projest is merged into our genforce. Please stay tuned!

License

This work is made available under the Nvidia Source Code License.

Acknowledgements

We thank Janne Hellsten and Tero Karras for the pytorch version codebase of their styleGAN2-ada-pytorch.

BibTeX

@article{yang2021insgen,
  title   = {Data-Efficient Instance Generation from Instance Discrimination},
  author  = {Yang, Ceyuan and Shen, Yujun and Xu, Yinghao and Zhou, Bolei},
  journal = {arXiv preprint arXiv:2106.04566},
  year    = {2021}
}
Owner
GenForce: May Generative Force Be with You
Research on Generative Modeling in Zhou Group
GenForce: May Generative Force Be with You
PIXIE: Collaborative Regression of Expressive Bodies

PIXIE: Collaborative Regression of Expressive Bodies [Project Page] This is the official Pytorch implementation of PIXIE. PIXIE reconstructs an expres

Yao Feng 331 Jan 04, 2023
The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

miseval: a metric library for Medical Image Segmentation EVALuation The open-source and free to use Python package miseval was developed to establish

59 Dec 10, 2022
[AAAI 2021] EMLight: Lighting Estimation via Spherical Distribution Approximation and [ICCV 2021] Sparse Needlets for Lighting Estimation with Spherical Transport Loss

EMLight: Lighting Estimation via Spherical Distribution Approximation (AAAI 2021) Update 12/2021: We release our Virtual Object Relighting (VOR) Datas

Fangneng Zhan 144 Jan 06, 2023
Pytorch Implementation of Auto-Compressing Subset Pruning for Semantic Image Segmentation

Pytorch Implementation of Auto-Compressing Subset Pruning for Semantic Image Segmentation Introduction ACoSP is an online pruning algorithm that compr

Merantix 8 Dec 07, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022
CLOOB training (JAX) and inference (JAX and PyTorch)

cloob-training Pretrained models There are two pretrained CLOOB models in this repo at the moment, a 16 epoch and a 32 epoch ViT-B/16 checkpoint train

Katherine Crowson 64 Nov 27, 2022
The source codes for ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'

BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data This repository provides the implementation details for

124 Dec 27, 2022
Real-Time Seizure Detection using EEG: A Comprehensive Comparison of Recent Approaches under a Realistic Setting

Real-Time Seizure Detection using Electroencephalogram (EEG) This is the repository for "Real-Time Seizure Detection using EEG: A Comprehensive Compar

AITRICS 30 Dec 17, 2022
MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation This repo is the official implementation of "MHFormer: Multi-Hypothesis Transforme

Vegetabird 281 Jan 07, 2023
[NeurIPS 2020] This project provides a strong single-stage baseline for Long-Tailed Classification, Detection, and Instance Segmentation (LVIS).

A Strong Single-Stage Baseline for Long-Tailed Problems This project provides a strong single-stage baseline for Long-Tailed Classification (under Ima

Kaihua Tang 514 Dec 23, 2022
Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

31 Dec 07, 2022
Show-attend-and-tell - TensorFlow Implementation of "Show, Attend and Tell"

Show, Attend and Tell Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attent

Yunjey Choi 902 Nov 29, 2022
Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

Daniele Grattarola 37 Dec 13, 2022
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is official implement of "

程星 87 Dec 24, 2022
Visualizer for neural network, deep learning, and machine learning models

Netron is a viewer for neural network, deep learning and machine learning models. Netron supports ONNX (.onnx, .pb, .pbtxt), Keras (.h5, .keras), Tens

Lutz Roeder 21k Jan 06, 2023
This is an example of object detection on Micro bacterium tuberculosis using Mask-RCNN

Mask-RCNN on Mycobacterium tuberculosis This is an example of object detection on Mycobacterium Tuberculosis using Mask RCNN. Implement of Mask R-CNN

Jun-En Ding 1 Sep 16, 2021
[Nature Machine Intelligence' 21] "Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence"

[UCADI] COVID-19 Diagnosis With Federated Learning Intro We developed a Federated Learning (FL) Framework for global researchers to collaboratively tr

HUST EIC AI-LAB 30 Dec 12, 2022
Data visualization app for H&M competition in kaggle

handm_data_visualize_app Data visualization app by streamlit for H&M competition in kaggle. competition page: https://www.kaggle.com/competitions/h-an

Kyohei Uto 12 Apr 30, 2022
Learning Generative Models of Textured 3D Meshes from Real-World Images, ICCV 2021

Learning Generative Models of Textured 3D Meshes from Real-World Images This is the reference implementation of "Learning Generative Models of Texture

Dario Pavllo 115 Jan 07, 2023
Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023