PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Last update: Dec 16, 2022

Related tags

Deep Learning StackGAN-v2

Overview

StackGAN-v2

PyTorch implementation for reproducing the StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang*, Tao Xu*, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

Python 2.7

PyTorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

tensorboard
python-dateutil
easydict
pandas
torchfile
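
As a quick sanity check after installing, the following minimal sketch reports which of the listed dependencies cannot be imported. The helper name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical and not part of the project; the import names are assumptions based on how these packages are usually imported (for example, python-dateutil is imported as `dateutil`).

```python
import importlib

# Import names assumed for the packages listed above; python-dateutil is
# imported as `dateutil` and PyTorch as `torch`.
REQUIRED_MODULES = ["torch", "tensorboard", "dateutil", "easydict", "pandas", "torchfile"]


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when everything is importable in the current environment.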

Data

Download our preprocessed char-CNN-RNN text embeddings for birds and save them to data/.

[Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.

Download the birds image data and extract it to data/birds/
Download the ImageNet dataset and extract the images to data/imagenet/
Download the LSUN dataset and save the images to data/lsun/
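
The directory layout described above can be checked before training with a small sketch like the one below. The helper `missing_data_dirs` is hypothetical and not part of the project; only the three directory names are taken from the steps above.

```python
import os

# Directories named in the steps above, relative to the project root.
EXPECTED_DIRS = ["data/birds", "data/imagenet", "data/lsun"]


def missing_data_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected data directories that do not exist under project_root."""
    return [
        d for d in expected
        if not os.path.isdir(os.path.join(project_root, *d.split("/")))
    ]
```

For example, `missing_data_dirs(".")` run from the project folder lists whichever dataset directories are still absent. It only checks that the folders exist, not that their contents are complete.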

Training

Train a StackGAN-v2 model on the bird (CUB) dataset using our preprocessed embeddings:
- python main.py --cfg cfg/birds_3stages.yml --gpu 0
Train a StackGAN-v2 model on the ImageNet dog subset:
- python main.py --cfg cfg/dog_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the ImageNet cat subset:
- python main.py --cfg cfg/cat_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the LSUN bedroom subset:
- python main.py --cfg cfg/bedroom_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the LSUN church subset:
- python main.py --cfg cfg/church_3stages_color.yml --gpu 0
*.yml files are example configuration files for training/evaluating our models.
If you want to try your own datasets, here are some good tips on how to train GANs. We also encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
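
The training commands above all follow the same pattern, so they can be assembled programmatically. This is a minimal sketch: the dataset labels and the helper `build_train_command` are hypothetical, and only the configuration paths and the `--cfg` / `--gpu` flags come from the commands listed above.

```python
# Configuration files named in the commands above, keyed by a short label.
# The labels are made up for this sketch; the .yml paths are from the README.
TRAIN_CONFIGS = {
    "birds": "cfg/birds_3stages.yml",
    "dog": "cfg/dog_3stages_color.yml",
    "cat": "cfg/cat_3stages_color.yml",
    "bedroom": "cfg/bedroom_3stages_color.yml",
    "church": "cfg/church_3stages_color.yml",
}


def build_train_command(dataset, gpu=0):
    """Return the argument list for training on `dataset` using GPU id `gpu`."""
    if dataset not in TRAIN_CONFIGS:
        raise ValueError("unknown dataset label: %r" % (dataset,))
    return ["python", "main.py", "--cfg", TRAIN_CONFIGS[dataset], "--gpu", str(gpu)]
```

The returned list can be passed to `subprocess.check_call` from the project folder, which avoids shell quoting issues when looping over several datasets.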

Pretrained Model

StackGAN-v2 for bird. Download and save it to models/ (the inception score for this model is 4.04±0.05)
StackGAN-v2 for dog. Download and save it to models/ (the inception score for this model is 9.55±0.11)
StackGAN-v2 for cat. Download and save it to models/
StackGAN-v2 for bedroom. Download and save it to models/
StackGAN-v2 for church. Download and save it to models/
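
For readers unfamiliar with the metric quoted above, the inception score is conventionally defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity from a list of probability vectors. It illustrates the standard definition only and is not the evaluation code used to produce the numbers above; the function name `inception_score` and its input format are assumptions of this sketch.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities.

    `probs` is a list of rows, each row a list of class probabilities
    summing to 1 for one generated image.
    """
    n = float(len(probs))
    num_classes = len(probs[0])
    # Marginal class distribution p(y): the mean of the per-image predictions.
    marginal = [sum(row[j] for row in probs) / n for j in range(num_classes)]
    total_kl = 0.0
    for row in probs:
        for p, m in zip(row, marginal):
            if p > 0:  # terms with p == 0 contribute nothing to the KL sum
                total_kl += p * math.log(p / m)
    return math.exp(total_kl / n)
```

The score is 1 when every image gets the same prediction and approaches the number of classes when predictions are confident and spread evenly across classes.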

Evaluating

Run python main.py --cfg cfg/eval_birds.yml --gpu 1 to generate samples from captions in the birds validation set.
Change the eval_*.yml files to generate images from other pre-trained models.

Examples generated by StackGAN-v2

t-SNE visualization of randomly generated birds, dogs, cats, churches, and bedrooms

Citing StackGAN++

If you find StackGAN useful in your research, please consider citing:

@article{Han17stackgan2,
  author    = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
  title     = {StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks},
  journal   = {arXiv: 1710.10916},
  year      = {2017},
}

@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}

Our follow-up work

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [Supplementary][code]

References

Generative Adversarial Text-to-Image Synthesis Paper Code
Learning Deep Representations of Fine-grained Visual Descriptions Paper Code

Owner

Han Zhang
