PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

Overview

Warning: training on the master branch might collapse. To obtain results similar to those in the README, you can fall back to this commit, but remember that some ops were not correctly implemented in that commit. You should also use a lower learning rate; 1e-4 should be fine.

How to create CelebA-HQ dataset

I borrowed h5tool.py from the official code. To create the CelebA-HQ dataset, download the original CelebA dataset and the additional delta files from here. After that, run

python2 h5tool.py create_celeba_hq file_name_to_save /path/to/celeba_dataset/ /path/to/celeba_hq_deltas

This is what I used on my laptop

python2 h5tool.py create_celeba_hq /Users/yuan/Downloads/CelebA-HQ /Users/yuan/Downloads/CelebA/Original\ CelebA/ /Users/yuan/Downloads/CelebA/CelebA-HQ-Deltas

I found that the MD5 checks always failed, so I just commented out the MD5 checking part (lines 568 and 589).
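If you would rather inspect a failing checksum than skip it, a file's MD5 can be computed with Python's standard hashlib module. This is only a minimal sketch and is not code from h5tool.py; the helper name file_md5 is hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper for manual checks; reads the file in chunks
    so that large dataset files do not have to fit in memory.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

Comparing the returned digest with the expected one can show whether a download is corrupted or simply a different release of the file.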

With the default settings, it took 1 day on my server. You can specify num_threads and num_tasks for acceleration.

Training from scratch

You have to create the CelebA-HQ dataset first; please follow the instructions above.

To obtain results similar to those in the samples directory, see the train_no_tanh.py or train.py script for details (with default options). Both should work well. For example, you could run

conda create -n pytorch_p36 python=3.6 h5py matplotlib
source activate pytorch_p36
conda install pytorch torchvision -c pytorch
conda install scipy
pip install tensorflow

# 0 = first GPU, 1 = second GPU, 2 = third GPU, etc.
python train.py --gpu 0,1,2 --train_kimg 600 --transition_kimg 600 --beta1 0 --beta2 0.99 --gan lsgan --first_resol 4 --target_resol 256 --no_tanh

train_kimg and transition_kimg set the phase lengths: after seeing train_kimg * 1000 real images, training switches to the fade-in phase, and after seeing transition_kimg * 1000 real images, it switches to the stabilize phase. Currently only LSGAN and GAN with the --no_noise option are supported. Since WGAN-GP is unavailable, the --drift option does not affect the result. --no_tanh means that tanh is not used at the generator's output layer.
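The phase schedule described above can be sketched as follows. This is an illustration under stated assumptions, not the logic in train.py: the function name phase_at is hypothetical, training is assumed to start with a stabilize phase at first_resol, and the fade-in weight alpha is assumed to ramp linearly from 0 to 1.

```python
def phase_at(images_seen, train_kimg=600, transition_kimg=600,
             first_resol=4, target_resol=256):
    """Return (resolution, phase, alpha) after `images_seen` real images.

    Assumed schedule: stabilize at first_resol, then for each doubled
    resolution a fade-in phase followed by a stabilize phase.
    """
    stabilize_len = train_kimg * 1000    # images per stabilize phase
    fade_len = transition_kimg * 1000    # images per fade-in phase
    resol = first_resol
    remaining = images_seen

    # The first resolution has no fade-in; it only stabilizes.
    if remaining < stabilize_len or resol >= target_resol:
        return resol, "stabilize", 1.0
    remaining -= stabilize_len

    while resol < target_resol:
        resol *= 2
        if remaining < fade_len:
            # alpha is the weight of the new, higher-resolution layers.
            return resol, "fade in", remaining / fade_len
        remaining -= fade_len
        if remaining < stabilize_len:
            return resol, "stabilize", 1.0
        remaining -= stabilize_len

    # Past the end of the schedule: keep stabilizing at the target.
    return target_resol, "stabilize", 1.0
```

With the values from the command above (600 for both options), phase_at(900_000) falls halfway through the 8x8 fade-in phase under these assumptions.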

If you are a Python 2 user, you should add this to the top of train.py, since I use print('something...', file=f) to write experiment settings to a file.

from __future__ import print_function

Tensorboard

tensorboard --logdir='./logs'

Update history

  • Update(20171213): Updated data.py. Now, when fading in, real images are a weighted combination of current-resolution images and 0.5x-resolution images. This weighting trick is similar to the one used for the Generator's outputs or the Discriminator's inputs. It helps stabilize training when fading in.

  • Update(20171129): Added restoration mode. Besides, after many attempts, I failed to combine BEGAN and PG-GAN, so it has been removed from the repository.

  • Update(20171124): Now training with the CelebA-HQ dataset. Besides, I am still failing to introduce progressive growing to BEGAN, even with many modifications.

  • Update(20171121): Introduced progressive growing to BEGAN, see train_began.py script. However, experiments showed that it did not work at this moment. Finding bugs and tuning network structure...

  • Update(20171119): The instability came from the resize_activation function; after replacing repeat with torch.nn.functional.upsample, the problem was solved. I now believe that both train.py and train_no_tanh.py should be stable. Restored from the 128x128 stabilize phase and continued training; currently at 256x256, phase = fade in. Temporary results (the first 2 columns on the left were generated, and the other 2 columns were taken from the dataset):

  • Update(20171118): Made a mistake in the resize activation function (repeat is not the right op in this function). Although wrong, it was still effective when resolution < 256, but training collapsed at resolution >= 256. Changing it now; scripts will be updated tomorrow. Sorry for this mistake.

  • Update(20171117): 128x128 fade-in results (the first 2 columns on the left were generated, and the other 2 columns were taken from the dataset):

  • Update(20171116): Adding noise only to RGB images might still lead to collapse. Switching to the same trick as the paper suggested. Besides, the paper used a linear activation at G's output layer, which is reasonable, as I observed in the experiments. Temporary results: 64x64, phase = fade in; the left 4 columns are generated, and the right 4 columns are real samples. When fading in, instability might occur (for example, the following results are not so promising), but as training goes on, it gets better. Higher resolutions will be available soon.

  • Update(20171115): Mode collapse happened when fading in, debugging... => It turns out that instability seems to be normal when fading in; after some more iterations, it gets better. For now I'm not using the same noise-adding trick as the paper suggested. It has been implemented, however, and I will test it and plug it into the network.

  • Update(20171114): First version; it seems that the generator tends to generate white images. Debugging now. => Fixed some bugs. Now it seems normal, training... => There are some unknown problems when fading in, debugging...

  • Update(20171113): Generator and Discriminator: ok, simple test passed.

  • Update(20171112): It's now under reimplementation.

  • Update(20171111): It's still under implementation. I did not design the structure carefully, and now I have to reimplement it (phase='fade in' is hard to implement under the current structure). I also fixed some bugs; since reimplementation is needed, I do not plan to open pull requests at this moment.
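The real-image weighting mentioned in the 20171213 update can be sketched with NumPy as below. This is a hedged illustration rather than the code in data.py: the function names are hypothetical, and average pooling for the 0.5x image and nearest-neighbour upsampling back to full size are assumptions.

```python
import numpy as np


def downsample_2x(img):
    """Average-pool an (H, W, C) image by a factor of 2 (assumed method)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample_2x(img):
    """Nearest-neighbour upsampling: each pixel becomes a 2x2 block.

    np.repeat duplicates each element in place; it does not tile the
    whole array.
    """
    return img.repeat(2, axis=0).repeat(2, axis=1)


def blend_real(img, alpha):
    """Weighted combination of the current-resolution image and its
    0.5x-resolution version, with alpha as the fade-in weight."""
    low = upsample_2x(downsample_2x(img))
    return alpha * img + (1.0 - alpha) * low
```

At alpha = 0 the discriminator would see only the 0.5x-resolution content, and at alpha = 1 the unmodified image, which mirrors how the fade-in weight is applied to the Generator's outputs.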

Reference implementation
