Invertible conditional GANs for image editing

Related tags

Deep LearningIcGAN
Overview

Invertible Conditional GANs

A real image is encoded into a latent representation z and conditional information y, and then decoded into a new image. We fix z for every row, and modify y for each column to obtain variations in real samples.

This is the implementation of the IcGAN model proposed in our paper:

Invertible Conditional GANs for image editing. November 2016.

This paper is a summarized and updated version of my master thesis, which you can find here:

Master thesis: Invertible Conditional Generative Adversarial Networks. September 2016.

The baseline used is the Torch implementation of the DCGAN by Radford et al.

  1. Training the model
    1. Face dataset: CelebA
    2. Digit dataset: MNIST
  2. Visualize the results
    1. Reconstruct and modify real images
    2. Swap attributes
    3. Interpolate between faces

Requisites

Please refer to DCGAN torch repository to know the requirements and dependencies to run the code. Additionally, you will need to install the threads and optnet package:

luarocks install threads

luarocks install optnet

In order to interactively display the results, follow these steps.

1. Training the model

Model overview

The IcGAN is trained in four steps.

  1. Train the generator.
  2. Create a dataset of generated images with the generator.
  3. Train the encoder Z to map an image x to a latent representation z with the dataset generated images.
  4. Train the encoder Y to map an image x to a conditional information vector y with the dataset of real images.

All the parameters of the training phase are located in cfg/mainConfig.lua.

There is already a pre-trained model for CelebA available in case you want to skip the training part. Here you can find instructions on how to use it.

1.1 Train with a face dataset: CelebA

Note: for speed purposes, the whole dataset will be loaded into RAM during training time, which requires about 10 GB of RAM. Therefore, 12 GB of RAM is a minimum requirement. Also, the dataset will be stored as a tensor to load it faster, make sure that you have around 25 GB of free space.

Preprocess

mkdir celebA; cd celebA

Download img_align_celeba.zip here under the link "Align&Cropped Images". Also, you will need to download list_attr_celeba.txt from the same link, which is found under Anno folder.

unzip img_align_celeba.zip; cd ..
DATA_ROOT=celebA th data/preprocess_celebA.lua

Now move list_attr_celeba.txt to celebA folder.

mv list_attr_celeba.txt celebA

Training

  • Conditional GAN: parameters are already configured to run CelebA (dataset=celebA, dataRoot=celebA).

     th trainGAN.lua
  • Generate encoder dataset:

     net=[GENERATOR_PATH] outputFolder=celebA/genDataset/ samples=182638 th data/generateEncoderDataset.lua

    (GENERATOR_PATH example: checkpoints/celebA_25_net_G.t7)

  • Train encoder Z:

     datasetPath=celebA/genDataset/ type=Z th trainEncoder.lua
    
  • Train encoder Y:

     datasetPath=celebA/ type=Y th trainEncoder.lua
    

1.2 Train with a digit dataset: MNIST

Preprocess

Download MNIST as a luarocks package: luarocks install mnist

Training

  • Conditional GAN:

     name=mnist dataset=mnist dataRoot=mnist th trainGAN.lua
  • Generate encoder dataset:

     net=[GENERATOR_PATH] outputFolder=mnist/genDataset/ samples=60000 th data/generateEncoderDataset.lua

    (GENERATOR_PATH example: checkpoints/mnist_25_net_G.t7)

  • Train encoder Z:

     datasetPath=mnist/genDataset/ type=Z th trainEncoder.lua
    
  • Train encoder Y:

     datasetPath=mnist type=Y th trainEncoder.lua
    

2 Pre-trained CelebA model:

CelebA model is available for download here. The file includes the generator and both encoders (encoder Z and encoder Y).

3. Visualize the results

For visualizing the results you will need an already trained IcGAN (i.e. a generator and two encoders). The parameters for generating results are in cfg/generateConfig.lua.

3.1 Reconstruct and modify real images

Reconstrucion example

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 loadPath=[PATH_TO_REAL_IMAGES] th generation/reconstructWithVariations.lua

3.2 Swap attributes

Swap attributes

Swap the attribute information between two pairs of faces.

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 im1Path=[IM1] im2Path=[IM2] th generation/attributeTransfer.lua

3.3 Interpolate between faces

Interpolation

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 im1Path=[IM1] im2Path=[IM2] th generation/interpolate.lua

Do you like or use our work? Please cite us as

@inproceedings{Perarnau2016,
  author    = {Guim Perarnau and
               Joost van de Weijer and
               Bogdan Raducanu and
               Jose M. \'Alvarez},
  title     = {{Invertible Conditional GANs for image editing}},
  booktitle   = {NIPS Workshop on Adversarial Training},
  year      = {2016},
}
Owner
Guim
Guim
Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers

Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers This is the repo used for human motion prediction with non-autoregress

Idiap Research Institute 26 Dec 14, 2022
pytorch implementation of the ICCV'21 paper "MVTN: Multi-View Transformation Network for 3D Shape Recognition"

MVTN: Multi-View Transformation Network for 3D Shape Recognition (ICCV 2021) By Abdullah Hamdi, Silvio Giancola, Bernard Ghanem Paper | Video | Tutori

Abdullah Hamdi 64 Jan 03, 2023
Image data augmentation scheduler for albumentations transforms

albu_scheduler Scheduler for albumentations transforms based on PyTorch schedulers interface Usage TransformMultiStepScheduler import albumentations a

19 Aug 04, 2021
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022
Contrastive Learning with Non-Semantic Negatives

Contrastive Learning with Non-Semantic Negatives This repository is the official implementation of Robust Contrastive Learning Using Negative Samples

39 Jul 31, 2022
Multimodal Co-Attention Transformer (MCAT) for Survival Prediction in Gigapixel Whole Slide Images

Multimodal Co-Attention Transformer (MCAT) for Survival Prediction in Gigapixel Whole Slide Images [ICCV 2021] © Mahmood Lab - This code is made avail

Mahmood Lab @ Harvard/BWH 63 Dec 01, 2022
Pytorch implementation of COIN, a framework for compression with implicit neural representations 🌸

COIN 🌟 This repo contains a Pytorch implementation of COIN: COmpression with Implicit Neural representations, including code to reproduce all experim

Emilien Dupont 104 Dec 14, 2022
tinykernel - A minimal Python kernel so you can run Python in your Python

tinykernel - A minimal Python kernel so you can run Python in your Python

fast.ai 37 Dec 02, 2022
Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

ZePHyR: Zero-shot Pose Hypothesis Rating ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compare

R-Pad - Robots Perceiving and Doing 18 Aug 22, 2022
Data Preparation, Processing, and Visualization for MoVi Data

MoVi-Toolbox Data Preparation, Processing, and Visualization for MoVi Data, https://www.biomotionlab.ca/movi/ MoVi is a large multipurpose dataset of

Saeed Ghorbani 51 Nov 27, 2022
Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

UniFuse (RAL+ICRA2021) Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation, arXiv, Demo Preparation I

Alibaba 47 Dec 26, 2022
Neighborhood Contrastive Learning for Novel Class Discovery

Neighborhood Contrastive Learning for Novel Class Discovery This repository contains the official implementation of our paper: Neighborhood Contrastiv

Zhun Zhong 56 Dec 09, 2022
The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.

FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu

9 Nov 27, 2022
Code for the paper "Learning-Augmented Algorithms for Online Steiner Tree"

Learning-Augmented Algorithms for Online Steiner Tree This is the code for the paper "Learning-Augmented Algorithms for Online Steiner Tree". Requirem

0 Dec 09, 2021
classification task on dataset-CIFAR10,by using Tensorflow/keras

CIFAR10-Tensorflow classification task on dataset-CIFAR10,by using Tensorflow/keras 在这一个库中,我使用Tensorflow与keras框架搭建了几个卷积神经网络模型,针对CIFAR10数据集进行了训练与测试。分别使

3 Oct 17, 2021
the code of the paper: Recurrent Multi-view Alignment Network for Unsupervised Surface Registration (CVPR 2021)

RMA-Net This repo is the implementation of the paper: Recurrent Multi-view Alignment Network for Unsupervised Surface Registration (CVPR 2021). Paper

Wanquan Feng 205 Nov 09, 2022
使用yolov5训练自己数据集(详细过程)并通过flask部署

使用yolov5训练自己的数据集(详细过程)并通过flask部署 依赖库 torch torchvision numpy opencv-python lxml tqdm flask pillow tensorboard matplotlib pycocotools Windows,请使用 pycoc

HB.com 19 Dec 28, 2022
Deep Learning Training Scripts With Python

Deep Learning Training Scripts DNN Frameworks Caffe PyTorch Tensorflow CNN Models VGG ResNet DenseNet Inception Language Modeling GatedCNN-LM Attentio

Multicore Computing Research Lab 16 Dec 15, 2022
Whisper is a file-based time-series database format for Graphite.

Whisper Overview Whisper is one of three components within the Graphite project: Graphite-Web, a Django-based web application that renders graphs and

Graphite Project 1.2k Dec 25, 2022
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation Demo video: CVPR 2021 Oral: Single Channel Manipulation: Localized or attribu

Zongze Wu 267 Dec 30, 2022