Official Chainer implementation of GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

Last update: Dec 27, 2022

Overview

GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

[Project] [Paper] [Demo] [Related Work: A2RL (for Auto Image Cropping)] [Colab]

[Figure: example source, destination, mask, composited, and blended images]

The author's implementation of GP-GAN, the high-resolution image blending algorithm described in:
"GP-GAN: Towards Realistic High-Resolution Image Blending"
Huikai Wu, Shuai Zheng, Junge Zhang, Kaiqi Huang

Given a mask, our algorithm blends the source image and the destination image, generating a high-resolution and realistic blended image. Our algorithm is based on a deep generative model, Wasserstein GAN.
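For intuition about the mask's role, here is a minimal sketch of the naive copy-and-paste composite, which takes source pixels where the mask is set and destination pixels elsewhere. This is not GP-GAN itself, only the kind of composite a blending method starts from. It assumes single-channel images stored as equally sized nested lists and mask values in [0, 1]; the function name composite is hypothetical.

```python
def composite(src, dst, mask):
    """Naive copy-and-paste: mask * src + (1 - mask) * dst, per pixel.

    Assumes src, dst, and mask are equally sized 2-D lists of numbers,
    with mask values in [0, 1] (1 selects the source pixel).
    """
    if not (len(src) == len(dst) == len(mask)):
        raise ValueError("src, dst, and mask must have the same height")
    result = []
    for src_row, dst_row, mask_row in zip(src, dst, mask):
        if not (len(src_row) == len(dst_row) == len(mask_row)):
            raise ValueError("src, dst, and mask must have the same width")
        result.append([m * s + (1 - m) * d
                       for s, d, m in zip(src_row, dst_row, mask_row)])
    return result


src = [[10, 10], [10, 10]]
dst = [[200, 200], [200, 200]]
mask = [[1, 0], [0, 1]]
print(composite(src, dst, mask))  # [[10, 200], [200, 10]]
```

A composite like this typically shows a hard seam along the mask boundary, which is the artifact a blending method aims to remove.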

Contact: Hui-Kai Wu ([email protected])

Citation

@article{wu2017gp,
  title   = {GP-GAN: Towards Realistic High-Resolution Image Blending},
  author  = {Wu, Huikai and Zheng, Shuai and Zhang, Junge and Huang, Kaiqi},
  journal = {ACMMM},
  year    = {2019}
}

Getting started

The code is tested with python==3.5 and chainer==6.3.0 on Ubuntu 16.04 LTS.

Download the code from GitHub:

git clone https://github.com/wuhuikai/GP-GAN.git
cd GP-GAN

Install the requirements:

pip install -r requirements/test/requirements.txt

Download the pretrained model blending_gan.npz or unsupervised_blending_gan.npz from Google Drive, and put it in the models folder.

Run the script for blending_gan.npz:

python run_gp_gan.py --src_image images/test_images/src.jpg --dst_image images/test_images/dst.jpg --mask_image images/test_images/mask.png --blended_image images/test_images/result.png

Or run the script for unsupervised_blending_gan.npz:

python run_gp_gan.py --src_image images/test_images/src.jpg --dst_image images/test_images/dst.jpg --mask_image images/test_images/mask.png --blended_image images/test_images/result.png --supervised False

Type python run_gp_gan.py --help for a complete list of the arguments.
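If you prefer to call the script from Python, the commands above could be wrapped roughly as follows. This is a sketch, not part of the repository: the helper names build_command, model_path, and blend are hypothetical, only the flags shown above are used, and it assumes the script is run from the repository root with the pretrained model stored in the models folder.

```python
import os
import subprocess
import sys


def build_command(src, dst, mask, out, supervised=True):
    """Assemble the argv list for run_gp_gan.py from the documented flags."""
    cmd = [sys.executable, "run_gp_gan.py",
           "--src_image", src,
           "--dst_image", dst,
           "--mask_image", mask,
           "--blended_image", out]
    if not supervised:
        # Documented switch for unsupervised_blending_gan.npz.
        cmd += ["--supervised", "False"]
    return cmd


def model_path(supervised=True, models_dir="models"):
    """Return the pretrained model file expected for the chosen mode."""
    name = "blending_gan.npz" if supervised else "unsupervised_blending_gan.npz"
    return os.path.join(models_dir, name)


def blend(src, dst, mask, out, supervised=True):
    """Check that the model file exists, then run the script (assumed cwd: repo root)."""
    path = model_path(supervised)
    if not os.path.isfile(path):
        raise FileNotFoundError("Pretrained model not found: " + path)
    subprocess.run(build_command(src, dst, mask, out, supervised), check=True)


# Print the argument list (after the interpreter path) without running anything.
print(build_command("images/test_images/src.jpg", "images/test_images/dst.jpg",
                    "images/test_images/mask.png", "images/test_images/result.png",
                    supervised=False)[1:])
```

Calling blend(...) with the same arguments runs the script and raises an error early if the model file has not been downloaded yet.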

Train GP-GAN step by step

Train Blending GAN

Download the Transient Attributes Dataset.

Crop the images in each subfolder:

python crop_aligned_images.py --data_root [Path for imageAlignedLD in Transient Attributes Dataset]

Train Blending GAN:

python train_blending_gan.py --data_root [Path for cropped aligned images of Transient Attributes Dataset]

[Figures: training curve; visual results on the training set and the validation set]

Train Unsupervised Blending GAN

Requirements

pip install git+git://github.com/mila-udem/[email protected]

Download the hdf5 dataset of outdoor natural images: outdoor_64.hdf5 (1.4 GB), which contains 150K landscape images from the MIT Places dataset.

Train unsupervised Blending GAN:

python train_wasserstein_gan.py --data_root [Path for outdoor_64.hdf5]
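The script name suggests this stage trains a Wasserstein GAN. As a rough illustration of that objective, the sketch below follows the original WGAN formulation: the critic is trained to widen the gap between its mean score on real samples and on generated samples, and its weights are clipped to a small range. It is a generic sketch, not code from this repository; the helper names and the clipping bound of 0.01 are assumptions for illustration.

```python
def critic_loss(real_scores, fake_scores):
    """Loss the critic minimizes: mean(D(fake)) - mean(D(real))."""
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


def generator_loss(fake_scores):
    """Loss the generator minimizes: -mean(D(fake))."""
    return -sum(fake_scores) / len(fake_scores)


def clip_weights(weights, bound=0.01):
    """Clamp each weight to [-bound, bound] (weight clipping)."""
    return [max(-bound, min(bound, w)) for w in weights]


print(critic_loss([1.0, 3.0], [0.0, 1.0]))   # -1.5
print(generator_loss([0.0, 1.0]))            # -0.5
print(clip_weights([0.5, -0.002, -3.0]))     # [0.01, -0.002, -0.01]
```

In a real training loop these scores would come from the critic network evaluated on mini-batches, and clipping would be applied to its parameters after each critic update.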

[Figures: training curve; samples after training]

Visual results

[Figure: comparison of Mask, Copy-and-Paste, Modified-Poisson, Multi-splines, Supervised GP-GAN, and Unsupervised GP-GAN results]
