CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"

Overview

Diverse Structure Inpainting

ArXiv | Paper | Supplementary Material | BibTex

This repository is for the CVPR 2021 paper, "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE".

If our method is useful for your research, please consider citing.

Introduction

(Top) Input incomplete image, where the missing region is depicted in gray. (Middle) Visualization of the generated diverse structures. (Bottom) Output images of our method.

Places2 Results

Results on the Places2 validation set using the center-mask Places2 model.

CelebA-HQ Results

Results on one CelebA-HQ test image with different holes using the random-mask CelebA-HQ model.

Installation

This code was tested with TensorFlow 1.12.0 (later 1.x versions may work, but not 2.x), CUDA 9.0, Python 3.6, and Ubuntu 16.04.

Clone this repository:

git clone https://github.com/USTC-JialunPeng/Diverse-Structure-Inpainting.git

Datasets

  • CelebA-HQ: the high-resolution face images from Growing GANs. 24,183 images for training, 2,993 for validation, and 2,824 for testing.
  • Places2: the challenge data from 365 scene categories. 8 million images for training, 36K for validation, and 328K for testing.
  • ImageNet: the data from 1,000 natural categories. 1 million images for training and 50K for validation.

Training

  • Collect the dataset. For CelebA-HQ, we collect the 1024x1024 version. For Places2 and ImageNet, we collect the original version.
  • Prepare the file list. Collect the path of each image into a text file, one path per line (each line ends with a newline except the last); see the first sketch after this list.
  • Modify checkpoints_dir, dataset, train_flist and valid_flist arguments in train_vqvae.py, train_structure_generator.py and train_texture_generator.py.
  • Modify data/data_loader.py according to the dataset. For CelebA-HQ, we resize each image to 266x266 and randomly crop a 256x256 patch. For Places2 and ImageNet, we randomly crop a 256x256 patch. A cropping sketch is shown after this list.
  • Run python train_vqvae.py to train VQ-VAE.
  • Modify vqvae_network_dir argument in train_structure_generator.py and train_texture_generator.py based on the path of pre-trained VQ-VAE.
  • Modify the mask setting arguments in train_structure_generator.py and train_texture_generator.py to choose center mask or random mask.
  • Run python train_structure_generator.py to train the structure generator.
  • Run python train_texture_generator.py to train the texture generator.
  • Modify structure_generator_dir and texture_generator_dir arguments in save_full_model.py based on the paths of pre-trained structure generator and texture generator.
  • Run python save_full_model.py to save the whole model.
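A minimal sketch of the "Prepare the file list" step, assuming the images live under a single directory tree; the directory names, extensions, and output paths below are illustrative only:

```python
# Hypothetical helper for the file-list step: walk a dataset directory and
# write one image path per line, with no newline after the last line.
import os

def write_flist(image_dir, flist_path, exts=(".jpg", ".jpeg", ".png")):
    paths = sorted(
        os.path.join(root, name)
        for root, _, names in os.walk(image_dir)
        for name in names
        if name.lower().endswith(exts)
    )
    with open(flist_path, "w") as f:
        f.write("\n".join(paths))

write_flist("celeba_hq/train", "celeba_hq_train.flist")
write_flist("celeba_hq/valid", "celeba_hq_valid.flist")
```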
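A minimal sketch of the CelebA-HQ preprocessing described above (resize to 266x266, then a random 256x256 crop); the helper name and interpolation mode are assumptions, not the repository's data_loader.py:

```python
# Hypothetical preprocessing matching the CelebA-HQ setting: resize to
# 266x266, then take a random 256x256 crop.
import numpy as np
from PIL import Image

def random_crop_celeba_hq(path, rng=np.random.default_rng()):
    img = Image.open(path).convert("RGB").resize((266, 266), Image.BILINEAR)
    top = int(rng.integers(0, 266 - 256 + 1))
    left = int(rng.integers(0, 266 - 256 + 1))
    return np.asarray(img)[top:top + 256, left:left + 256]
```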

Testing

  • Collect the testing set. For CelebA-HQ, we resize each image to 256x256. For Places2 and ImageNet, we crop a center 256x256 patch.
  • Collect the corresponding mask set (2D grayscale, where 0 indicates the known region and 255 indicates the missing region); see the sketch after this list.
  • Prepare the image file list and the mask file list in the same way as for training.
  • Modify checkpoints_dir, dataset, img_flist and mask_flist arguments in test.py.
  • Download the pre-trained model and put model.ckpt.meta, model.ckpt.index, model.ckpt.data-00000-of-00001 and checkpoint under the model_logs/ directory.
  • Run python test.py.
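A minimal sketch of a center mask in the format described above (0 = known region, 255 = missing region); the hole size and output path are assumptions:

```python
# Hypothetical example: a 256x256 grayscale mask with a center 128x128 hole.
import numpy as np
from PIL import Image

mask = np.zeros((256, 256), dtype=np.uint8)
mask[64:192, 64:192] = 255  # 255 marks the missing region
Image.fromarray(mask, mode="L").save("center_mask.png")
```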

Pre-trained Models

Download the pre-trained models using the following links and put them under the model_logs/ directory.

The center_mask models are trained on 256x256 images with center 128x128 holes. The random_mask models are trained with random regular and irregular holes.

Inference Time

One advantage of GAN-based and VAE-based methods is their fast inference speed. We measure that Mutual Encoder-Decoder with Feature Equalizations runs at 0.2 seconds per image on a single NVIDIA 1080 Ti GPU for 256×256 images, whereas our model takes 45 seconds per image. Naively sampling our autoregressive network is the main source of computational time. Fortunately, this time can be reduced by an order of magnitude using an incremental sampling technique that caches and reuses intermediate states of the network; consider using it for faster inference. A toy sketch of the idea follows.
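The following is a toy NumPy illustration of the caching idea, not the paper's implementation: each layer's previous state is cached, so producing one new element costs one update per layer instead of re-running the network over the whole history.

```python
# Toy illustration (not the repository's code) of incremental autoregressive
# sampling: per-layer states are cached and reused across sampling steps.
import numpy as np

rng = np.random.default_rng(0)
num_layers, seq_len, hidden = 4, 64, 16
weights = [rng.standard_normal((2 * hidden, hidden)) * 0.1 for _ in range(num_layers)]

def layer_step(x, state, w):
    # One causal update: combine the new input with the layer's cached state.
    return np.tanh(np.concatenate([x, state]) @ w)

cache = [np.zeros(hidden) for _ in range(num_layers)]  # cached intermediate states
x = np.zeros(hidden)
outputs = []
for _ in range(seq_len):
    h = x
    for layer in range(num_layers):
        h = layer_step(h, cache[layer], weights[layer])
        cache[layer] = h  # reuse this state at the next sampling step
    x = h                 # feed the result back in autoregressively
    outputs.append(h)
```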
