Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks


MGANs

Training & testing code (Torch), pre-trained models, and supplementary materials for "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks".

See this video for a quick explanation of our method and results.

Setup

Building Torch against the latest CUDA can be troublesome, so we recommend the following steps to anyone who wants to reproduce the results. The setup has been tested on Ubuntu with CUDA 10.

Step One: Install CUDA 10 and CUDNN 7.6.2

If you have a fresh Ubuntu installation, we recommend Lambda Stack, which installs the latest drivers, libraries, and frameworks for deep learning. Otherwise, you can install the CUDA toolkit and CUDNN from these links:

Step Two: Install Torch

git clone https://github.com/nagadomi/distro.git ~/torch --recursive
cd ~/torch
./install-deps
./clean.sh
./update.sh

. ~/torch/install/bin/torch-activate
sudo apt-get install libprotobuf-dev protobuf-compiler
luarocks install loadcaffe
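
As a quick sanity check (our own suggestion, not part of the repo's instructions), you can verify that the main packages load from the Torch REPL:

th -e "require 'cutorch'; require 'nn'; require 'loadcaffe'; print('Torch setup looks OK')"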

Demo

cd code
th demo_MGAN.lua

Training

Simply cd into the "code/" folder and run the training script.

th train.lua

The current script is an example of training a network from 100 ImageNet photos and a single painting by Van Gogh. The input data are organized in the following way (a folder sketch follows the list):

  • "Dataset/VG_Alpilles_ImageNet100/ContentInitial": 5 training ImageNet photos to initialize the discriminator.
  • "Dataset/VG_Alpilles_ImageNet100/ContentTrain": 100 training ImageNet photos.
  • "Dataset/VG_Alpilles_ImageNet100/ContentTest": 10 testing ImageNet photos (for later inspection).
  • "Dataset/VG_Alpilles_ImageNet100/Style": Van Gogh's painting.

The training process has three main steps:

  • Use MDAN to generate training images (MDAN_wrapper.lua).
  • Data Augmentation (AG_wrapper.lua).
  • Train MGAN (MGAN_wrapper.lua).

Testing

The testing process has two steps (see the combined commands after the list):

  • Step 1: call "th release_MGAN.lua" to concatenate the VGG encoder with the generator.
  • Step 2: call "th demo_MGAN.lua" to test the network with new photos.
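
Assuming both scripts live in "code/" like the demo (an assumption based on the Demo section above), the two steps combine into:

cd code
th release_MGAN.lua   # step 1: concatenate the VGG encoder with the generator
th demo_MGAN.lua      # step 2: test the released network on new photos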

Display

You can use the browser-based display package to monitor the training process of both MDANs and MGANs; a combined example follows the list below.

  • Install: luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
  • Call: th -ldisplay.start
  • See results at this URL: http://localhost:8000
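
Put together, a typical display session looks like this (running the server in a separate terminal is our suggestion; port 8000 is the package default mentioned above):

luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
th -ldisplay.start    # start the display server and keep it running, e.g. in a separate terminal
# then open http://localhost:8000 in a browser and launch training as usual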

Example

We chose Van Gogh's "Olive Trees with the Alpilles in the Background" as the reference texture.

We then transfer 100 ImageNet photos into the same style with the proposed MDANs method. MDANs take an iterative deconvolutional approach similar to "A Neural Algorithm of Artistic Style" by Leon A. Gatys et al. and our previous work CNNMRF. The difference is that MDANs use adversarial training instead of Gaussian statistics ("A Neural Algorithm of Artistic Style") or nearest-neighbour search (CNNMRF). Here are some transferred results from MDANs:
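
To make the contrast concrete, here is a minimal, hypothetical Torch sketch of a Markovian (patch-based) discriminator: a small fully convolutional network whose every output unit scores one local neural patch of a VGG feature map as style/non-style. The layer sizes are illustrative assumptions, not the architecture used in this repo; adversarial training then pits such a discriminator against patches from the synthesized image, rather than matching Gram-matrix statistics or searching for nearest-neighbour patches.

require 'nn'

-- each spatial output unit sees one local patch of the input feature map
local function patchDiscriminator(nInputPlane)
  local d = nn.Sequential()
  d:add(nn.SpatialConvolution(nInputPlane, 256, 3, 3, 1, 1, 1, 1))
  d:add(nn.LeakyReLU(0.2, true))
  d:add(nn.SpatialConvolution(256, 1, 3, 3, 1, 1, 1, 1))
  d:add(nn.Sigmoid())  -- per-patch "is this the target style?" score
  return d
end

-- example: score a relu3_1-like feature map of a 384x384 image (256 channels, 96x96 spatial)
local D = patchDiscriminator(256)
local feat = torch.rand(1, 256, 96, 96)
local scores = D:forward(feat)  -- 1 x 1 x 96 x 96 map of patch scores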

The results look nice, so we know adversarial training can produce results comparable to previous methods. In other experiments we observed that Gaussian statistics work remarkably well for painterly textures but can sometimes be too flexible for photorealistic textures, while nearest-neighbour search preserves photorealistic details but can be too rigid for deformable textures. In this sense MDANs offer a relatively more balanced choice through adversarial training. See our paper for more discussion.

Like previous deconvolutional methods, MDANs are VERY slow: an Nvidia Titan X takes about one minute to transfer a 384-by-384 photo. To make synthesis faster, we replace the deconvolutional process with a feed-forward network (MGANs). The feed-forward network takes a long time to train (45 minutes for this example on a Titan X), but offers a significant speed-up at test time. Here are some results from MGANs:
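
For intuition about the test-time speed-up, here is a hedged timing sketch; the model path and input size are assumptions, and the real artifact to load is the generator produced by release_MGAN.lua:

require 'torch'
require 'nn'
require 'cunn'

local net = torch.load('Models/MGAN.t7')        -- hypothetical path to a released generator
net:evaluate()
local img = torch.rand(1, 3, 384, 384):cuda()   -- one 384x384 RGB photo
local timer = torch.Timer()
local out = net:forward(img)                    -- single feed-forward pass, no optimization loop
cutorch.synchronize()                           -- first call also includes CUDA warm-up
print(string.format('feed-forward pass: %.3f s', timer:time().real))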

We expect MGANs to trade quality for speed. The question is: how much? Here are some comparisons between the results of MDANs and MGANs:

In general MDANs (middle) give more stylized results and do a much better job in homogeneous background areas (the last two cases). But sometimes MGANs (right) are able to produce comparable results (the first two).

And MGANs run at least two orders of magnitude faster.

Final remark

There is concurrent work that tries to make deep texture synthesis faster. For example, Ulyanov et al. and Johnson et al. also achieved significant speed-ups and very nice results with a feed-forward architecture. Both methods use the Gaussian statistics constraint proposed by Gatys et al. We believe our method is a good complement: by replacing the Gaussian statistics constraint with discriminative networks trained on Markovian patches, it is possible to model more complex texture manifolds (see the discussion in our paper).

Last, here are some preliminary results of training MGANs for photorealistic synthesis. The network learns from 200k face images from CelebA, then transfers the VGG_19 encoding (layer ReLU5_1) of new face images (left) into something interesting (right). The synthesized faces have the same poses/layouts as the input faces, but look like different people :-)

Acknowledgement
