On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

Last update: Oct 24, 2022

On Nonlinear Latent Transformations for GAN-based Image Editing
Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko

Overview

We replace linear shifts commonly used for image editing with a flow of a trainable Neural ODE in the latent space.

w' = NN(w; \theta)
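To make the contrast with a linear shift concrete, here is a minimal sketch of flowing style vectors along w' = NN(w; \theta). The names `ODEFunc` and `euler_flow`, the hidden width, and the fixed-step Euler integrator are assumptions made for illustration only; the repository itself integrates with `odeint` from torchdiffeq and builds its models through `FlowFactory`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

LATENT_DIM = 512  # style-vector size, as stated in the "Custom dataset" section


class ODEFunc(nn.Module):
    """Right-hand side NN(w; theta) of the latent ODE (hypothetical architecture)."""

    def __init__(self, dim: int = LATENT_DIM, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, t: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # Autonomous ODE: the velocity depends on w only, t is unused.
        return self.net(w)


def euler_flow(odefunc, w0, t_end=1.0, n_steps=20):
    """Integrate w' = odefunc(t, w) from 0 to t_end with fixed-step Euler."""
    w, dt = w0, t_end / n_steps
    for i in range(n_steps):
        t = torch.tensor(i * dt)
        w = w + dt * odefunc(t, w)
    return w


w0 = torch.randn(4, LATENT_DIM)

# Linear edit: every latent code is displaced by the same vector.
direction = torch.randn(LATENT_DIM)
w_linear = w0 + 1.0 * direction

# Nonlinear edit: each latent code follows its own trajectory.
odefunc = ODEFunc()
with torch.no_grad():
    w_nonlinear = euler_flow(odefunc, w0, t_end=1.0)
```

A constant right-hand side w' = c gives w(t) = w(0) + t c, so the linear shift is the special case of this flow in which the velocity does not depend on w.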

The RHS of this Neural ODE is trained end-to-end using pre-trained attribute regressors by enforcing

change of the desired attribute;
invariance of remaining attributes.
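The two requirements above can be sketched as a two-term objective. This is only one plausible formulation and not necessarily the loss used in the paper or in train_ode.py: the function name `edit_loss`, the squared-error terms, the weight `lam`, and the assumption that the regressor outputs real-valued scores are all hypothetical.

```python
import torch
import torch.nn.functional as F


def edit_loss(attrs_before, attrs_after, target_idx, target_value, lam=1.0):
    """Hypothetical objective: move one attribute to a target, keep the rest.

    attrs_before / attrs_after: (batch, D) regressor outputs for images
    generated from the original and the edited style vectors.
    """
    # Boolean mask selecting every attribute except the edited one.
    keep_mask = torch.ones(attrs_before.shape[1], dtype=torch.bool)
    keep_mask[target_idx] = False

    # Term 1: change of the desired attribute.
    edited = attrs_after[:, target_idx]
    change = F.mse_loss(edited, torch.full_like(edited, target_value))

    # Term 2: invariance of the remaining attributes.
    keep = F.mse_loss(attrs_after[:, keep_mask], attrs_before[:, keep_mask])

    return change + lam * keep


# Toy check with D = 4 attributes, editing attribute 2 towards 1.0.
before = torch.zeros(3, 4)
perfect = before.clone()
perfect[:, 2] = 1.0
loss_perfect = edit_loss(before, perfect, target_idx=2, target_value=1.0)
loss_no_edit = edit_loss(before, before, target_idx=2, target_value=1.0)
```

In training, `attrs_after` would come from the regressor applied to the generator output at the flowed style vector, so gradients reach the parameters of the ODE right-hand side end-to-end.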

Installation and usage

Data

The data required to run the code is available at this Dropbox link (2.5 GB).

Path	Description
data	data hosted on Dropbox
├ `models`	pretrained GAN models and attribute regressors
├ `log`	pretrained nonlinear edits (Neural ODEs of depth 1) for a variety of attributes on CUB, FFHQ, Places2
├ `data_to_rectify`	100,000 precomputed pairs `(w, R[G[w]])`; i.e., style vectors and corresponding semantic attributes
├ `configs`	parameters of StyleGAN 2 generators for each dataset (`n_mlp`, `channel_width`, etc)
└ `inverses`	precomputed inverses (elements of W-plus) for sample `FFHQ` images

To download and unpack the data, run get_data.sh.

Training

We used torch 1.7 for training; however, the code should also work with earlier versions. An example training command that rectifies all attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--depth 1

For selected attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--dir 4 8 15 16 23 32 \
--depth 1

Custom dataset

For training on a custom dataset, you have to provide

generator and attribute regressor weights;
a dictionary {dataset}_all.pt (stored in data_to_rectify) of the form {"ws": ws, "labels": labels}, where ws is a torch.Tensor of size N x 512 and labels is a torch.Tensor of size N x D, with D being the number of semantic factors. Construct labels by evaluating the corresponding attribute regressor on the synthetic images generator(ws[i]). The dictionary is used to sample batches for training.
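A minimal sketch of building such a dictionary is given below. The `generator` and `regressor` here are tiny stand-in modules so that the example runs on its own, and the dataset name `mydata`, the sizes `N` and `D`, and the batch size are hypothetical; substitute your pretrained models and sample `ws` from the generator's mapping network (e.g. generator.style(z)).

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)

N, LATENT_DIM, D = 64, 512, 5  # tiny N for illustration only
dataset = "mydata"             # hypothetical dataset name

# Hypothetical stand-ins; replace with the pretrained generator and regressor.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 3 * 8 * 8), nn.Unflatten(1, (3, 8, 8))
)
regressor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, D))

# Style vectors; random here, in practice produced by the mapping network.
ws = torch.randn(N, LATENT_DIM)

# Evaluate the regressor on the synthetic images, batch by batch.
labels = []
with torch.no_grad():
    for batch in ws.split(16):
        images = generator(batch)
        labels.append(regressor(images))
labels = torch.cat(labels)  # shape N x D

# Saved to a temporary directory here; the training code expects the
# file inside data_to_rectify.
out_path = os.path.join(tempfile.mkdtemp(), f"{dataset}_all.pt")
torch.save({"ws": ws, "labels": labels}, out_path)
```

Processing `ws` in batches keeps memory bounded when N is large, as with the 100,000 precomputed pairs shipped in data_to_rectify.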

Visualization

Please see explore.ipynb for example visualizations. lib/utils.py contains FlowFactory, a utility wrapper for building and loading the Neural ODE models.

Restoring from checkpoint

import torch
from lib.utils import FlowFactory, LatentFlow
from torchdiffeq import odeint_adjoint as odeint
device = torch.device("cuda")
flow_factory = FlowFactory(dataset="ffhq", device=device)
odeblock = flow_factory._build_odeblock(depth=1)
# depth = -1 corresponds to a constant right hand side (w' = c)
# depth >= 1 corresponds to an MLP with depth layers
odeblock.load_state_dict(...)

# some style vector (generator.style(z))
w0 = ...

# You can directly call odeint
with torch.no_grad():
    odeint(odeblock.odefunc, w0, torch.FloatTensor([0, 1]).to(device))

# Or utilize the wrapper 
flow = LatentFlow(odefunc=odeblock.odefunc, device=device, name="Bald")
flow.flow(w=w0, t=1)

# To flow real images:
w = torch.load("inverses/actors.pt").to(device)
flow.flow(w, t=6, truncate_real=6)
# truncate_real specifies which portion of a W-plus vector to modify
# (e.g., the first 6 out of 14 vectors)

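The effect of truncate_real described in the comments above can be illustrated with a short sketch. It reflects an assumed reading of the option, not the actual code of LatentFlow.flow: the helper `flow_truncated`, the (batch, 14, 512) layout of the W-plus tensor, and the stand-in `shift` function are all hypothetical.

```python
import torch


def flow_truncated(flow_fn, w_plus, truncate_real):
    """Apply flow_fn to the first `truncate_real` style vectors only.

    w_plus: tensor of shape (batch, n_styles, dim); the remaining
    n_styles - truncate_real vectors are passed through unchanged.
    """
    head = w_plus[:, :truncate_real]
    tail = w_plus[:, truncate_real:]
    b, k, d = head.shape
    # Flatten so that flow_fn sees ordinary (n, dim) style vectors.
    edited = flow_fn(head.reshape(b * k, d)).reshape(b, k, d)
    return torch.cat([edited, tail], dim=1)


def shift(w):
    # Stand-in for a trained flow; simply adds a constant offset.
    return w + 1.0


w_plus = torch.randn(2, 14, 512)
out = flow_truncated(shift, w_plus, truncate_real=6)
```

In StyleGAN 2 the earlier style vectors feed the coarser layers of the generator, so restricting the edit to them leaves the fine-detail styles of the inverted image untouched.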
A sample command to generate a movie:

CUDA_VISIBLE_DEVICES=0 python make_movie.py --attribute Bald --dataset ffhq

Examples

FFHQ

Bald	Goatee	Wavy_Hair	Arched_Eyebrows

Bangs	Young	Blond_Hair	Chubby

Places2

lush	rugged	fog

Citation

Coming soon.

Credits

Owner

Valentin Khrulkov
