An efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning"

Overview

MMGEN-FaceStylor


Introduction

This repo is an efficient toolkit for face stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning". Because the training code of AgileGAN has not been released yet, this repo only adopts the AgileGAN pipeline and combines it with other helpful practices from the literature.

This project is based on MMCV and MMGEN. Stars and forks are welcome 🤗!

Results from FaceStylor trained by MMGEN

Requirements

  • CUDA 10.0 / CUDA 10.1
  • Python 3
  • PyTorch >= 1.6.0
  • MMCV-Full >= 1.3.15
  • MMGeneration >= 0.3.0
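
The minimum versions above can be checked programmatically. The following is a minimal sketch, not part of this repo: `parse_version` and `meets_minimum` are hypothetical helpers that compare dotted version strings numerically, since a plain string comparison would wrongly rank "1.3.9" above "1.3.15".

```python
import re

# Minimum versions copied from the requirements list above.
MINIMUMS = {"PyTorch": "1.6.0", "MMCV-Full": "1.3.15", "MMGeneration": "0.3.0"}


def parse_version(text):
    """Turn '1.3.15' or '1.6.0+cu101' into a tuple of ints such as (1, 3, 15)."""
    core = text.split("+")[0]  # drop a local build tag such as '+cu101'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:  # stop at the first non-numeric component
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '1.6' compares equal to '1.6.0'
    b += (0,) * (width - len(b))
    return a >= b
```

For example, `meets_minimum(torch.__version__, MINIMUMS["PyTorch"])` could be called once PyTorch is installed.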

Setup

Step-1: Create an Environment

First, create a conda virtual environment and activate it.

conda create -n facestylor python=3.7 -y
conda activate facestylor

If you have CUDA 10.1 installed, install the prebuilt PyTorch for CUDA 10.1.

conda install pytorch=1.6.0 cudatoolkit=10.1 torchvision -c pytorch
pip install -r requirements.txt

Step-2: Install MMCV and MMGEN

We can run the following command to install MMCV.

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html

Of course, you can also refer to the MMCV Docs to install it.

Next, we should install MMGEN containing the basic generative models that will be used in this project.

# Clone the MMGeneration repository.
git clone https://github.com/open-mmlab/mmgeneration.git
cd mmgeneration
# Install build requirements and then install MMGeneration.
pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"
cd ..

Step-3: Clone repo and prepare the data and weights

Next, clone this repo.

git clone https://github.com/open-mmlab/MMGEN-FaceStylor.git

For convenience, we suggest that you make these folders under MMGEN-FaceStylor.

cd MMGEN-FaceStylor
mkdir data
mkdir work_dirs
mkdir work_dirs/experiments
mkdir work_dirs/pre-trained

Then put your data, or a symlink to it, under the data folder, and store your experiments under work_dirs/experiments.

For testing and training, download the necessary data provided by AgileGAN and put it under the data folder, or just run:

wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1AavRxpZJYeCrAOghgtthYqVB06y9QJd3' -O data/shape_predictor_68_face_landmarks.dat

We also provide some pre-trained weights.

Pre-trained Weights
FFHQ-1024 StyleGAN2
FFHQ-256 StyleGAN2
IR-SE50 Model
Encoder for FFHQ-1024 StyleGAN2
Encoder for FFHQ-256 StyleGAN2
MetFace-Oil 1024 StyleGAN2
MetFace-Sketch 1024 StyleGAN2
Toonify 1024 StyleGAN2
Cartoon 256
Bitmoji 256
Comic 256
More Styles on the Way!

Play with MMGEN-FaceStylor

If you have followed the steps above, you are ready to explore FaceStylor!

Quick Try

To quickly try the project, run the command below:

python demo/quick_try.py demo/src.png --style toonify

Then, you can check the result in work_dirs/demos/agile_result.png.

  • If you want to play with your own photos, you can replace demo/src.png with your photo.
  • If you want to switch to another style, replace toonify with another style name. Currently supported styles are toonify, oil, sketch, bitmoji, cartoon, and comic.
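
To try several styles in a row, the command line can be built programmatically. This is a minimal sketch: `build_quick_try_cmd` is a hypothetical helper, not part of the repo, and it only uses the script path, positional argument, and `--style` flag shown above.

```python
# Style names copied from the list above.
SUPPORTED_STYLES = ("toonify", "oil", "sketch", "bitmoji", "cartoon", "comic")


def build_quick_try_cmd(src_path, style):
    """Return the argument list for one quick_try run, rejecting unknown styles."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(
            "unknown style {!r}; expected one of {}".format(style, SUPPORTED_STYLES))
    return ["python", "demo/quick_try.py", src_path, "--style", style]


# Usage (run from the MMGEN-FaceStylor folder):
#   import subprocess
#   for style in SUPPORTED_STYLES:
#       subprocess.run(build_quick_try_cmd("demo/src.png", style), check=True)
```

Since the result path given above is the same for every run, you would probably want to copy the result elsewhere between runs.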

Inversion

The inversion task takes a source image as input and returns the most similar image that the generator model can produce.

For inversion, you can use agilegan_demo directly, like this:

python demo/agilegan_demo.py SOURCE_PATH CONFIG [--ckpt CKPT] [--device DEVICE] [--save-path SAVE_PATH]

Here, you should set SOURCE_PATH to your image path, CONFIG to the config file path, and CKPT to checkpoint path.

Taking Celebahq-Encoder as an example, download the weights to work_dirs/pre-trained/agile_encoder_celebahq1024x1024_lr_1e-4_150k.pth, put your test image under data (the command below uses demo/src.png), and run

python demo/agilegan_demo.py demo/src.png configs/agilegan/agile_encoder_celebahq1024x1024_lr_1e-4_150k.py --ckpt work_dirs/pre-trained/agile_encoder_celebahq1024x1024_lr_1e-4_150k.pth

You will find the result at work_dirs/demos/agile_result.png.

Stylization

Since the encoder and decoder used for stylization can be trained from different configs, you need to set their checkpoint paths in the config file. Taking Metface-oil as an example, see the first two lines of the config file:

encoder_ckpt_path = xxx
stylegan_weights = xxx

Make sure the paths in your config match where your weights are actually stored. Then run the same command without specifying CKPT.

python demo/agilegan_demo.py SOURCE_PATH CONFIG [--device DEVICE] [--save-path SAVE_PATH]
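
Before running the demo, it can help to confirm that both checkpoint paths from the config point at real files. The following is a minimal sketch: `missing_checkpoints` is a hypothetical helper, and the example paths in the comment are placeholders rather than the repo's actual file names.

```python
import os


def missing_checkpoints(paths):
    """Return the config variable names whose checkpoint file does not exist."""
    return [name for name, path in paths.items() if not os.path.isfile(path)]


# Usage, with placeholder paths standing in for the values in your config:
#   paths = {
#       "encoder_ckpt_path": "work_dirs/pre-trained/my_encoder.pth",
#       "stylegan_weights": "work_dirs/pre-trained/my_style_generator.pth",
#   }
#   for name in missing_checkpoints(paths):
#       print("checkpoint not found for", name)
```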

Train

This section explains how to fine-tune on your own dataset. With only 100-200 images and less than one hour of training, you can fine-tune your own StyleGAN2. Copy an agile_transfer config, like this one, change imgs_root to your actual data root, and then choose one of the two commands below to train your model.

# For distributed training
bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS_NUMBER} \
    --work-dir ./work_dirs/experiments/experiments_name \
    [optional arguments]
# For slurm training
bash tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG} ${WORK_DIR} \
    [optional arguments]
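
Editing the copied config can also be scripted. This is a minimal sketch that assumes the config contains a top-level line of the form `imgs_root = '...'`; that layout is an assumption, so check your copied config before relying on it. `set_imgs_root` is a hypothetical helper, not part of the repo.

```python
import re


def set_imgs_root(config_text, new_root):
    """Replace the first top-level `imgs_root = ...` line in a config's text."""
    pattern = re.compile(r"^imgs_root\s*=.*$", flags=re.MULTILINE)
    if pattern.search(config_text) is None:
        raise ValueError("no top-level imgs_root assignment found")
    # A function replacement avoids backslash escaping issues in the new path.
    return pattern.sub(
        lambda _match: "imgs_root = {!r}".format(new_root), config_text, count=1)


# Usage, with placeholder file names:
#   with open("configs/agilegan/my_style.py") as f:
#       text = f.read()
#   with open("configs/agilegan/my_style.py", "w") as f:
#       f.write(set_imgs_root(text, "data/my_style"))
```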

Training Details

This part explains some training details, including the ADA setting, layer freezing, and losses.

ADA Setting

To use ADA in your discriminator, set the discriminator type to ADAStyleGAN2Discriminator and adjust the ADAAug settings as follows:

model = dict(
    discriminator=dict(
        type='ADAStyleGAN2Discriminator',
        data_aug=dict(
            type='ADAAug',
            # aug_pipeline and the arguments below can be set by yourself.
            aug_pipeline=aug_kwargs,
            update_interval=4,
            augment_initial_p=0.,
            ada_target=0.6,
            ada_kimg=500,
            use_slow_aug=False)))
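
To give a feel for how these arguments interact, here is a minimal sketch of the augmentation-probability update used by the public StyleGAN2-ADA reference heuristic. Whether ADAAug in MMGEN computes exactly this is an assumption; `ada_update` and the overfitting indicator `rt` are illustrative names, not MMGEN API, and clamping at 1 is an extra safety bound added here.

```python
def ada_update(p, rt, ada_target, batch_size, update_interval, ada_kimg):
    """Move the augmentation probability p one step toward the ADA target.

    rt is the overfitting indicator measured since the last update (in the
    ADA paper, the mean sign of the discriminator output on real images).
    ada_kimg is the number of thousands of images over which p may travel
    from 0 to 1, so the step size depends on how many images were seen.
    """
    step = (batch_size * update_interval) / (ada_kimg * 1000)
    if rt > ada_target:    # discriminator looks too confident: augment more
        p += step
    elif rt < ada_target:  # augmentation is stronger than needed: back off
        p -= step
    return min(max(p, 0.0), 1.0)  # keep p a valid probability
```

With the settings above and an assumed batch size of 32, each update moves p by 32 * 4 / 500000 = 0.000256, starting from augment_initial_p.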

Layer Freeze Setting

FreezeD can be used for small data fine-tuning.

FreezeG can be used for pseudo translation.

model = dict(
    freezeD=5,  # set to -1 to disable
    freezeG=4)  # set to -1 to disable

Losses Setting

In AgileGAN, the authors introduce a similarity loss at the perceptual level to preserve the recognizable identity of the generated image. You can adjust lpips_lambda as follows:

model = dict(lpips_lambda=0.8)

Generally speaking, the larger lpips_lambda is, the better the recognizable identity is preserved.

Dataset Links

To make it easier for you to train your own models, here are some links to publicly available datasets.

Dataset Links
MetFaces
AFHQ
Toonify
photo2cartoon
selfie2anime
face2comics v2
High-Resolution Anime Face
Bitmoji

Applications

We also provide LayerSwap and DNI apps for trading off the structure of the original image against the degree of stylization. You can adjust their parameters to get the result you want.

LayerSwap

When Layer Swapping is applied, the generated images have a higher similarity to the source image than AgileGAN's results.

From Left to Right: Input, Layer-Swap with L = 4, 3, 2, xxx Output

Run this command line to perform layer swapping:

python apps/layerSwap.py source_path modelA modelB \
      [--swap-layer SWAP_LAYER] [--device DEVICE] [--save-path SAVE_PATH]

Here, modelA is a PSPEncoderDecoder (its config name starts with agile_encoder) with FFHQ-StyleGAN2 as the decoder, and modelB is a PSPEncoderDecoder (its config name starts with agile_encoder) with the desired style generator as the decoder. Generally, the deeper you set swap-layer, the better the structure of the original image is kept.

We also provide a blending script to create and save the mixed weights.

python modelA modelB [--swap-layer SWAP_LAYER] [--show-input SHOW_INPUT] [--device DEVICE] [--save-path SAVE_PATH]

Here, modelA is the base model, where only the deep layers of its decoder will be replaced with modelB's counterpart.
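
The idea of layer swapping can be illustrated on a toy model. This is a minimal sketch, not the repo's implementation: each decoder is represented as a plain list of per-layer weights ordered from shallow to deep, and how SWAP_LAYER maps onto real StyleGAN2 layers is an assumption here.

```python
def swap_layers(base_layers, style_layers, swap_layer):
    """Keep the shallow layers of the base model and take the deep layers
    (index >= swap_layer) from the style model."""
    if len(base_layers) != len(style_layers):
        raise ValueError("both decoders must have the same number of layers")
    if not 0 <= swap_layer <= len(base_layers):
        raise ValueError("swap_layer is out of range")
    return base_layers[:swap_layer] + style_layers[swap_layer:]


# Usage with toy per-layer weights:
#   blended = swap_layers(["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"], 2)
```

In this toy setup, a larger swap_layer keeps more of the base model's layers, which matches the observation above that a deeper swap-layer preserves more of the original structure.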

DNI

Deep Network Interpolation between L4 and AgileGAN output

For more precise stylization control, you can try DNI with the following command:

python apps/dni.py source_path modelA modelB [--intervals INTERVALS] [--device DEVICE] [--save-folder SAVE_FOLDER]

Here, modelA and modelB should be PSPEncoderDecoder models (their config names start with agile_encoder) whose decoders have different degrees of stylization. INTERVALS sets the number of interpolation steps.
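
Deep Network Interpolation blends two models by linearly interpolating their weights. The following is a minimal sketch on toy weights stored as floats in a dict; `dni_sequence` is a hypothetical helper, and treating INTERVALS as the number of blended models including both endpoints is an assumption about apps/dni.py.

```python
def interpolate_weights(weights_a, weights_b, alpha):
    """Blend two weight dicts: alpha=0 returns model A, alpha=1 returns model B."""
    return {
        name: (1.0 - alpha) * weights_a[name] + alpha * weights_b[name]
        for name in weights_a
    }


def dni_sequence(weights_a, weights_b, intervals):
    """Return `intervals` blended weight dicts, evenly spaced from A to B."""
    if intervals < 2:
        raise ValueError("intervals must be at least 2 to include both endpoints")
    return [
        interpolate_weights(weights_a, weights_b, i / (intervals - 1))
        for i in range(intervals)
    ]


# Usage with toy scalar weights:
#   dni_sequence({"w": 0.0}, {"w": 2.0}, 3)
```

With real models the same formula would be applied to every tensor in the two decoders' state dicts.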

You can also try applications in MMGEN, like interpolation and SeFA.

Interpolation


We provide an application script for this. You can use apps/interpolate_sample.py with the following command to interpolate unconditional models:

python apps/interpolate_sample.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT} \
    [--show-mode ${SHOW_MODE}] \
    [--endpoint ${ENDPOINT}] \
    [--interval ${INTERVAL}] \
    [--space ${SPACE}] \
    [--samples-path ${SAMPLES_PATH}] \
    [--batch-size ${BATCH_SIZE}]

For more details, you can read related Docs.

Gallery

Toonify

Oil

Cartoon

Comic

Bitmoji
Notes and TODOs

  • For the encoder, I experimented with a VAE encoder but found no significant improvement for inversion, so I follow the "encoding into z plus space" approach used by the authors. This release offers only a vanilla encoder; the VAE-encoder version will be released later.
  • For the generator, I released a vanilla StyleGAN2 generator; an attribute-aware generator will be released in the next version.
  • For the training settings, the parameters differ slightly from the paper. I also tried ADA, freezeD, and other methods not mentioned in the paper.
  • More styles will be available in the next version.
  • More applications will be available in the next version.
  • We are also considering a web-side application.
  • Further code cleanup.

Acknowledgments

Codes reference:

Display photos from: https://unsplash.com/t/people

Web demo powered by: https://gradio.app/

License

This project is released under the Apache 2.0 license. Some implementations in MMGEN-FaceStylor are under licenses other than Apache 2.0. If you use our code for commercial purposes, please check LICENSES.md carefully.
