Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Last update: Nov 17, 2022

Related tags

Overview

Oh-My-Face

This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

StyleCLIP is an excellent algorithm that acts on the latent code of StyleGAN2 to edit images guided by texts. Global Direction uses models such as e4e to convert images into latent codes and then further editing. However, this conversion causes information loss of the original image and dissimilarities.

Thus, we use the optical flow model to detect the change in different regions between the StyleCLIP generated image and the original image, sample more from the original in slightly-edited areas, then use frame interpolation to perform weighted fusion, which is simple yet efficient.

We will further release weights for cat face editing, containing cat facial landmark recognition from pycatfd and e4e-cat model. e4e-cat is trained via afhq-cat dataset and StyleGAN2-cat weights. StyleGAN2-pytorch/convert_weights.py is used to convert the tensorflow weights.

Usage

Prerequisites

NVIDIA GPU + CUDA11.0 CuDNN
Python 3.6

Installation

Clone this repository

git clone https://github.com/P2Oileen/oh-my-face

Dependencies

To install all the dependencies, please run the following commands.

wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64 -O cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
dpkg -i cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
apt-key add /var/cuda-repo-10-0-local/7fa2af80.pub
apt-get update
apt-get -y install gcc-7 g++-7
apt-get -y install cuda 

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64\${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

pip install tensorflow-gpu==1.15.2
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
sudo unzip ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force

Download Weights Currently, We only provide weights for human face editing, PLEASE wait for further weights.

cd oh-my-face
wget https://drive.google.com/file/d/1efFoGShtZhcd6SCxOPu3AMbKZus478au/view?usp=sharing
tar -zxvf ffhq.tar.gz
mv ffhq src/
wget https://drive.google.com/file/d/1bXhWOnwCTTXTz7T7zJ1iXA717tyj-n3U/view?usp=sharing
tar -zxvf oh-my-face/weights-face.tar.gz
mv weights oh-my-face/src/

Edit image via oh-my-face

python3 run.py \
--input_dir='input.jpg' \ # Path to your input image
--output_dir='output.jpg' \ # Path to output directory
--option_beta=0.15 \ # Range from 0.08 to 0.3, corresponds to the disentanglement threshold
--option_alpha=4.1 \ # Range from -10.0 to 10.0, corresponds to the manipulation strength
--option_gamma=3 \ # Range from 1 to 10, corresponds to RIFE's sample strength
--neutral='face' \ # Origin description
--target='face with smile' \ # Target description

Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Related tags

Overview

Oh-My-Face

Usage

Prerequisites

Installation

Edit image via oh-my-face

Owner

AiLin Huang

Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

PyTorch Implementation of Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation.

MARE - Multi-Attribute Relation Extraction

Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Natural Intelligence is still a pretty good idea.

Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.

Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

CowHerd is a partially-observed reinforcement learning environment

Improving Transferability of Representations via Augmentation-Aware Self-Supervision

DeepCAD: A Deep Generative Network for Computer-Aided Design Models

Styleformer - Official Pytorch Implementation

Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python

A toy compiler that can convert Python scripts to pickle bytecode 🥒

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

Framework for joint representation learning, evaluation through multimodal registration and comparison with image translation based approaches

Distributed Asynchronous Hyperparameter Optimization better than HyperOpt.

Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion