StyleGAN-Human: A Data-Centric Odyssey of Human Generation

Abstract: Unconditional human image generation is an important task in vision and graphics, which enables various applications in the creative industry. Existing studies in this field mainly focus on "network engineering" such as designing new components and objective functions. This work takes a data-centric perspective and investigates multiple critical aspects in "data engineering", which we believe would complement the current practice. To facilitate a comprehensive study, we collect and annotate a large-scale human image dataset with over 230K samples capturing diverse poses and textures. Equipped with this large dataset, we rigorously investigate three essential factors in data engineering for StyleGAN-based human generation, namely data size, data distribution, and data alignment. Extensive experiments reveal several valuable observations w.r.t. these aspects: 1) Large-scale data, more than 40K images, are needed to train a high-fidelity unconditional human generation model with vanilla StyleGAN. 2) A balanced training set helps improve the generation quality with rare face poses compared to the long-tailed counterpart, whereas simply balancing the clothing texture distribution does not effectively bring an improvement. 3) Human GAN models with body centers for alignment outperform models trained using face centers or pelvis points as alignment anchors. In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
Keyword: Human Image Generation, Data-Centric, StyleGAN

Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, and Ziwei Liu
[Demo Video] | [Project Page] | [Paper]

Updates

[26/04/2022] Technical report released!
[22/04/2022] Technical report will be released before May.
[21/04/2022] The codebase and project page are created.

Model Zoo

Structure	1024x512	512x256
StyleGAN1	stylegan_human_v1_1024.pkl	to be released
StyleGAN2	stylegan_human_v2_1024.pkl	stylegan_human_v2_512.pkl
StyleGAN3	to be released	stylegan_human_v3_512.pkl

Web Demo

Integrated into Hugging Face Spaces 🤗 using Gradio. Try out the Web Demos for generation and interpolation.

We provide a Colab demo that lets you synthesize images with the provided models and visualize style-mixing, interpolation, and attribute editing. The notebook guides you through installing the necessary environment and downloading the pretrained models. The output images are saved in ./StyleGAN-Human/outputs/. Hope you enjoy!

Usage

System requirements

The original codebases are stylegan (TensorFlow), stylegan2-ada (PyTorch), and stylegan3 (PyTorch), all released by NVIDIA.
We tested with Python 3.8.5 and PyTorch 1.9.1 with CUDA 11.1. (See https://pytorch.org for PyTorch install instructions.)

Installation

To work with this project on your own machine, install the environment as follows:

conda env create -f environment.yml
conda activate stylehuman
# Optional: TensorFlow 1.x is required only for StyleGAN1.
pip install nvidia-pyindex
pip install "nvidia-tensorflow[horovod]"
pip install nvidia-tensorboard==1.15

Extra notes:

If you run into CUDA version conflicts, try emptying LD_LIBRARY_PATH. For example:

LD_LIBRARY_PATH=; python generate.py --outdir=out/stylegan_human_v2_1024 --trunc=1 --seeds=1,3,5,7 \
    --network=pretrained_models/stylegan_human_v2_1024.pkl --version 2

We found the following troubleshooting links might be helpful: 1., 2.
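
If you launch the scripts from Python rather than from a shell, the same workaround can be applied by passing a modified environment to the child process. The following is a minimal sketch, not part of the released code; the helper names env_without_ld_library_path and run_with_clean_env are hypothetical.

```python
import os
import subprocess
import sys


def env_without_ld_library_path(base_env=None):
    """Return a copy of the environment with LD_LIBRARY_PATH set to empty."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_LIBRARY_PATH"] = ""
    return env


def run_with_clean_env(script_args):
    """Run a Python script with LD_LIBRARY_PATH emptied; return its exit code."""
    command = [sys.executable, *script_args]
    return subprocess.run(command, env=env_without_ld_library_path()).returncode
```

For example, run_with_clean_env(["generate.py", "--outdir=out/stylegan_human_v2_1024", "--trunc=1", "--seeds=1,3,5,7", "--network=pretrained_models/stylegan_human_v2_1024.pkl", "--version", "2"]) would mirror the shell command above.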

Pretrained models

Please put the pretrained models downloaded from the links above in the folder 'pretrained_models'.
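
Before running the commands below, it can help to confirm that the checkpoints are where the scripts expect them. This is a small sketch under the assumption that you keep the file names from the Model Zoo table; the function missing_models is hypothetical and not part of the repository.

```python
from pathlib import Path

# Checkpoint names listed as released in the Model Zoo table.
RELEASED_MODELS = [
    "stylegan_human_v1_1024.pkl",
    "stylegan_human_v2_1024.pkl",
    "stylegan_human_v2_512.pkl",
    "stylegan_human_v3_512.pkl",
]


def missing_models(folder="pretrained_models", names=RELEASED_MODELS):
    """Return the checkpoint names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    for name in missing_models():
        print(f"missing: pretrained_models/{name}")
```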

Generate full-body human images using our pretrained model

# Generate human full-body images without truncation
python generate.py --outdir=outputs/generate/stylegan_human_v2_1024 --trunc=1 --seeds=1,3,5,7 --network=pretrained_models/stylegan_human_v2_1024.pkl --version 2

# Generate human full-body images with truncation 
python generate.py --outdir=outputs/generate/stylegan_human_v2_1024 --trunc=0.8 --seeds=0-10 --network=pretrained_models/stylegan_human_v2_1024.pkl --version 2

# Generate human full-body images using stylegan V1
python generate.py --outdir=outputs/generate/stylegan_human_v1_1024 --network=pretrained_models/stylegan_human_v1_1024.pkl --version 1 --seeds=1,3,5

# Generate human full-body images using stylegan V3
python generate.py --outdir=outputs/generate/stylegan_human_v3_512 --network=pretrained_models/stylegan_human_v3_512.pkl --version 3 --seeds=1,3,5
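
The --seeds and --trunc options in the commands above can be illustrated with a short sketch. It assumes the conventions of the upstream StyleGAN code (a seed list is either comma-separated or an inclusive range such as 0-10, and truncation moves a latent code toward the average latent); generate.py in this repository may implement them differently. The function names parse_seeds and truncate are hypothetical, and plain Python lists stand in for tensors.

```python
import re


def parse_seeds(spec):
    """Parse '1,3,5,7' or '0-10' (inclusive range) into a list of ints."""
    match = re.fullmatch(r"(\d+)-(\d+)", spec.strip())
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        return list(range(start, end + 1))
    return [int(part) for part in spec.split(",")]


def truncate(w, w_avg, psi):
    """Truncation trick: w' = w_avg + psi * (w - w_avg).

    psi = 1 leaves w unchanged (no truncation); smaller values pull w
    toward the average latent, trading diversity for fidelity.
    """
    return [avg + psi * (x - avg) for x, avg in zip(w, w_avg)]
```

With psi=1 the latent is returned unchanged, which is why --trunc=1 corresponds to generation without truncation.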

Note: The following demos use the StyleGAN V2 models (stylegan_human_v2_512.pkl and stylegan_human_v2_1024.pkl). To see results for V1 or V3, you need to change how the corresponding models are loaded.

Interpolation

python interpolation.py --network=pretrained_models/stylegan_human_v2_1024.pkl  --seeds=85,100 --outdir=outputs/inter_gifs
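
Conceptually, interpolating between two seeds means generating a latent code for each seed and rendering frames along a path between them. The sketch below shows plain linear interpolation on toy vectors; it is an assumption for illustration only, since interpolation.py may interpolate in a different latent space or with a different scheme. latent_from_seed is a hypothetical stand-in for the seeded sampling done by the real scripts.

```python
import random


def latent_from_seed(seed, dim=4):
    """Hypothetical stand-in: draw a reproducible Gaussian latent for a seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def lerp(a, b, t):
    """Linear interpolation between vectors a and b at position t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolation_path(z_start, z_end, num_frames):
    """Return num_frames latents evenly spaced from z_start to z_end."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    return [lerp(z_start, z_end, i / (num_frames - 1)) for i in range(num_frames)]


if __name__ == "__main__":
    frames = interpolation_path(latent_from_seed(85), latent_from_seed(100), 8)
    print(len(frames), "latent codes; each would be passed to the generator")
```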

Style-mixing image using stylegan2

python style_mixing.py --network=pretrained_models/stylegan_human_v2_1024.pkl --rows=85,100,75,458,1500 \
    --cols=55,821,1789,293 --styles=0-3 --outdir=outputs/stylemixing
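
How --rows, --cols, and --styles relate is easiest to see on toy data. The sketch below assumes the behaviour of the upstream stylegan2-ada style-mixing script, where each output keeps the row seed's per-layer latent codes and takes the layers listed in --styles from the column seed; style_mixing.py in this repository may differ. The function mix_styles is hypothetical, and nested lists stand in for the per-layer latent tensor.

```python
def mix_styles(w_row, w_col, col_layers):
    """Copy w_row and replace the layers in col_layers with those of w_col.

    w_row and w_col are lists of per-layer latent vectors of equal length.
    """
    mixed = [list(layer) for layer in w_row]
    for idx in col_layers:
        mixed[idx] = list(w_col[idx])
    return mixed


if __name__ == "__main__":
    w_row = [[float(i)] for i in range(6)]   # toy latent with 6 layers
    w_col = [[-1.0] for _ in range(6)]
    # Analogue of --styles=0-3: layers 0..3 come from the column seed.
    print(mix_styles(w_row, w_col, range(0, 4)))
```

Under this assumption, low layer indices such as 0-3 transfer coarse attributes from the column image, while higher indices transfer finer ones.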

Style-mixing video using stylegan2

python stylemixing_video.py --network=pretrained_models/stylegan_human_v2_1024.pkl --row-seed=3859 \
    --col-seeds=3098,31759,3791 --col-styles=8-12 --trunc=0.8 --outdir=outputs/stylemixing_video

Editing with InterfaceGAN, StyleSpace, and Sefa

python edit.py --network pretrained_models/stylegan_human_v2_1024.pkl --attr_name upper_length \
    --seeds 61531,61570,61571,61610 --outdir outputs/edit_results

Note:

'upper_length' and 'bottom_length' are the values of 'attr_name' available for the demo.
The layers to control and the editing strength are set in edit/edit_config.py.
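
The roles of "layers to control" and "editing strength" can be sketched in the spirit of InterfaceGAN-style editing, where a latent code is shifted along an attribute direction. This is an illustrative assumption, not the code in edit.py: the function edit_latent and its arguments are hypothetical, the direction vector is toy data, and StyleSpace and SeFa select their edit directions or channels differently.

```python
def edit_latent(w, direction, strength, layers=None):
    """Shift per-layer latent codes along `direction` by `strength`.

    w is a list of per-layer latent vectors. If `layers` is given, only
    those layer indices are edited; the others are copied unchanged.
    """
    target = set(range(len(w))) if layers is None else set(layers)
    edited = []
    for idx, layer in enumerate(w):
        if idx in target:
            edited.append([x + strength * d for x, d in zip(layer, direction)])
        else:
            edited.append(list(layer))
    return edited


if __name__ == "__main__":
    w = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    # Edit only layer 1 with strength 2 along a toy direction.
    print(edit_latent(w, [1.0, -1.0], 2.0, layers=[1]))
```

Restricting the edit to a few layers keeps the change localized to the attribute those layers influence, and the sign and magnitude of the strength control how far the attribute moves.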

Demo for InsetGAN

We implement a quick demo using the key idea from InsetGAN: combine the face generated by an FFHQ model with the human body generated by our pretrained model, then optimize both the face and body latent codes to get a coherent full-body image. Before running the script, download the FFHQ face model (or use your own face model), as well as the pretrained face landmark model and the pretrained CNN face detection model for dlib.

python insetgan.py --body_network=pretrained_models/stylegan_human_v2_1024.pkl --face_network=pretrained_models/ffhq.pkl \
    --body_seed=82 --face_seed=43 --trunc=0.6 --outdir=outputs/insetgan/ --video 1

Results

Editing

InsetGAN re-implementation

For more demos, please visit our web page.

TODO List

Release 1024x512 version of StyleGAN-Human based on StyleGAN3
Release 512x256 version of StyleGAN-Human based on StyleGAN1
Extension of downstream application (InsetGAN): add a face inversion interface to support fusing a user's face image with a StyleGAN-Human body image
Add an inversion script to the provided editing pipeline
Release Dataset

Citation

If you find this work useful for your research, please consider citing our paper:

@article{fu2022styleganhuman,
      title={StyleGAN-Human: A Data-Centric Odyssey of Human Generation},
      author={Fu, Jianglin and Li, Shikai and Jiang, Yuming and Lin, Kwan-Yee and Qian, Chen and Loy, Chen-Change and Wu, Wayne and Liu, Ziwei},
      journal={arXiv preprint},
      volume={arXiv:2204.11823},
      year={2022}
}

Acknowledgement

Part of the code is borrowed from stylegan (tensorflow), stylegan2-ada (pytorch), stylegan3 (pytorch).
