A web porting for NVlabs' StyleGAN2, to facilitate exploring all kinds characteristic of StyleGAN networks

Overview

Explorer Demo

This project is a web porting for NVlabs' StyleGAN2, to facilitate exploring all kinds characteristic of StyleGAN networks.

Thanks for NVlabs' excellent work.

Features

Explorer

See how the result image response to changes of latent code & psi.

Projector Demo 1

Projector

Test the projection from image to latent code. Left is target image, right is result from generator model. Total step count and yielding interval can be configured in this page. And another hyperparameter regularize noise weight can be configured statically, see Environment Configurations.

You can save the projection result in a zip package, and this page can accept projector zip file dropping, so this feature enable you to share your projector result to others.

Projector Demo 1 Projector Demo 2

W latent space extension

We added an env switch of UNIFORM_LATENTS to denote using uniform or separated W latent code when projecting image. This is the results comparison (center is the target):

W latents result   ⇨   target image   ⇦   W+ latents result

Projection animation exporting

You can also export image projection result sequence as a gif animation:

  ⇦  

Face image pose alignment

Dataset of FFHQ's generation has a crop process to align face area.see paper, appendix C. So the output distribution of StyleGAN model learned on FFHQ has a strong prior tendency on features position. We observed that many face images projection suffers semantic mistakes, e.g. erasing original eyes and transforming eyebrow into eyes during projection fitting (however you can get a similar face at last, but it may yield freak results when you manipulate the latent code). Finally we figured out that mainly caused by unalignment with training dataset prior distribution.

Then we import the face-api to measure and align human face images as below:

- 🙂 - ✄ →

Gratefulness for the authorization by @芈砾 to use his nice opus.

Click the button [ 🙂 ] after target image loaded, if the face detection succeed, you will get the face landmark and proposed crop box. The detection result may be not very accurate, now you can adjust 3 anchor marks manually to align left eye (red), right eye (green) and mouth (blue). Then click button [✄] to apply the crop.

Merger

Once you get some latent codes by projector or turning, you can test to mix features by interpolating latent values on every W layer. This is a demo.

Merger Demo

The pair of top-left images are the source to merge, press Ctrl+V in the hash box below either image to paste input latent code via clipboard, and Ctrl+C on the right blank area to copy result latent code.

Mapping Network Research

mapping plot

I attempt to explore the StyleGAN mapping network high-dimensional terrain aspect, read this article for details.

Usage

Run the web server:

python ./http_server.py

If this works, open http://localhost:8186 in your browser.

To ensure it working, please read the following requirements before do this.

Requirements

Python

Install requirement libraries with pip, reference to requirements.txt.

Network Files

Before run the web server, StyleGAN2 pre-trained network files must be placed in local disk (recommended the folder models). You can download network files following to StyleGAN2's code.

For memory reason, only one generator model can be loaded when running the web server. Network file paths can be configured by env variables. Create a file named .env.local under project root to configure chosen model and network file paths. Network file name/paths are configured in key-value style, e.g.:

MODEL_NAME=ffhq		# ffhq is the default value, so this line can be ignored 

MODEL_PATH_ffhq=./models/stylegan2-ffhq-config-f.pkl
MODEL_PATH_cat=./models/stylegan2-cat-config-f.pkl
# And so on...

Alternately, you can also choose generator model name by start command argument, e.g.:

python ./http_server.py cat

Or, for nodejs developer:

yarn start cat

Besides generators, the network LPIPS is required when run image projector, the default local path is ./models/vgg16_zhang_perceptual.pkl, download link. You can also change local path by env variable MODEL_PATH_LPIPS.

For Windows

According to StyleGAN2 README.md, here are our additional help instructions:

  • MSVC

    NOTE: Visual Studio 2019 Community Edition seems not compatible with CUDA 10.0, Visual Studio 2017 works.

    Append the actual msvc binary directory (find in your own disk) into dnnlib/tflib/custom_ops.py, the array of compiler_bindir_search_path. For example:

     -	'C:/Program Files (x86)/Microsoft Visual Studio 14.0/vc/bin',
     +	'C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/VC/Tools/MSVC/14.16.27023/bin/Hostx64/x64',
  • NVCC

    To test if nvcc is configured properly, dowload test_nvcc.cu in StyleGAN2 project. And the test command should specify binary path:

     nvcc test_nvcc.cu -o test_nvcc -run -ccbin "C:\Program Files (x86)\Microsoft VisualStudio\2017\BuildTools\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64"

    Actual path to different msvc edition may have difference in detail. If this succeed, it will build a file named test_nvcc.exe.

  • Tips for tensorflow 1.15

    Tensorflow 1.15 can work under Windows, but NVCC compiling may encounter C++ including path problem. Here is an easy workaround: make a symbolic link in python installation directory Python36\Lib\site-packages\tensorflow_core:

     mklink /J tensorflow tensorflow_core
  • Tips for tensorflow 2.x

    Tensorflow 2.0+ can work now! I have solved the compatibility issues with TF2 already, including some modification of code bundled in pickle. Except one problem on Windows, if you encountered this:

    C:/Users/xxx/AppData/Local/Programs/Python/Python36/lib/site-packages/tensorflow/include\unsupported/Eigen/CXX11/Tensor(74): fatal error C1083: Cannot open include file: 'unistd.h': No such file or directory

    Just open this file and comment out this line simply:

    #include

    It seems a bug of tensorflow, and I have committed an issue for them.

  • cudafe++ issue

    If you encountered python console error like:

     nvcc error : 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION)
    

    That may caused by a bug from CUDA 10.0, you can fix this issue by replacing your cudafe++.exe file in CUDA program bin directory by the same name file from CUDA 10.1 or higher version. And you are welcome to download my backup to avoid install a whole new version CUDA.

Environment Configurations

To manage environment variables conveniently, create a configuration file named .env.local. All avaiable env list:

Key Description Default Value
HTTP_HOST Web server host. 127.0.0.1
HTTP_PORT Web server port. 8186
MODEL_NAME Name for the generator model to load, this can be overwrite by the first argument of start script. ffhq
MODEL_PATH_LPIPS File path for LPIPS model. ./models/vgg16_zhang_perceptual.pkl
MODEL_PATH_* Generator network file path dictionary. See examples.
REGULARIZE_NOISE_WEIGHT Projector training hyperparameter. Float. 1e5
INITIAL_NOISE_FACTOR Projector training hyperparameter. Float. 0.05
EUCLIDEAN_DIST_WEIGHT Projector training hyperparameter. Float. 1
REGULARIZE_MAGNITUDE_WEIGHT Projector training hyperparameter. Float. 0
UNIFORM_LATENTS Use uniform latents for all feature layers (consistent with origin StyleGAN2 paper). Boolean, 0 or 1 0

 

 

 

A Bonus :)

Owner
K.L.
K.L.
Pytorch re-implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition (CVPR 2022)

SwinTextSpotter This is the pytorch implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text R

mxin262 183 Jan 03, 2023
RARA: Zero-shot Sim2Real Visual Navigation with Following Foreground Cues

RARA: Zero-shot Sim2Real Visual Navigation with Following Foreground Cues FGBG (foreground-background) pytorch package for defining and training model

Klaas Kelchtermans 1 Jun 02, 2022
Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

GSCNN This is the official code for: Gated-SCNN: Gated Shape CNNs for Semantic Segmentation Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

859 Dec 26, 2022
A repository with exploration into using transformers to predict DNA ↔ transcription factor binding

Transcription Factor binding predictions with Attention and Transformers A repository with exploration into using transformers to predict DNA ↔ transc

Phil Wang 62 Dec 20, 2022
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain

Unity Technologies 187 Dec 24, 2022
The InterScript dataset contains interactive user feedback on scripts generated by a T5-XXL model.

Interscript The Interscript dataset contains interactive user feedback on a T5-11B model generated scripts. Dataset data.json contains the data in an

AI2 8 Dec 01, 2022
This repository allows the user to automatically scale a 3D model/mesh/point cloud on Agisoft Metashape

Metashape-Utils This repository allows the user to automatically scale a 3D model/mesh/point cloud on Agisoft Metashape, given a set of 2D coordinates

INSCRIBE 4 Nov 07, 2022
SMCA replication There are no extra compiled components in SMCA DETR and package dependencies are minimal

Usage There are no extra compiled components in SMCA DETR and package dependencies are minimal, so the code is very simple to use. We provide instruct

22 May 06, 2022
Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs In this work, we propose an algorithm DP-SCAFFOLD(-warm), whic

19 Nov 10, 2022
UT-Sarulab MOS prediction system using SSL models

UTMOS: UTokyo-SaruLab MOS Prediction System Official implementation of "UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022" submitted to INTERSP

sarulab-speech 58 Nov 22, 2022
load .txt to train YOLOX, same as Yolo others

YOLOX train your data you need generate data.txt like follow format (per line- one image). prepare one data.txt like this: img_path1 x1,y1,x2,y2,clas

LiMingf 18 Aug 18, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020

XDVioDet Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020. The proj

peng 64 Dec 12, 2022
Official repository for ABC-GAN

ABC-GAN The work represented in this repository is the result of a 14 week semesterthesis on photo-realistic image generation using generative adversa

IgorSusmelj 10 Jun 23, 2022
A collection of models for image<->text generation in ACM MM 2021.

Bi-directional Image and Text Generation UMT-BITG (image & text generator) Unifying Multimodal Transformer for Bi-directional Image and Text Generatio

Multimedia Research 63 Oct 30, 2022
HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

HTSeq DEVS: https://github.com/htseq/htseq DOCS: https://htseq.readthedocs.io A Python library to facilitate programmatic analysis of data from high-t

HTSeq 57 Dec 20, 2022
(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

BRNet Introduction This is a release of the code of our paper Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds,

86 Oct 05, 2022
Hierarchical User Intent Graph Network for Multimedia Recommendation

Hierarchical User Intent Graph Network for Multimedia Recommendation This is our Pytorch implementation for the paper: Hierarchical User Intent Graph

6 Jan 05, 2023
A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️

hf-hub-lightning A callback for pushing lightning models to the Hugging Face Hub. Note: I made this package for myself, mostly...if folks seem to be i

Nathan Raw 27 Dec 14, 2022