[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

Last update: Dec 09, 2022

GH-Feat - Generative Hierarchical Features from Synthesizing Images

Figure: Training framework of GH-Feat.

Generative Hierarchical Features from Synthesizing Images
Yinghao Xu*, Yujun Shen*, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)

[Paper] [Project Page]

In this work, we show that well-trained GAN generators can be used as training supervision to learn hierarchical visual features. We call this feature the Generative Hierarchical Feature (GH-Feat). Learned with a novel hierarchical encoder, GH-Feat facilitates both discriminative and generative visual tasks, including face verification, landmark detection, layout prediction, transfer learning, style mixing, and image editing.

Usage

Environment

Before running the code, please set up the environment with:

conda env create -f environment.yml
conda activate ghfeat

Testing

The following script can be used to extract GH-Feat from a list of images.

python extract_ghfeat.py ${ENCODER_PATH} ${IMAGE_LIST} -o ${OUTPUT_DIR}
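
The image list passed to the script has to be prepared beforehand. The following is a minimal sketch of one way to build it, assuming that ${IMAGE_LIST} is a plain-text file with one image path per line (this format is an assumption, so please check extract_ghfeat.py before relying on it). The helper name write_image_list and the set of accepted extensions are hypothetical and not part of this repo.

```python
from pathlib import Path

# Assumption: only these extensions are treated as images.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def write_image_list(image_dir, list_path):
    """Write one image path per line to `list_path`.

    Hypothetical helper, not part of the GH-Feat codebase. It assumes the
    extraction script reads a text file with one image path per line.
    Returns the number of images listed.
    """
    paths = sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return len(paths)


# Example usage (paths are placeholders):
# write_image_list("data/faces", "face_list.txt")
```

The resulting file can then be supplied as ${IMAGE_LIST} in the command above.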

We provide some pre-trained encoders for inference.

Path	Description
face_256x256	GH-Feat encoder trained on FFHQ dataset.
tower_256x256	GH-Feat encoder trained on LSUN Tower dataset.
bedroom_256x256	GH-Feat encoder trained on LSUN Bedroom dataset.

Training

Given a well-trained StyleGAN generator, our hierarchical encoder is trained with the objective of image reconstruction.

python train_ghfeat.py \
       ${TRAIN_DATA_PATH} \
       ${VAL_DATA_PATH} \
       ${GENERATOR_PATH} \
       --num_gpus ${NUM_GPUS}

Here, train_data and val_data can be created by this script. Note that the official StyleGAN repo prepares each dataset in a multi-scale manner, producing one tfrecords file per resolution, whereas our encoder training only requires the data at the largest resolution. Hence, please specify the path to the tfrecords file at the target resolution rather than the directory containing all the tfrecords files.
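
Picking the right tfrecords file by hand is easy to get wrong, so here is a small sketch of how it could be automated. It assumes the file naming used by the StyleGAN dataset tool, where each file ends in -rNN.tfrecords and NN is log2 of the resolution (for example, r08 for 256x256). The helper name tfrecords_for_resolution is hypothetical and not part of this repo.

```python
from pathlib import Path


def tfrecords_for_resolution(dataset_dir, resolution):
    """Return the tfrecords file in `dataset_dir` that stores `resolution`.

    Hypothetical helper. Assumes files are named '<name>-rNN.tfrecords',
    where NN is log2(resolution), e.g. 'r08' for 256x256 images.
    """
    level = resolution.bit_length() - 1
    if resolution < 4 or (1 << level) != resolution:
        raise ValueError(f"resolution must be a power of two >= 4, got {resolution}")

    matches = sorted(Path(dataset_dir).glob(f"*-r{level:02d}.tfrecords"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one '*-r{level:02d}.tfrecords' file in "
            f"{dataset_dir}, found {len(matches)}"
        )
    return matches[0]


# Example usage (directory is a placeholder):
# train_path = tfrecords_for_resolution("datasets/ffhq", 256)
```

The returned path is what would be passed as ${TRAIN_DATA_PATH} or ${VAL_DATA_PATH}.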

Users can also train the encoder with slurm:

srun.sh ${PARTITION} ${NUM_GPUS} \
        python train_ghfeat.py \
               ${TRAIN_DATA_PATH} \
               ${VAL_DATA_PATH} \
               ${GENERATOR_PATH} \
               --num_gpus ${NUM_GPUS}

We provide some pre-trained generators as follows.

Path	Description
face_256x256	StyleGAN trained on FFHQ dataset.
tower_256x256	StyleGAN trained on LSUN Tower dataset.
bedroom_256x256	StyleGAN trained on LSUN Bedroom dataset.

Codebase Description

Most of the code is borrowed directly from the StyleGAN repo.
Structure of the proposed hierarchical encoder: training/networks_ghfeat.py
Training loop of the encoder: training/training_loop_ghfeat.py
To feed GH-Feat produced by the encoder to the generator as layer-wise style codes, we slightly modify training/networks_stylegan.py. (See Line 263 and Line 477).
Main script for encoder training: train_ghfeat.py.
Script for extracting GH-Feat from images: extract_ghfeat.py.
VGG model for computing perceptual loss: perceptual_model.py.
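
To give a feel for what "layer-wise style codes" means in terms of shapes, the sketch below lists the synthesis layers of a StyleGAN generator and the size of the style each one consumes. It assumes the default channel settings of the official StyleGAN (fmap_base=8192, fmap_max=512) and that each layer's style holds one scale and one bias per channel, as in StyleGAN's AdaIN. The pre-trained generators listed above may use different settings, and the function name style_code_layout is hypothetical.

```python
def style_code_layout(resolution, fmap_base=8192, fmap_max=512):
    """List (layer_resolution, channels, style_dim) per synthesis layer.

    Illustrative only. Assumes official StyleGAN defaults: two styled
    layers per resolution from 4x4 up to `resolution`, and a style of
    size 2 * channels (scale and bias) for each layer.
    """
    log2_res = resolution.bit_length() - 1
    if resolution < 4 or (1 << log2_res) != resolution:
        raise ValueError(f"resolution must be a power of two >= 4, got {resolution}")

    layout = []
    for res_log2 in range(2, log2_res + 1):
        # Channel count halves at higher resolutions, capped at fmap_max.
        channels = min(fmap_base // 2 ** (res_log2 - 1), fmap_max)
        for _ in range(2):  # two styled layers per resolution
            layout.append((2 ** res_log2, channels, 2 * channels))
    return layout


if __name__ == "__main__":
    for layer_idx, (res, channels, style_dim) in enumerate(style_code_layout(256)):
        print(f"layer {layer_idx:2d}: {res}x{res}, {channels} channels, style dim {style_dim}")
```

Under these assumptions, a 256x256 generator has 14 styled layers, so the encoder would need to produce 14 style codes whose sizes shrink toward the higher-resolution layers.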

Results

We show some results achieved by GH-Feat on a variety of downstream visual tasks.

Discriminative Tasks

Indoor scene layout prediction

Facial landmark detection

Face verification (face reconstruction)

Generative Tasks

Image harmonization

Global editing

Local editing

Multi-level style mixing

BibTeX

@inproceedings{xu2021generative,
  title     = {Generative Hierarchical Features from Synthesizing Images},
  author    = {Xu, Yinghao and Shen, Yujun and Zhu, Jiapeng and Yang, Ceyuan and Zhou, Bolei},
  booktitle = {CVPR},
  year      = {2021}
}
