HeadNeRF: A Real-time NeRF-based Parametric Head Model

This repository contains a PyTorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)". Authors: Yang Hong, Bo Peng, Haiyao Xiao, Ligang Liu and Juyong Zhang*.

| Project Page | Paper |

This code has been tested on Ubuntu 20.04/18.04 and contains the following parts:

  1. An interactive GUI that allows users to utilize HeadNeRF to directly edit the generated images’ rendering pose and various semantic attributes.
  2. A fitting framework for obtaining the latent code embedding in HeadNeRF of a single image.

Requirements

  • python3

  • torch>=1.8.1

  • torchvision

  • imageio

  • kornia

  • numpy

  • opencv-python==4.3.0.36

  • pyqt5

  • tqdm

  • face-alignment

  • Pillow, plotly, matplotlib, scipy, scikit-image

We recommend running the following commands to create an anaconda environment called "headnerf" and automatically install the above requirements.

    conda env create -f environment.yaml
    conda activate headnerf
  • PyTorch

    Please refer to the official PyTorch installation instructions for details.

  • PyTorch3D

    It is recommended to install PyTorch3D from a local clone; a quick sanity check for the whole environment is sketched below.

    git clone https://github.com/facebookresearch/pytorch3d.git
    cd pytorch3d && pip install -e . && cd ..
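
A quick way to verify the installation is to import the packages and confirm that CUDA is visible. The following is a minimal sketch, not part of this repository:

    # hypothetical sanity check -- not part of this repository
    import torch
    import torchvision
    import pytorch3d

    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("torchvision:", torchvision.__version__)
    print("pytorch3d:", pytorch3d.__version__)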

Note:

  • In order to run the code smoothly, a GPU at least as powerful as a 1080Ti is recommended.
  • This code can also be run on Windows 10 when the above-mentioned requirements are satisfied.

Getting Started

Download ConfigModels.zip, TrainedModels.zip, and LatentCodeSamples.zip, then unzip them to the root dir of this project.

Other links: Google Drive, OneDrive

The folder structure is as follows:

headnerf
├── ConfigModels
│   ├── faceparsing_model.pth
│   ├── nl3dmm_dict.pkl
│   └── nl3dmm_net_dict.pth
│
├── TrainedModels
│   ├── model_Reso32.pth
│   ├── model_Reso32HR.pth
│   └── model_Reso64.pth
│
└── LatentCodeSamples
    ├── model_Reso32
    │   ├── S001_E01_I01_P02.pth
    │   └── ...
    ├── model_Reso32HR
    │   ├── S001_E01_I01_P02.pth
    │   └── ...
    └── model_Reso64
        ├── S001_E01_I01_P02.pth
        └── ...

Note:

  • faceparsing_model.pth is from face-parsing.PyTorch, and we utilize it to help generate the head mask.

  • nl3dmm_dict.pkl and nl3dmm_net_dict.pth are from 3D Face from X, and they contain the parameters of the 3DMM.

  • model_Reso32.pth, model_Reso32HR.pth and model_Reso64.pth are our pre-trained models, and their properties are as follows:

    | Pre-trained Model | Feature Map's Reso | Result's Reso | GPU 1080Ti | GPU 3090 |
    | ----------------- | ------------------ | ------------- | ---------- | -------- |
    | model_Reso32      | 32 x 32            | 256 x 256     | ~14 fps    | ~40 fps  |
    | model_Reso32HR    | 32 x 32            | 512 x 512     | ~13 fps    | ~30 fps  |
    | model_Reso64      | 64 x 64            | 512 x 512     | ~3 fps     | ~10 fps  |
  • LatentCodeSamples.zip contains some latent codes that correspond to some given images.
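
The LatentCodeSamples .pth files can be inspected before use. The snippet below is a hedged sketch that assumes they are ordinary torch checkpoints; the actual key names may differ:

# inspect a latent-code sample -- a hypothetical helper, not part of this repository
import torch

ckpt = torch.load("LatentCodeSamples/model_Reso64/S001_E01_I01_P02.pth", map_location="cpu")
if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        # print tensor shapes without assuming any specific key names
        info = tuple(value.shape) if torch.is_tensor(value) else type(value).__name__
        print(key, info)
else:
    print(type(ckpt))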

The Interactive GUI

# GUI for editing the generated images’ rendering pose and various semantic attributes.
python MainGUI.py --model_path "TrainedModels/model_Reso64.pth"

Args:

  • model_path is the path of the specified pre-trained model.

An interactive interface like the first figure of this document will be generated after executing the above command.

The fitting framework

This part provides a framework for fitting a single image using HeadNeRF. In addition, some test images are provided in the test_data/single_images dir. These images are from the FFHQ dataset and were not used to build HeadNeRF's models.

Data Preprocess

# generating head's mask.
python DataProcess/Gen_HeadMask.py --img_dir "test_data/single_images"

# generating 68-facial-landmarks by face-alignment, which is from 
# https://github.com/1adrianb/face-alignment
python DataProcess/Gen_Landmark.py --img_dir "test_data/single_images"

# generating the 3DMM parameters
python Fitting3DMM/FittingNL3DMM.py --img_size 512 \
                                    --intermediate_size 256  \
                                    --batch_size 9 \
                                    --img_dir "test_data/single_images"

The generated results will be saved to the directory specified by --img_dir.
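
The three steps above can also be chained from a single script. The following is a minimal sketch using subprocess; the script names and flags are copied from this README:

# run all preprocessing steps -- a hypothetical wrapper, not part of this repository
import subprocess

IMG_DIR = "test_data/single_images"

subprocess.run(["python", "DataProcess/Gen_HeadMask.py", "--img_dir", IMG_DIR], check=True)
subprocess.run(["python", "DataProcess/Gen_Landmark.py", "--img_dir", IMG_DIR], check=True)
subprocess.run(["python", "Fitting3DMM/FittingNL3DMM.py",
                "--img_size", "512", "--intermediate_size", "256",
                "--batch_size", "9", "--img_dir", IMG_DIR], check=True)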

Fitting a Single Image

# Fitting a single image using HeadNeRF
python FittingSingleImage.py --model_path "TrainedModels/model_Reso32HR.pth" \
                             --img "test_data/single_images/img_000037.png" \
                             --mask "test_data/single_images/img_000037_mask.png" \
                             --para_3dmm "test_data/single_images/img_000037_nl3dmm.pkl" \
                             --save_root "test_data/fitting_res" \
                             --target_embedding "LatentCodeSamples/*/S025_E14_I01_P02.pth"

Args:

  • para_3dmm is the 3DMM parameter of the input image; it is provided in advance to initialize the latent codes of the corresponding image.
  • target_embedding is a head's latent code embedding in HeadNeRF and is an optional input. If it is provided, linear interpolation is performed between the fitted latent code embedding and the target latent code embedding, and the corresponding head images are generated with HeadNeRF.
  • save_root is the directory where the following results are saved.

Results:

  • The image that merges the input image and the fitting result.
  • The dynamic image generated by continuously changing the rendering pose of the fitting result.
  • The dynamic image generated by performing linear interpolation on the fitting latent code embedding and the target latent code embedding.
  • The latent codes (.pth file) of the fitting result.

Note:

  • Fitting a single image based on model_Reso32.pth requires at least ~5 GB of GPU memory.
  • Fitting a single image based on model_Reso32HR.pth requires at least ~6 GB of GPU memory.
  • Fitting a single image based on model_Reso64.pth requires at least ~13 GB of GPU memory.
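
To fit several of the provided test images in one go, a small driver can loop over the directory. This is a hedged sketch that assumes the file-naming convention shown above (img_XXXXXX.png, img_XXXXXX_mask.png, img_XXXXXX_nl3dmm.pkl) after preprocessing:

# batch fitting -- a hypothetical driver, not part of this repository
import glob
import os
import subprocess

IMG_DIR = "test_data/single_images"
SAVE_ROOT = "test_data/fitting_res"

for img_path in sorted(glob.glob(os.path.join(IMG_DIR, "img_*.png"))):
    stem, _ = os.path.splitext(img_path)
    if stem.endswith("_mask"):
        continue  # skip generated mask images
    subprocess.run([
        "python", "FittingSingleImage.py",
        "--model_path", "TrainedModels/model_Reso32HR.pth",
        "--img", img_path,
        "--mask", stem + "_mask.png",
        "--para_3dmm", stem + "_nl3dmm.pkl",
        "--save_root", SAVE_ROOT,
        # --target_embedding is optional and omitted here
    ], check=True)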

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{hong2021headnerf,
     author     = {Yang Hong and Bo Peng and Haiyao Xiao and Ligang Liu and Juyong Zhang},
     title      = {HeadNeRF: A Real-time NeRF-based Parametric Head Model},
     booktitle  = {{IEEE/CVF} Conference on Computer Vision and Pattern Recognition (CVPR)},
     year       = {2022}
}

If you have questions, please contact [email protected].

Acknowledgments

License

Academic or non-profit organization noncommercial research use only.
