[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Overview

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

1S-Lab, Nanyang Technological University  2SenseTime Research  3Shanghai AI Laboratory
*equal contribution  +corresponding author

Accepted to SIGGRAPH 2022 (Journal Track)

TL;DR

AvatarCLIP generate and animate avatars given descriptions of body shapes, appearances and motions.

A tall and skinny female soldier that is arguing. A skinny ninja that is raising both arms. An overweight sumo wrestler that is sitting. A tall and fat Iron Man that is running.

This repository contains the official implementation of AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars.


[Project Page][arXiv][High-Res PDF (166M)][Supplementary Video][Colab Demo]

Updates

[05/2022] Paper uploaded to arXiv. arXiv

[05/2022] Add a Colab Demo for avatar generation! Open In Colab

[05/2022] Support converting the generated avatar to the animatable FBX format! Go checkout how to use the FBX models. Or checkout the instructions for the conversion codes.

[05/2022] Code release for avatar generation part!

[04/2022] AvatarCLIP is accepted to SIGGRAPH 2022 (Journal Track) 🥳 !

Citation

If you find our work useful for your research, please consider citing the paper:

@article{hong2022avatarclip,
    title={AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars},
    author={Hong, Fangzhou and Zhang, Mingyuan and Pan, Liang and Cai, Zhongang and Yang, Lei and Liu, Ziwei},
    journal={ACM Transactions on Graphics (TOG)},
    volume={41},
    number={4},
    articleno={161},
    pages={1--19},
    year={2022},
    publisher={ACM New York, NY, USA},
    doi={10.1145/3528223.3530094},
}

Use Generated FBX Models

Download

Go visit our project page. Go to the section 'Avatar Gallery'. Pick a model you like. Click 'Load Model' below. Click 'Download FBX' link at the bottom of the pop-up viewer.

Import to Your Favourite 3D Software (e.g. Blender, Unity3D)

The FBX models are already rigged. Use your motion library to animate it!

Upload to Mixamo

To make use of the rich motion library provided by Mixamo, you can also upload the FBX model to Mixamo. The rigging process is completely automatic!

Installation

We recommend using anaconda to manage the python environment. The setup commands below are provided for your reference.

git clone https://github.com/hongfz16/AvatarCLIP.git
cd AvatarCLIP
conda create -n AvatarCLIP python=3.7
conda activate AvatarCLIP
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt

Other than the above steps, you should also install neural_renderer following its instructions. Before compiling neural_renderer (or after compiling should also be fine), remember to add the following three lines to neural_renderer/perspective.py after line 19.

x[z<=0] = 0
y[z<=0] = 0
z[z<=0] = 0

This quick fix is for a rendering issue where objects behide the camera will also be rendered. Be careful when using this fixed version of neural_renderer on your other projects, because this fix will cause the rendering process not differentiable.

Data Preparation

Download SMPL Models

Register and download SMPL models here. Put the downloaded models in the folder smpl_models. The folder structure should look like

./
├── ...
└── smpl_models/
    ├── smpl/
        ├── SMPL_FEMALE.pkl
        ├── SMPL_MALE.pkl
        └── SMPL_NEUTRAL.pkl

Download Pretrained Models & Other Data

This download is only for coarse shape generation. You can skip if you only want to use other parts. Download the pretrained weights and other required data here. Put them in the folder AvatarGen so that the folder structure should look like

./
├── ...
└── AvatarGen/
    └── ShapeGen/
        └── data/
            ├── codebook.pth
            ├── model_VAE_16.pth
            ├── nongrey_male_0110.jpg
            ├── smpl_uv.mtl
            └── smpl_uv.obj

Avatar Generation

Coarse Shape Generation

Folder AvatarGen/ShapeGen contains codes for this part. Run the follow command to generate the coarse shape corresponding to the shape description 'a strong man'. We recommend to use the prompt augmentation 'a 3d rendering of xxx in unreal engine' for better results. The generated coarse body mesh will be stored under AvatarGen/ShapeGen/output/coarse_shape.

python main.py --target_txt 'a 3d rendering of a strong man in unreal engine'

Then we need to render the mesh for initialization of the implicit avatar representation. Use the following command for rendering.

python render.py --coarse_shape_obj output/coarse_shape/a_3d_rendering_of_a_strong_man_in_unreal_engine.obj --output_folder ${RENDER_FOLDER}

Shape Sculpting and Texture Generation

Note that all the codes are tested on NVIDIA V100 (32GB memory). Therefore, in order to run on GPUs with lower memory, please try to scale down the network or tune down max_ray_num in the config files. You can refer to confs/examples_small/example.conf or our colab demo for a scale-down version of AvatarCLIP.

Folder AvatarGen/AppearanceGen contains codes for this part. We provide data, pretrained model and scripts to perform shape sculpting and texture generation on a zero-beta body (mean shape defined by SMPL). We provide many example scripts under AvatarGen/AppearanceGen/confs/examples. For example, if we want to generate 'Abraham Lincoln', which is defined in the config file confs/examples/abrahamlincoln.conf, use the following command.

python main.py --mode train_clip --conf confs/examples/abrahamlincoln.conf

Results will be stored in AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln.

If you wish to perform shape sculpting and texture generation on the previously generated coarse shape. We also provide example config files in confs/base_models/astrongman.conf confs/astrongman/*.conf. Two steps of optimization are required as follows.

# Initilization of the implicit avatar
python main.py --mode train --conf confs/base_models/astrongman.conf
# Shape sculpting and texture generation on the initialized implicit avatar
python main.py --mode train_clip --conf confs/astrongman/hulk.conf

Marching Cube

To extract meshes from the generated implicit avatar, one may use the following command.

python main.py --mode validate_mesh --conf confs/examples/abrahamlincoln.conf

The final high resolution mesh will be stored as AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln/meshes/00030000.ply

Convert Avatar to FBX Format

For the convenience of using the generated avatar with modern graphics pipeline, we also provide scripts to rig the avatar and convert to FBX format. See the instructions here.

Motion Generation

TBA

License

Distributed under the MIT License. See LICENSE for more information.

Related Works

There are lots of wonderful works that inspired our work or came around the same time as ours.

Dream Fields enables zero-shot text-driven general 3D object generation using CLIP and NeRF.

Text2Mesh proposes to edit a template mesh by predicting offsets and colors per vertex using CLIP and differentiable rendering.

CLIP-NeRF can manipulate 3D objects represented by NeRF with natural languages or examplar images by leveraging CLIP.

Text to Mesh facilitates zero-shot text-driven general mesh generation by deforming from a sphere mesh guided by CLIP.

MotionCLIP establishes a projection from the CLIP text space to the motion space through supervised training, which leads to amazing text-driven motion generation results.

Acknowledgements

This study is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).

We thank the following repositories for their contributions in our implementation: NeuS, smplx, vposer, Smplx2FBX.

A 3D sparse LBM solver implemented using Taichi

taichi_LBM3D Background Taichi_LBM3D is a 3D lattice Boltzmann solver with Multi-Relaxation-Time collision scheme and sparse storage structure impleme

Jianhui Yang 121 Jan 06, 2023
Repo for FUZE project. I will also publish some Linux kernel LPE exploits for various real world kernel vulnerabilities here. the samples are uploaded for education purposes for red and blue teams.

Linux_kernel_exploits Some Linux kernel exploits for various real world kernel vulnerabilities here. More exploits are yet to come. This repo contains

Wei Wu 472 Dec 21, 2022
Code for CPM-2 Pre-Train

CPM-2 Pre-Train Pre-train CPM-2 此分支为110亿非 MoE 模型的预训练代码,MoE 模型的预训练代码请切换到 moe 分支 CPM-2技术报告请参考link。 0 模型下载 请在智源资源下载页面进行申请,文件介绍如下: 文件名 描述 参数大小 100000.tar

Tsinghua AI 136 Dec 28, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

TensorFlow Examples This tutorial was designed for easily diving into TensorFlow, through examples. For readability, it includes both notebooks and so

Aymeric Damien 42.5k Jan 08, 2023
Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion

Drone Detection using Thermal Signature This repository highlights the work for night-time drone detection using a using an Optris PI Lightweight ther

Chong Yu Quan 6 Dec 31, 2022
PyTorch implementation of normalizing flow models

PyTorch implementation of normalizing flow models

Vincent Stimper 242 Jan 02, 2023
Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs In this work, we propose an algorithm DP-SCAFFOLD(-warm), whic

19 Nov 10, 2022
StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

3k Jan 08, 2023
DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs

DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs Abstract: Image-to-image translation has recently achieved re

yaxingwang 23 Apr 14, 2022
Use tensorflow to implement a Deep Neural Network for real time lane detection

LaneNet-Lane-Detection Use tensorflow to implement a Deep Neural Network for real time lane detection mainly based on the IEEE IV conference paper "To

MaybeShewill-CV 1.9k Jan 08, 2023
CLIP (Contrastive Language–Image Pre-training) for Italian

Italian CLIP CLIP (Radford et al., 2021) is a multimodal model that can learn to represent images and text jointly in the same space. In this project,

Italian CLIP 114 Dec 29, 2022
Hypernetwork-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels

Hypernet-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels The implementation of Hypernet-Ensemble Le

Sungmin Hong 6 Jul 18, 2022
A PaddlePaddle implementation of STGCN with a few modifications in the model architecture in order to forecast traffic jam.

About This repository contains the code of a PaddlePaddle implementation of STGCN based on the paper Spatio-Temporal Graph Convolutional Networks: A D

Tianjian Li 1 Jan 11, 2022
A Python package for faster, safer, and simpler ML processes

Bender 🤖 A Python package for faster, safer, and simpler ML processes. Why use bender? Bender will make your machine learning processes, faster, safe

Otovo 6 Dec 13, 2022
Container : Context Aggregation Network

Container : Context Aggregation Network If you use this code for a paper please cite: @article{gao2021container, title={Container: Context Aggregati

AI2 47 Dec 16, 2022
Collection of generative models in Tensorflow

tensorflow-generative-model-collections Tensorflow implementation of various GANs and VAEs. Related Repositories Pytorch version Pytorch version of th

3.8k Dec 30, 2022
Companion repository to the paper accepted at the 4th ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities

Transfer learning approach to bicycle sharing systems station location planning using OpenStreetMap Companion repository to the paper accepted at the

Politechnika Wrocławska - repozytorium dla informatyków 4 Oct 24, 2022
Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Bidirectional Projection Network for Cross Dimension Scene Understanding CVPR 2021 (Oral) [ Project Webpage ] [ arXiv ] [ Video ] Existing segmentatio

Hu Wenbo 135 Dec 26, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 04, 2022