Tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos"

Last update: Jan 06, 2023

Overview

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

This repository is the official TensorFlow Python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos", published at CVPR 2021 (Oral Presentation, Best Paper Nominee).

Project Page
TikTok Dataset

This codebase provides:

Inference code
Training code
Visualization code

Requirements

(This code is tested with tensorflow-gpu 1.14.0, Python 3.7.4, CUDA 10 (version 10.0.130) and cuDNN 7 (version 7.4.2).)

numpy
imageio
matplotlib
scikit-image
scipy==1.1.0
tensorflow-gpu==1.14.0
gast==0.2.2
Pillow

Installation

Run the following command to install all pip packages:

pip install -r requirements.txt

If the installation fails, you can use the NVIDIA TensorFlow Docker container (tensorflow:19.02-py3) instead:

sudo docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorflow:19.02-py3

Then install the requirements:

pip install -r requirements.txt

Inference Demo

Input:

Test images must be 256x256. Each test sample requires three .png files that share the same name prefix (for an example, see the demo data in the "test_data" folder):

name_img.png : The 256x256x3 test image
name_mask.png : The corresponding 256x256 mask. You can use any off-the-shelf tool, such as removebg, to remove the background and get the mask.
name_dp.png : The 256x256x3 corresponding DensePose.
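
Before running inference, it can help to confirm that every sample has all three files and that each one is 256x256. The following is a minimal sketch of such a check, not part of the repository: the function names `png_size` and `check_test_data` are hypothetical, and it uses only the Python standard library, reading the width and height directly from the PNG header.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# File suffixes described in the input list above.
SUFFIXES = ("_img.png", "_mask.png", "_dp.png")


def png_size(path):
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    # Width and height are stored as two big-endian 32-bit unsigned integers.
    return struct.unpack(">II", header[16:24])


def check_test_data(data_dir, expected=(256, 256)):
    """Map each sample name to a list of problems (an empty list means OK)."""
    data_dir = Path(data_dir)
    # A sample is identified by the prefix of its *_img.png file.
    names = sorted(p.name[: -len("_img.png")] for p in data_dir.glob("*_img.png"))
    report = {}
    for name in names:
        problems = []
        for suffix in SUFFIXES:
            path = data_dir / (name + suffix)
            if not path.is_file():
                problems.append(f"missing {path.name}")
            elif png_size(path) != expected:
                problems.append(f"{path.name} is not {expected[0]}x{expected[1]}")
        report[name] = problems
    return report
```

Calling `check_test_data("./test_data")` returns a dictionary keyed by sample name; any non-empty list points to a file that is missing or has the wrong size.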

Output:

Running the demo generates the following:

name.txt : The 256x256 predicted depth
name_mesh.obj : The reconstructed mesh. You can use any off-the-shelf tool, such as MeshLab, to visualize the mesh.

name_normal_1.txt, name_normal_2.txt, name_normal_3.txt : The three 256x256 components of the predicted normal. Concatenating them along the third axis gives the 256x256x3 normal map.
name_results.png : Visualization of the predicted depth heatmap and the predicted normal map.
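
The concatenation of the three normal files along the third axis can be sketched with NumPy as follows. This is an illustrative sketch, not code from the repository: `stack_normals` and `load_normal_map` are hypothetical helper names, and the loader assumes each .txt file is whitespace-delimited text that `np.loadtxt` can read with its default settings.

```python
import numpy as np


def stack_normals(n1, n2, n3):
    """Stack three HxW component arrays into one HxWx3 normal map."""
    return np.stack([n1, n2, n3], axis=-1)


def load_normal_map(prefix):
    """Load <prefix>_normal_1.txt .. <prefix>_normal_3.txt as one HxWx3 array.

    Assumption: the files are whitespace-delimited text, which is the
    default format expected by np.loadtxt.
    """
    parts = [np.loadtxt(f"{prefix}_normal_{i}.txt") for i in (1, 2, 3)]
    return stack_normals(*parts)
```

For the demo output, `load_normal_map("./test_data/infer_out/name")` would return an array of shape (256, 256, 3), where `name` stands for the sample's name prefix.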

Run the demo:

Download the weights from here and extract them in the main repository directory, or run the following commands in the main repository directory:

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1UOHkmwcWpwt9r11VzOCa_CVamwHVaobV' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1UOHkmwcWpwt9r11VzOCa_CVamwHVaobV" -O model.zip && rm -rf /tmp/cookies.txt

unzip model.zip

Then run the inference script:

python HDNet_Inference.py

In lines 26 to 29 of the script, under "test path and outpath", you can set the input directory (default: './test_data'), the output directory (default: './test_data/infer_out'), and whether to save the visualization (default: True).

More Results

Training

To train the network, go to the training folder and read its README file.

MATLAB Visualization

If you want to generate visualizations similar to those on the website, go to the MATLAB_Visualization folder and run

make_video.m

In lines 7 to 14, you can choose the test folder (default: test_data) and the image name to process (default: 0043). The script generates a video of the prediction from different views (default: "test_data/infer_out/video/0043/video.avi"). Generating the 164 angles takes around 2 minutes.

Note that this visualization always generates a 672 × 512 video. You may want to resize your video accordingly for your own test data.

Citation

If you find the code or our dataset useful in your research, please consider citing the paper.

@InProceedings{Jafarian_2021_CVPR_TikTok,
    author    = {Jafarian, Yasamin and Park, Hyun Soo},
    title     = {Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {12753-12762}}

Owner

Yasamin Jafarian
