4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Overview

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

[Project] [Paper]

Installation:

Our method requires the same dependencies as SMPLify-X and OpenPose. We refer to the official implementation fo SMPLify-X and OpenPose for installation details.

Our method also needs the installation of Chamfer Pytorch to calculate the chamfer distnace for enforceing human-scene constraints

Data Preparation:

Step 1: Dump video frames with desired fps (30) with utils/dump_videos.py. Run utils/split_frames to segment videos into equally long subatom clips. Repack frames to videos with utils/pack_videos.py (This is for faster openpose I/O).

Step 2: Run openpose_call.py under openpose folder to get human body keypoints, then run utils/openpose_helper to rename keypoint.json and run utils/openpose_filter.py to keep the most confident human keypoints.

Step 3: Run Smplify-X model with specified focal length and data directory. This step may take up to several hours. For instance:

python3 smplifyx/main.py --config cfg_files/fit_smplx.yaml  --data_folder /home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1 --output_folder /home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1/body_gen --visualize="False" --model_folder ./models --vposer_ckpt ./vposer --part_segm_fn smplx_parts_segm.pkl --focal_length 694.0

Step 4: Run Colmap for to generate scene mesh and camera trajectory. This step make take up to several hours depneding on the complexity of the scene. Then Run utils/camerpose_helper and utils/pointscloud_helper.py to generate desired points cloud file and camera pose.

Joint Optimization with 3D Scene Context:

Run global_optimization.py to conduct temproal smoothing and enforce human-scene constraints:

python3 global_optimization.py '/home/miao/data/rylm/packed_data/miao_mainbuidling_0-1/body_gen' '/home/miao/data/rylm/packed_data/miao_mainbuidling_0-1/smoothed_body

The resulting data should be organized as following:

  • datafolder:
    • videoname:
      • images: folder that contains all video frames
      • keypoints: folder that contains all body keypoints
      • body_gen: folder that contains all body mesh files:
      • smoothed_boyd: folder that contains all jointly-optimized body mesh files:
      • camera_pose.txt: text file that contains camera pose at each temporal footprint
      • meshed-poisson.ply: scene mesh file from dense reconstruction
      • camera.txt: text file that contains camera parameters
      • xyz.ply point cloud file. (use meash lab to convert .xyz file to .ply file)

Visualization in the World Coordinate:

Run global_vis.py to transform the body mesh in pivot coordinate to world coordinate. By default the viewpoint of open3d is the initial position camera trajectory. Setting bool flag to 'True' will resulting into a open3d viewpoint moving the same way as camera viewer.

python3 global_vis.py '/home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1/' False

Visualization in the Egocentric Coordinate:

Run vis.py to view recosntrcuted body mesh on image plane.

python3 vis.py '/home/miao/data/rylm/segmented_data/miao_mainbuilding_0-1/'

Citation

If you find our code useful in your research, please use the following BibTeX entry for citation.

@inproceedings{liu20204d,
  title={4D Human Body Capture from Egocentric Video via 3D Scene Grounding},
  author={Liu, Miao and Yang, Dexin and Zhang, Yan and Cui, Zhaopeng and Rehg, James M and Tang, Siyu},
  booktitle={3DV},
  year={2021}
}
Owner
Miao Liu
Miao Liu
scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started

scikit-learn 52.5k Jan 08, 2023
⚡ H2G-Net for Semantic Segmentation of Histopathological Images

H2G-Net This repository contains the code relevant for the proposed design H2G-Net, which was introduced in the manuscript "Hybrid guiding: A multi-re

André Pedersen 8 Nov 24, 2022
The Curious Layperson: Fine-Grained Image Recognition without Expert Labels (BMVC 2021)

The Curious Layperson: Fine-Grained Image Recognition without Expert Labels Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi Code

Subhabrata Choudhury 18 Dec 27, 2022
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 398 Dec 30, 2022
Image marine sea litter prediction Shiny

MARLITE Shiny app for floating marine litter detection in aerial images. This directory contains the instructions and software needed to install the S

19 Dec 22, 2022
Multi-agent reinforcement learning algorithm and environment

Multi-agent reinforcement learning algorithm and environment [en/cn] Pytorch implements multi-agent reinforcement learning algorithms including IQL, Q

万鲲鹏 7 Sep 20, 2022
Direct design of biquad filter cascades with deep learning by sampling random polynomials.

IIRNet Direct design of biquad filter cascades with deep learning by sampling random polynomials. Usage git clone https://github.com/csteinmetz1/IIRNe

Christian J. Steinmetz 55 Nov 02, 2022
A Traffic Sign Recognition Project which can help the driver recognise the signs via text as well as audio. Can be used at Night also.

Traffic-Sign-Recognition In this report, we propose a Convolutional Neural Network(CNN) for traffic sign classification that achieves outstanding perf

Mini Project 64 Nov 19, 2022
ANN model for prediction a spatio-temporal distribution of supercooled liquid in mixed-phase clouds using Doppler cloud radar spectra.

VOODOO Revealing supercooled liquid beyond lidar attenuation Explore the docs » Report Bug · Request Feature Table of Contents About The Project Built

remsens-lim 2 Apr 28, 2022
Search and filter videos based on objects that appear in them using convolutional neural networks

Thingscoop: Utility for searching and filtering videos based on their content Description Thingscoop is a command-line utility for analyzing videos se

Anastasis Germanidis 354 Dec 04, 2022
A light-weight image labelling tool for Python designed for creating segmentation data sets.

An image labelling tool for creating segmentation data sets, for Django and Flask.

117 Nov 21, 2022
这是一个unet-pytorch的源码,可以训练自己的模型

Unet:U-Net: Convolutional Networks for Biomedical Image Segmentation目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Downl

Bubbliiiing 567 Jan 05, 2023
Tensorflow implementation for Self-supervised Graph Learning for Recommendation

If the compilation is successful, the evaluator of cpp implementation will be called automatically. Otherwise, the evaluator of python implementation will be called.

152 Jan 07, 2023
QA-GNN: Question Answering using Language Models and Knowledge Graphs

QA-GNN: Question Answering using Language Models and Knowledge Graphs This repo provides the source code & data of our paper: QA-GNN: Reasoning with L

Michihiro Yasunaga 434 Jan 04, 2023
DeFMO: Deblurring and Shape Recovery of Fast Moving Objects (CVPR 2021)

Evaluation, Training, Demo, and Inference of DeFMO DeFMO: Deblurring and Shape Recovery of Fast Moving Objects (CVPR 2021) Denys Rozumnyi, Martin R. O

Denys Rozumnyi 139 Dec 26, 2022
Official code for our EMNLP2021 Outstanding Paper MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks

MindCraft Authors: Cristian-Paul Bara*, Sky CH-Wang*, Joyce Chai This is the official code repository for the paper (arXiv link): Cristian-Paul Bara,

Situated Language and Embodied Dialogue (SLED) Research Group 14 Dec 29, 2022
A library for graph deep learning research

Documentation | Paper [JMLR] | Tutorials | Benchmarks | Examples DIG: Dive into Graphs is a turnkey library for graph deep learning research. Why DIG?

DIVE Lab, Texas A&M University 1.3k Jan 01, 2023
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
Official implementation of Deep Burst Super-Resolution

Deep-Burst-SR Official implementation of Deep Burst Super-Resolution Publication: Deep Burst Super-Resolution. Goutam Bhat, Martin Danelljan, Luc Van

Goutam Bhat 113 Dec 19, 2022
Domain Adaptation with Invariant RepresentationLearning: What Transformations to Learn?

Domain Adaptation with Invariant RepresentationLearning: What Transformations to Learn? Repository Structure: DSAN |└───amazon |    └── dataset (Amazo

DMIRLAB 17 Jan 04, 2023