A learning-based data collection tool for human segmentation

Last update: Jun 24, 2022

FullBodyFilter

Overview

Human segmentation is the machine learning task of identifying and extracting the humans in a picture, and it is difficult to do well. It is usually done with a convolutional neural network. Achieving an accurate and robust model requires a large amount of training data covering varied human poses. Collecting and labeling training data by hand takes a lot of time and resources. This project explores an alternative: using automation to collect and label pre-existing data from internet videos.

The model this project targets is the DTEN ME model, which is used for virtual backgrounds in Zoom meetings.

OpenPose is used to filter the video for suitable frames, in particular frames that show the full body of a single person. Mask R-CNN is the teacher model that generates the training labels. To find the images on which the ME model performs poorly, the ME masks are compared with the Mask R-CNN masks. The result is a set of images and masks that can be used as training data.
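The two decisions described above (keep a frame only if it shows one full-body person, then save it only if the two masks disagree) can be sketched as follows. This is a minimal illustration, not the project's actual code. It assumes OpenPose's BODY_25 JSON output, where each person carries a flat `pose_keypoints_2d` list of (x, y, confidence) triples. The function names, the set of required joints, the confidence threshold, and the use of intersection over union (IoU) as the mask comparison are all assumptions made for this example. Masks are plain nested lists so that the sketch has no dependencies.

```python
# Hypothetical sketch: function names, thresholds, and the IoU criterion
# are assumptions, not taken from the project.

# BODY_25 indices of joints that must be visible for a "full body" frame:
# nose (0), right/left wrist (4, 7), right/left ankle (11, 14).
# The choice of joints is an assumption.
REQUIRED_KEYPOINTS = (0, 4, 7, 11, 14)


def is_single_full_body(openpose_json, min_conf=0.3):
    """Return True if exactly one person is detected with all required joints."""
    people = openpose_json.get("people", [])
    if len(people) != 1:
        return False
    flat = people[0].get("pose_keypoints_2d", [])
    # Keypoints are stored flat as [x0, y0, c0, x1, y1, c1, ...];
    # every third value starting at index 2 is a confidence score.
    confidences = flat[2::3]
    if len(confidences) <= max(REQUIRED_KEYPOINTS):
        return False
    return all(confidences[i] >= min_conf for i in REQUIRED_KEYPOINTS)


def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks (equal-sized 2D lists)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree perfectly.
    return intersection / union if union else 1.0


def should_save(teacher_mask, student_mask, max_iou=0.9):
    """Keep the frame when the student mask disagrees with the teacher mask."""
    return mask_iou(teacher_mask, student_mask) < max_iou


if __name__ == "__main__":
    frame = {"people": [{"pose_keypoints_2d": [10.0, 20.0, 0.9] * 25}]}
    teacher = [[0, 1, 1], [0, 1, 1]]  # stands in for a Mask R-CNN mask
    student = [[0, 1, 0], [0, 0, 0]]  # stands in for an ME mask
    print(is_single_full_body(frame))     # True
    print(mask_iou(teacher, student))     # 0.25
    print(should_save(teacher, student))  # True
```

A low IoU means the ME mask differs substantially from the teacher mask, so under these assumptions the frame and its Mask R-CNN mask would be kept as a new training example.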

Overview of Program

A full report of the system design and implementation details can be found in doc.

Sample Results

Examples of saved training data. In each image, the bottom left shows the Mask R-CNN mask and the bottom right shows the ME mask.

Usage

This project relies on OpenPose and Mask R-CNN and all of their dependencies. Instructions for setting up each one are found in their respective directories.

Documentation on how to use the scripts is located in doc.

Owner

Robert Jiang