A learning-based data collection tool for human segmentation

Last update: Jun 24, 2022

FullBodyFilter

Overview

Human segmentation is the difficult machine learning task of identifying and extracting the humans in an image. It is most often done with a convolutional neural network. Training an accurate and robust model requires large amounts of data covering varied human poses, and collecting and labeling that training data by hand takes a lot of time and resources. This project explores an alternative: using automation to collect and label existing data from internet videos.

The project targets the DTEN ME model, which is used for virtual backgrounds in Zoom meetings.

OpenPose filters the video for suitable frames, specifically frames that show a single person's full body. Mask R-CNN serves as the teacher model that generates the training labels. To find the images on which the ME model performs poorly, the ME masks are compared against the Mask R-CNN masks. The result is a set of images and masks that can be used as training data.
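The two decisions described above (keep a frame, then flag it as a poor ME result) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes keypoints arrive as an array of shape (num_people, num_keypoints, 3) holding (x, y, confidence), and it assumes intersection-over-union (IoU) as the mask comparison, which the description above does not specify. The function names and both thresholds are hypothetical, and requiring every keypoint to be confident is a deliberate simplification of "full body".

```python
import numpy as np

# Hypothetical thresholds; the values used by the project are not stated here.
MIN_KEYPOINT_CONFIDENCE = 0.3
MAX_IOU_FOR_HARD_EXAMPLE = 0.8


def is_single_full_body(people_keypoints, min_conf=MIN_KEYPOINT_CONFIDENCE):
    """Return True if exactly one person is detected and all keypoints are confident.

    people_keypoints: array-like of shape (num_people, num_keypoints, 3),
    where the last axis is (x, y, confidence).
    """
    people = np.asarray(people_keypoints, dtype=float)
    # Reject empty detections, malformed input, and multi-person frames.
    if people.ndim != 3 or people.shape[0] != 1:
        return False
    # Simplification: treat the body as fully visible only if every
    # keypoint confidence reaches the threshold.
    return bool(np.all(people[0, :, 2] >= min_conf))


def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks are empty, so treat them as agreeing
    return float(np.logical_and(a, b).sum() / union)


def is_hard_example(teacher_mask, student_mask, max_iou=MAX_IOU_FOR_HARD_EXAMPLE):
    """Flag a frame when the student mask disagrees with the teacher mask."""
    return mask_iou(teacher_mask, student_mask) < max_iou
```

In this sketch the Mask R-CNN mask plays the role of `teacher_mask` and the ME mask plays the role of `student_mask`. A frame that passes `is_single_full_body` and is flagged by `is_hard_example` would be saved together with its teacher mask.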

Overview of Program

A full report of the system design and implementation details can be found in doc.

Sample Results

Examples of saved training data. In each image, the bottom left is the Mask R-CNN mask and the bottom right is the ME mask.

Usage

This project relies on OpenPose and Mask R-CNN and all of their dependencies. Setup instructions for each are in their respective directories.

Documentation on how to use the scripts is located in doc.

Owner

Robert Jiang
