[ICCV-2021] An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

Introduction

This is an official PyTorch implementation of An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation. [ICCV 2021] PDF

Abstract

Most semi-supervised learning models are consistency-based: they leverage unlabeled images by maximizing the similarity between different augmentations of an image. But when we apply them to human pose estimation, which has an extremely imbalanced class distribution, they often collapse and predict every pixel in unlabeled images as background. We find this is because the decision boundary passes through the high-density areas of the minor class, so more and more pixels are gradually misclassified as background.

In this work, we present a surprisingly simple approach to drive the model. For each image, it composes a pair of easy-hard augmentations and uses the more accurate predictions on the easy image to teach the network to learn pose information of the hard one. The accuracy superiority of teaching signals allows the network to be “monotonically” improved which effectively avoids collapsing. We apply our method to the state-of-the-art pose estimators and it further improves their performance on three public datasets.
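
The easy-hard teaching idea can be illustrated with a small, framework-free sketch. This is not the repository's implementation: the heatmaps are toy nested lists with made-up values, `mse_loss` and `easy_to_hard_loss` are hypothetical helper names, and both predictions are assumed to be already aligned in the same coordinate frame. The point it shows is the direction of teaching: the prediction on the easy view is frozen and used as the target for the hard view, never the other way around.

```python
import copy


def mse_loss(pred, target):
    """Mean squared error between two heatmaps given as nested lists."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count


def easy_to_hard_loss(easy_pred, hard_pred):
    """Consistency loss in which the easy view teaches the hard view.

    The easy prediction is copied into a fixed target, which stands in for
    a stop-gradient: in a real framework only the hard-view prediction
    would receive gradients from this loss.
    """
    target = copy.deepcopy(easy_pred)  # constant teaching signal
    return mse_loss(hard_pred, target)


# Toy 3x3 heatmaps for a single joint (illustrative values only).
easy_pred = [[0.0, 0.1, 0.0],
             [0.1, 0.9, 0.1],
             [0.0, 0.1, 0.0]]
hard_pred = [[0.0, 0.0, 0.0],
             [0.0, 0.3, 0.0],
             [0.0, 0.0, 0.0]]

loss = easy_to_hard_loss(easy_pred, hard_pred)
print(f"easy-to-hard consistency loss: {loss:.4f}")
```

In a PyTorch setting the same asymmetry would typically be obtained by detaching the easy-view prediction before computing the loss.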

Main Results

1. Semi-Supervised Setting

Results on COCO Val2017

| Method | Augmentation | 1K Labels | 5K Labels | 10K Labels |
|---|---|---|---|---|
| Supervised | Affine | 31.5 | 46.4 | 51.1 |
| PoseCons (Single) | Affine | 38.5 | 50.5 | 55.4 |
| PoseCons (Single) | Affine + Joint Cutout | 42.1 | 52.3 | 57.3 |
| PoseDual (Dual) | Affine | 41.5 | 54.8 | 58.7 |
| PoseDual (Dual) | Affine + RandAug | 43.7 | 55.4 | 59.3 |
| PoseDual (Dual) | Affine + Joint Cutout | 44.6 | 55.6 | 59.6 |

We use COCO Subset (1K, 5K and 10K) and TRAIN as the labeled and unlabeled datasets, respectively.

Note:

  • Ground-truth person boxes are used.
  • No flipping test is used.

2. Full labels Setting

Results on COCO Val2017

| Method | Network | AP | AP.5 | AR |
|---|---|---|---|---|
| Supervised | ResNet50 | 70.9 | 91.4 | 74.2 |
| PoseDual | ResNet50 | 73.9 (↑3.0) | 92.5 | 77.0 |
| Supervised | HRNetW48 | 77.2 | 93.5 | 79.9 |
| PoseDual | HRNetW48 | 79.2 (↑2.0) | 94.6 | 81.7 |

We use COCO TRAIN and WILD as the labeled and unlabeled datasets, respectively.

Pretrained Models

Download Links Google Drive

Environment

The code was developed using Python 3.7 on Ubuntu 16.04. NVIDIA GPUs are required.

Quick start

Installation

  1. Install PyTorch >= v1.2.0 following the official instructions.

  2. Clone this repo. We refer to the cloned directory as ${POSE_ROOT}.

  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Make libs:

    cd ${POSE_ROOT}/lib
    make
    
  5. Create the output (trained model) and log directories:

     mkdir output 
     mkdir log
    
  6. Download the PyTorch ImageNet-pretrained models from Google Drive. PoseDual (ResNet18) should load resnet18_5c_gluon_posedual as its pretrained weights for training.

  7. Download our pretrained models from Google Drive and arrange the models directory as follows:

    ${POSE_ROOT}
     `-- models
         `-- pytorch
             |-- imagenet
             |   |-- resnet18_5c_f3_posedual.pth
             |   |-- resnet18-5c106cde.pth
             |   |-- resnet50-19c8e357.pth
             |   |-- resnet101-5d3b4d8f.pth
             |   |-- resnet152-b121ed2d.pth
             |   |-- ......
             |-- pose_dual
                 |-- COCO_subset
                 |   |-- COCO1K_PoseDual.pth.tar
                 |   |-- COCO5K_PoseDual.pth.tar
                 |   |-- COCO10K_PoseDual.pth.tar
                 |   |-- ......
                 |-- COCO_COCOwild
                 |-- ......
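
Before training, it can help to confirm that the downloaded weights are where the tree above places them. The following is a minimal sketch rather than part of the repository: `find_missing` and `EXPECTED_MODEL_FILES` are hypothetical names, the listed file names are copied from the tree above, and the script assumes it is run from `${POSE_ROOT}`. Adjust the list to the models you actually use.

```python
from pathlib import Path

# Relative paths copied from the directory tree above (assumed layout).
EXPECTED_MODEL_FILES = [
    "models/pytorch/imagenet/resnet18_5c_f3_posedual.pth",
    "models/pytorch/imagenet/resnet18-5c106cde.pth",
    "models/pytorch/pose_dual/COCO_subset/COCO1K_PoseDual.pth.tar",
]


def find_missing(root, relative_paths):
    """Return the entries of relative_paths that are not files under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


# Check the current directory, assumed here to be ${POSE_ROOT}.
missing = find_missing(".", EXPECTED_MODEL_FILES)
if missing:
    print("Missing model files:")
    for rel in missing:
        print("  " + rel)
else:
    print("All expected model files were found.")
```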
    

Data preparation

For the COCO and MPII datasets, please refer to Simple Baseline to prepare them.
Download the person detection boxes and images for the COCO WILD (unlabeled) set. The structure looks like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   |-- person_keypoints_val2017.json
        |   `-- image_info_unlabeled2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        |   |-- COCO_test-dev2017_detections_AP_H_609_person.json
        |   `-- COCO_unlabeled2017_detections_person_faster_rcnn.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- ... 
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- ... 
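
Once the annotation files are in place, a quick sanity check is to count how many person instances actually carry keypoint labels. The sketch below is an illustration, not repository code: `count_keypoint_annotations` is a hypothetical helper, and it relies only on the standard COCO keypoint format, in which each entry of `annotations` has an `image_id` and a `num_keypoints` field.

```python
import json
from pathlib import Path


def count_keypoint_annotations(annotation_path):
    """Return (number of images, number of person instances) that have
    at least one labeled keypoint in a COCO-style annotation file."""
    with open(annotation_path, "r", encoding="utf-8") as f:
        coco = json.load(f)
    labeled = [a for a in coco.get("annotations", [])
               if a.get("num_keypoints", 0) > 0]
    image_ids = {a["image_id"] for a in labeled}
    return len(image_ids), len(labeled)


# Path taken from the tree above; the check is skipped if the file is absent.
val_path = Path("data/coco/annotations/person_keypoints_val2017.json")
if val_path.is_file():
    n_images, n_persons = count_keypoint_annotations(val_path)
    print(f"{n_persons} labeled persons in {n_images} images")
else:
    print(f"{val_path} not found; run this from the repository root")
```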

For the AIC data, please download it from AI Challenger 2017. The 2017 Train/Val sets are needed for keypoint training and validation. Please download the annotation files from AIC Annotations. The structure looks like this:

${POSE_ROOT}
|-- data
`-- |-- ai_challenger
    `-- |-- train
        |   |-- images
        |   `-- keypoint_train_annotation.json
        `-- validation
            |-- images
            |   |-- 0a00c0b5493774b3de2cf439c84702dd839af9a2.jpg
            |   |-- 0a0c466577b9d87e0a0ed84fc8f95ccc1197f4b0.jpg
            |   `-- ...
            |-- gt_valid.mat
            `-- keypoint_validation_annotation.json

Run

Training

1. Training Dual Networks (PoseDual) on COCO 1K labels

python pose_estimation/train.py \
    --cfg experiments/mix_coco_coco/res18/256x192_COCO1K_PoseDual.yaml

2. Training Dual Networks on COCO 1K labels with Joint Cutout

python pose_estimation/train.py \
    --cfg experiments/mix_coco_coco/res18/256x192_COCO1K_PoseDual_JointCutout.yaml

3. Training Dual Networks on COCO 1K labels with Distributed Data Parallel

python -m torch.distributed.launch --nproc_per_node=4  pose_estimation/train.py \
    --distributed --cfg experiments/mix_coco_coco/res18/256x192_COCO1K_PoseDual.yaml
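
When started through `torch.distributed.launch`, each worker process receives a `--local_rank` argument (unless `--use_env` is passed) and a `WORLD_SIZE` environment variable. The sketch below shows how an entry point could accept these alongside the `--distributed` and `--cfg` flags used above. It is an assumption about a typical setup, not a copy of this repository's `train.py`, and `parse_args` is a hypothetical function.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the flags used in the launch command above."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--cfg", required=True, help="path to the experiment YAML file")
    parser.add_argument("--distributed", action="store_true",
                        help="enable distributed data parallel training")
    parser.add_argument("--local_rank", type=int, default=0,
                        help="GPU index on this node, supplied by the launcher")
    return parser.parse_args(argv)


# Simulate what worker 2 of 4 would see on its command line.
args = parse_args([
    "--local_rank=2",
    "--distributed",
    "--cfg", "experiments/mix_coco_coco/res18/256x192_COCO1K_PoseDual.yaml",
])
world_size = int(os.environ.get("WORLD_SIZE", "1"))  # set by the launcher
print(f"local_rank={args.local_rank} distributed={args.distributed} "
      f"world_size={world_size}")
```

With `--nproc_per_node=4`, the launcher starts four such processes on the node, one per GPU, with local ranks 0 to 3.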

4. Training Single Networks (PoseCons) on COCO 1K labels

python pose_estimation/train.py \
    --cfg experiments/mix_coco_coco/res18/256x192_COCO1K_PoseCons.yaml

5. Training Dual Networks (PoseDual) with ResNet50 on COCO TRAIN + WILD

python pose_estimation/train.py \
    --cfg experiments/mix_coco_coco/res50/256x192_COCO_COCOunlabel_PoseDual_JointCut.yaml

Testing

6. Testing Dual Networks (PoseDual+COCO1K) on COCO VAL

python pose_estimation/valid.py \
    --cfg experiments/mix_coco_coco/res18/256x192_COCO1K_PoseDual.yaml

Citation

If you use our code or models in your research, please cite with:

@inproceedings{semipose,
  title={An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation},
  author={Xie, Rongchang and Wang, Chunyu and Zeng, Wenjun and Wang, Yizhou},
  booktitle={ICCV},
  year={2021}
}

Acknowledgement

The code is mainly based on Simple Baseline and HRNet. Some code comes from DarkPose. We thank the authors for their work.
