A PyTorch toolkit for 2D Human Pose Estimation.

Overview

PyTorch-Pose

screenshot

PyTorch-Pose is a PyTorch implementation of the general pipeline for 2D single human pose estimation. The aim is to provide the interface of the training/inference/evaluation, and the dataloader with various data augmentation options for the most popular human pose databases (e.g., the MPII human pose, LSP and FLIC).

Some codes for data preparation and augmentation are brought from the Stacked hourglass network. Thanks to the original author.

Update: this repository is compatible with PyTorch 0.4.1/1.0 now!

Features

  • Multi-thread data loading
  • Multi-GPU training
  • Logger
  • Training/testing results visualization

Installation

  1. PyTorch (>= 0.4.1): Please follow the installation instruction of PyTorch. Note that the code is developed with Python2 and has not been tested with Python3 yet.

  2. Clone the repository with submodule

    git clone --recursive https://github.com/bearpaw/pytorch-pose.git
    
  3. Create a symbolic link to the images directory of the MPII dataset:

    ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
    

    For training/testing on COCO, please refer to COCO Readme.

  1. Download annotation file:

Usage

Please refer to TRAINING.md for detailed training recipes!

Testing

You may download our pretrained models (e.g., 2-stack hourglass model) for a quick start.

Run the following command in terminal to evaluate the model on MPII validation split (The train/val split is from Tompson et al. CVPR 2015).

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 2 --blocks 1 --checkpoint checkpoint/mpii/hg_s2_b1 --resume checkpoint/mpii/hg_s2_b1/model_best.pth.tar -e -d
  • -a specifies a network architecture
  • --resume will load the weight from a specific model
  • -e stands for evaluation only
  • -d will visualize the network output. It can be also used during training

The result will be saved as a .mat file (preds_valid.mat), which is a 2958x16x2 matrix, in the folder specified by --checkpoint.

Evaluate the [email protected] score

Evaluate with MATLAB

You may use the matlab script evaluation/eval_PCKh.m to evaluate your predictions. The evaluation code is ported from Tompson et al. CVPR 2015.

The results ([email protected] score) trained using this code is reported in the following table.

Model Head Shoulder Elbow Wrist Hip Knee Ankle Mean
hg_s2_b1 (last) 95.80 94.57 88.12 83.31 86.24 80.88 77.44 86.76
hg_s2_b1 (best) 95.87 94.68 88.27 83.64 86.29 81.20 77.70 86.95
hg_s8_b1 (last) 96.79 95.19 90.08 85.32 87.48 84.26 80.73 88.64
hg_s8_b1 (best) 96.79 95.28 90.27 85.56 87.57 84.3 81.06 88.78

Training / validation curve is visualized as follows.

curve

Evaluate with Python

You may also evaluate the result by running python evaluation/eval_PCKh.py to evaluate the predictions. It will produce exactly the same result as that of the MATLAB. Thanks @sssruhan1 for the contribution.

Training

Run the following command in terminal to train an 8-stack of hourglass network on the MPII human pose dataset.

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 8 --blocks 1 --checkpoint checkpoint/mpii/hg8 -j 4

Here,

  • CUDA_VISIBLE_DEVICES=0 identifies the GPU devices you want to use. For example, use CUDA_VISIBLE_DEVICES=0,1 if you want to use two GPUs with ID 0 and 1.
  • -j specifies how many workers you want to use for data loading.
  • --checkpoint specifies where you want to save the models, the log and the predictions to.

Miscs

Supported dataset

Supported models

Contribute

Please create a pull request if you want to contribute.

Owner
Wei Yang
NVIDIA Robotics Research Lab
Wei Yang
Subdivision-based Mesh Convolutional Networks

Subdivision-based Mesh Convolutional Networks The official implementation of SubdivNet in our paper, Subdivion-based Mesh Convolutional Networks Requi

Zheng-Ning Liu 181 Dec 28, 2022
LSTM model trained on a small dataset of 3000 names written in PyTorch

LSTM model trained on a small dataset of 3000 names. Model generates names from model by selecting one out of top 3 letters suggested by model at a time until an EOS (End Of Sentence) character is no

Sahil Lamba 1 Dec 20, 2021
3D dataset of humans Manipulating Objects in-the-Wild (MOW)

MOW dataset [Website] This repository maintains our 3D dataset of humans Manipulating Objects in-the-Wild (MOW). The dataset contains 512 images in th

Zhe Cao 28 Nov 06, 2022
Tensorflow AffordanceNet and AffContext implementations

AffordanceNet and AffContext This is tensorflow AffordanceNet and AffContext implementations. Both are implemented and tested with tensorflow 2.3. The

Beatriz Pérez 6 Dec 01, 2022
SuRE Evaluation: A Supplementary Material

SuRE Evaluation: A Supplementary Material This repository contains supplementary material regarding the evaluations presented in the paper Visual Expl

NYU Visualization Lab 0 Dec 14, 2021
Artificial Neural network regression model to predict the energy output in a combined cycle power plant.

Energy_Output_Predictor Artificial Neural network regression model to predict the energy output in a combined cycle power plant. Abstract Energy outpu

1 Feb 11, 2022
Official PyTorch implementation of the paper "TEMOS: Generating diverse human motions from textual descriptions"

TEMOS: TExt to MOtionS Generating diverse human motions from textual descriptions Description Official PyTorch implementation of the paper "TEMOS: Gen

Mathis Petrovich 187 Dec 27, 2022
Official code for article "Expression is enough: Improving traffic signal control with advanced traffic state representation"

1 Introduction Official code for article "Expression is enough: Improving traffic signal control with advanced traffic state representation". The code s

Liang Zhang 10 Dec 10, 2022
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Introduction Code and data for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning". We cons

Pan Lu 81 Dec 27, 2022
Unsupervised Learning of Video Representations using LSTMs

Unsupervised Learning of Video Representations using LSTMs Code for paper Unsupervised Learning of Video Representations using LSTMs by Nitish Srivast

Elman Mansimov 341 Dec 20, 2022
Joint Gaussian Graphical Model Estimation: A Survey

Joint Gaussian Graphical Model Estimation: A Survey Test Models Fused graphical lasso [1] Group graphical lasso [1] Graphical lasso [1] Doubly joint s

Koyejo Lab 1 Aug 10, 2022
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation, CVPR2022

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation Paper Links: TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentati

Hust Visual Learning Team 253 Dec 21, 2022
2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案

2020CCF-NER 2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案 bert base + flat + crf + fgm + swa + pu learning策略 + clue数据集 = test1单模0.906 词向量

67 Oct 19, 2022
Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation"

Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation", if you find this useful and use

57 Dec 27, 2022
TransPrompt - Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification

TransPrompt This code is implement for our EMNLP 2021's paper 《TransPrompt:Towards an Automatic Transferable Prompting Framework for Few-shot Text Cla

WangJianing 23 Dec 21, 2022
Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised de

Hang 94 Dec 25, 2022
Repository of Vision Transformer with Deformable Attention

Vision Transformer with Deformable Attention This repository contains the code for the paper Vision Transformer with Deformable Attention [arXiv]. Int

410 Jan 03, 2023
A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

Biomedical Computer Vision @ Uniandes 52 Dec 19, 2022
a spacial-temporal pattern detection system for home automation

Argos a spacial-temporal pattern detection system for home automation. Based on OpenCV and Tensorflow, can run on raspberry pi and notify HomeAssistan

Angad Singh 133 Jan 05, 2023