Code repository for Self-supervised Structure-sensitive Learning, CVPR'17

Last update: Dec 29, 2022

Self-supervised Structure-sensitive Learning (SSL)

Ke Gong, Xiaodan Liang, Dongyu Zhang, Xiaohui Shen, Liang Lin, "Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing", CVPR 2017.

Introduction

SSL is a state-of-the-art deep learning method for human parsing built on top of Caffe. This self-supervised structure-sensitive learning approach imposes human pose structures on the parsing results without resorting to extra supervision (i.e., there is no need to label human joints specifically for model training). The self-supervised learning framework can be injected into any advanced neural network to help incorporate rich high-level knowledge about human joints from a global perspective and improve the parsing results.
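As a rough illustration of the idea only, and not the released Caffe implementation: one way to obtain pose-like structure from parsing labels alone is to treat the centroid of each labeled part region as a pseudo-joint and measure how far the predicted centroids lie from the ground-truth ones. The function names, the centroid definition, the convention that label 0 is background, and the plain mean Euclidean distance are all assumptions made for this sketch.

```python
import math


def region_centroids(label_map):
    """Map each non-zero label to the (row, col) centroid of its pixels.

    label_map is a 2-D list of integer part labels.
    Assumption for this sketch: label 0 is background and is skipped.
    """
    sums = {}
    for r, row in enumerate(label_map):
        for c, label in enumerate(row):
            if label == 0:
                continue
            acc = sums.setdefault(label, [0.0, 0.0, 0])
            acc[0] += r
            acc[1] += c
            acc[2] += 1
    return {k: (v[0] / v[2], v[1] / v[2]) for k, v in sums.items()}


def structure_distance(pred, gt):
    """Mean Euclidean distance between centroids of parts present in both maps."""
    p = region_centroids(pred)
    g = region_centroids(gt)
    shared = set(p) & set(g)
    if not shared:
        return 0.0
    total = sum(math.hypot(p[k][0] - g[k][0], p[k][1] - g[k][1]) for k in shared)
    return total / len(shared)


# Hypothetical toy maps: part 1 is shifted two columns to the right.
gt_map = [[1, 1, 0, 0],
          [1, 1, 0, 0]]
pred_map = [[0, 0, 1, 1],
            [0, 0, 1, 1]]
shift = structure_distance(pred_map, gt_map)
```

A distance of this kind needs no joint annotations, since both sets of pseudo-joints come from parsing maps; how the released model actually folds such a term into training is defined by its prototxt files and the paper, not by this sketch.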

This distribution provides a publicly available implementation of the key model ingredients reported in our paper, which was accepted at CVPR 2017.

We have also introduced a Joint Human Parsing and Pose Estimation Network (JPPNet), which was accepted by T-PAMI 2018. (Paper and Code)

Please consult and consider citing the following papers:

@InProceedings{Gong_2017_CVPR,
  author = {Gong, Ke and Liang, Xiaodan and Zhang, Dongyu and Shen, Xiaohui and Lin, Liang},
  title = {Look Into Person: Self-Supervised Structure-Sensitive Learning and a New Benchmark for Human Parsing},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {July},
  year = {2017}
}
@article{liang2018look,
  title={Look into Person: Joint Body Parsing \& Pose Estimation Network and a New Benchmark},
  author={Liang, Xiaodan and Gong, Ke and Shen, Xiaohui and Lin, Liang},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2018},
  publisher={IEEE}
}

Look into Person (LIP) Dataset

SSL is trained and evaluated on our LIP dataset for human parsing. Please check the dataset page for more details. The dataset is also available on Google Drive and Baidu Drive.

Pre-trained models

We have released our best-performing trained models on Google Drive and Baidu Drive.

Train and test

Download the LIP dataset or prepare your own data.
Put the images (.jpg) and segmentations (.png) into ssl/human/data/images and ssl/human/data/labels, respectively.
Put the train, val, and test lists into ssl/human/list. Each split has one list of paths and one list of ids (e.g., train.txt and train_id.txt).
Download the pre-trained model and put it into ssl/human/model/attention/. You can also refer to DeepLab for more models.
Set up init.caffemodel before training and test.caffemodel before testing. You can simply use a soft link.
The prototxt files for the network configuration are saved in ssl/human/config.
In run_human.sh, set the values of RUN_TRAIN and RUN_TEST to train or test the model.
After you run TEST, the computed features are saved in ssl/human/features. Run the provided MATLAB script show.m to generate visualizable results, then run the Python script test_human.py to evaluate the performance.
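For readers who want to sanity-check results independently of the last step, the following is a minimal sketch of common segmentation scores (pixel accuracy, mean per-class accuracy, mean IoU) computed from a confusion matrix. It is not the repository's test_human.py; the function names, the choice of metrics, and the rule of skipping labels outside the class range are assumptions made for this illustration.

```python
def confusion_matrix(pred, gt, num_classes):
    """Count pixels: rows index the ground-truth class, columns the predicted class.

    pred and gt are 2-D lists of integer labels with the same shape.
    Assumption: labels outside [0, num_classes) are skipped (e.g. "ignore" pixels).
    """
    hist = [[0] * num_classes for _ in range(num_classes)]
    for p_row, g_row in zip(pred, gt):
        for p, g in zip(p_row, g_row):
            if 0 <= g < num_classes and 0 <= p < num_classes:
                hist[g][p] += 1
    return hist


def parsing_scores(hist):
    """Return pixel accuracy, mean class accuracy, and mean IoU from a confusion matrix."""
    n = len(hist)
    total = sum(sum(row) for row in hist)
    correct = sum(hist[i][i] for i in range(n))
    accs, ious = [], []
    for i in range(n):
        gt_i = sum(hist[i])                          # pixels whose true class is i
        pred_i = sum(hist[j][i] for j in range(n))   # pixels predicted as class i
        union = gt_i + pred_i - hist[i][i]
        if gt_i:
            accs.append(hist[i][i] / gt_i)
        if union:
            ious.append(hist[i][i] / union)
    return {
        "pixel_acc": correct / total if total else 0.0,
        "mean_acc": sum(accs) / len(accs) if accs else 0.0,
        "mean_iou": sum(ious) / len(ious) if ious else 0.0,
    }


# Hypothetical 2x2 example with two classes.
gt_labels = [[0, 0],
             [1, 1]]
pred_labels = [[0, 1],
               [1, 1]]
scores = parsing_scores(confusion_matrix(pred_labels, gt_labels, num_classes=2))
```

To score a whole validation split, the per-image confusion matrices can be summed element-wise before calling parsing_scores, so that every pixel in the split carries equal weight.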

Related work

Joint Body Parsing & Pose Estimation Network JPPNet, T-PAMI2018
Instance-level Human Parsing via Part Grouping Network PGN, ECCV2018
Graphonomy: Universal Human Parsing via Graph Transfer Learning Graphonomy, CVPR2019

Owner

Clay Gong
