Fashion Landmark Estimation with HRNet

Last update: Dec 26, 2022

Related tags

Overview

HRNet for Fashion Landmark Estimation

(Modified from deep-high-resolution-net.pytorch)

Introduction

This code applies the HRNet (Deep High-Resolution Representation Learning for Human Pose Estimation) onto fashion landmark estimation task using the DeepFashion2 dataset. HRNet maintains high-resolution representations throughout the forward path. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise.

Please note that every image in DeepFashion2 contains multiple fashion items, while our model assumes that there exists only one item in each image. Therefore, what we feed into the HRNet is not the original image but the cropped ones provided by a detector. In experiments, one can either use the ground truth bounding box annotation to generate the input data or use the output of a detecter.

Main Results

Landmark Estimation Performance on DeepFashion2 Test set

We won the third place in the "DeepFashion2 Challenge 2020 - Track 1 Clothes Landmark Estimation" competition.

Landmark Estimation Performance on DeepFashion2 Validation Set

Arch	BBox Source	AP	Ap .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
pose_hrnet	Detector	0.579	0.793	0.658	0.460	0.581	0.706	0.939	0.784	0.548	0.708
pose_hrnet	GT	0.702	0.956	0.801	0.579	0.703	0.740	0.965	0.827	0.592	0.741

Quick start

Installation

Install pytorch >= v1.2 following official instruction. Note that if you use pytorch's version < v1.0.0, you should follow the instruction at https://github.com/Microsoft/human-pose-estimation.pytorch to disable cudnn's implementations of BatchNorm layer. We encourage you to use higher pytorch's version(>=v1.0.0)
Clone this repo, and we'll call the directory that you cloned as ${POSE_ROOT}.
Install dependencies:
```
pip install -r requirements.txt
```
Make libs:
```
cd ${POSE_ROOT}/lib
make
```

Init output(training model output directory) and log(tensorboard log directory) directory:

mkdir output 
mkdir log

Your directory tree should look like this:

${POSE_ROOT}
|-- lib
|-- tools 
|-- experiments
|-- models
|-- data
|-- log
|-- output
|-- README.md
`-- requirements.txt

Download pretrained models from our Onedrive Cloud Storage

Data preparation

Our experiments were conducted on DeepFashion2, clone this repo, and we'll call the directory that you cloned as ${DF2_ROOT}.

1) Download the dataset

Extract the dataset under ${POSE_ROOT}/data.

2) Convert annotations into coco-type

The above code repo provides a script to convert annotations into coco-type.

We uploaded our converted annotation file onto OneDrive named as train/val-coco_style.json. We also made truncated json files such as train-coco_style-32.json meaning the first 32 samples in the dataset to save the loading time during development period.

3) Install the deepfashion_api

Enter ${DF2_ROOT}/deepfashion2_api/PythonAPI and run

python setup.py install

Note that the deepfashion2_api is modified from the cocoapi without changing the package name. Therefore, conflicts occur if you try to install this package when you have installed the original cocoapi in your computer. We provide two feasible solutions: 1) run our code in a virtualenv 2) use the deepfashion2_api as a local pacakge. Also note that deepfashion2_api is different with cocoapi mainly in the number of classes and the values of standard variations for keypoints.

At last the directory should look like this:

${POSE_ROOT}
|-- data
`-- |-- deepfashion2
    `-- |-- train
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        |-- validation
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- val-coco_style.json             (converted annotation file)
        |   `-- val-coco_style-64.json        (truncated for fast debugging)
        `-- json_for_test
            `-- keypoints_test_information.json

Training and Testing

Note that the GPUS parameter in the yaml config file is deprecated. To select GPUs, use the environment varaible:

 export CUDA_VISIBLE_DEVICES=1

Testing on DeepFashion2 dataset with BBox from ground truth using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.USE_GT_BBOX True

Testing on DeepFashion2 dataset with BBox from a detector using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.DEEPFASHION2_BBOX_FILE data/bbox_result_val.pkl \

Training on DeepFashion2 dataset using pretrained models:

python tools/train.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
     MODEL.PRETRAINED models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth

Other options

python tools/test.py \
    ... \
    DATASET.MINI_DATASET True \ # use a subset of the annotation to save loading time
    TAG 'experiment description' \ # this info will appear in the output directory name
    WORKERS 4 \ # num_of_worker for the dataloader
    TEST.BATCH_SIZE_PER_GPU 8 \
    TRAIN.BATCH_SIZE_PER_GPU 8 \

OneDrive Cloud Storage

OneDrive

We provide the following files:

Model checkpoint files
Converted annotation files in coco-type
Bounding box results from our self-implemented detector in a pickle file.

hrnet-for-fashion-landmark-estimation.pytorch
|-- models
|   `-- pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth
|
|-- data
|   |-- bbox_result_val.pkl
|   |
`-- |-- deepfashion2
    `---|-- train
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        `-- validation
            |-- val-coco_style.json             (converted annotation file)
            `-- val-coco_style-64.json        (truncated for fast debugging)

Discussion

Experiment Configuration

For the regression target of keypoint heatmaps, we tuned the standard deviation value sigma and finally set it to 2.
During training, we found that the data augmentation from the original code was too intensive which makes the training process unstable. We weakened the augmentation parameters and observed performance gain.
Due to the imbalance of classes in DeepFashion2 dataset, the model's performance on different classes varies a lot. Therefore, we adopted a weighted sampling strategy rather than the naive random shuffling strategy, and observed performance gain.
We expermented with the value of weight decay, and found that either 1e-4 or 1e-5 harms the performance. Therefore, we simply set weight decay to 0.

Fashion Landmark Estimation with HRNet

Related tags

Overview

HRNet for Fashion Landmark Estimation

Introduction

Main Results

Landmark Estimation Performance on DeepFashion2 Test set

Landmark Estimation Performance on DeepFashion2 Validation Set

Quick start

Installation

Data preparation

1) Download the dataset

2) Convert annotations into coco-type

3) Install the deepfashion_api

Training and Testing

OneDrive Cloud Storage

Discussion

Experiment Configuration

Owner

SVIP Lab

This repository contains implementations and illustrative code to accompany DeepMind publications

Code Repository for The Kaggle Book, Published by Packt Publishing

A deep-learning pipeline for segmentation of ambiguous microscopic images.

Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Object Detection Projekt in GKI WS2021/22

A public available dataset for road boundary detection in aerial images

Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

TUPÃ was developed to analyze electric field properties in molecular simulations

A fast, dataset-agnostic, deep visual search engine for digital art history

Project to create an open-source 6 DoF input device

[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

Fuzzer for Linux Kernel Drivers

Sample and Computation Redistribution for Efficient Face Detection

Tello Drone Trajectory Tracking

Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

Tensorflow port of a full NetVLAD network

small collection of functions for neural networks

1st place solution in CCF BDCI 2021 ULSEG challenge

EfficientDet (Scalable and Efficient Object Detection) implementation in Keras and Tensorflow

Sentiment analysis translations of the Bhagavad Gita