Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Last update: Dec 18, 2022

Overview

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting (official PyTorch implementation)

This paper, submitted to TIP, is an extension of the previous arXiv paper.

This project aims to

- provide a baseline for pedestrian attribute recognition.
- provide two new datasets, RAPzs and PETAzs, following the zero-shot pedestrian identity setting.
- provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.

This project provides

- DDP training, which is mainly used for multi-label classification.
- Training on all attributes while testing only on the "selected" attributes. The other attributes are excluded from evaluation because their proportion of positive samples is below a threshold, such as 0.01.
    1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
    2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
    3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes, respectively, are selected for performance evaluation.
    4. For PA100k, all attributes are selected for performance evaluation.
    - However, training on all attributes does not bring a consistent performance improvement across datasets.
- An EMA (exponential moving average) model.
- Transformer-based models, such as swin-transformer (with a huge performance improvement) and vit.
- Convenient dataset info files, such as dataset_all.pkl.
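
The "selected attributes" rule above can be sketched in a few lines. This is only an illustration under assumptions: the function name `select_attributes` and the toy label matrix are hypothetical, and the README does not say on which split the positive ratio is computed, so the sketch simply uses whatever label matrix it is given.

```python
import numpy as np


def select_attributes(labels: np.ndarray, threshold: float = 0.01) -> np.ndarray:
    """Return the column indices of attributes kept for evaluation.

    `labels` is a binary matrix of shape (num_samples, num_attributes).
    An attribute is kept when its proportion of positive samples is at
    least `threshold` (hypothetical helper, not taken from the repository).
    """
    positive_ratio = labels.mean(axis=0)
    return np.flatnonzero(positive_ratio >= threshold)


# Toy example: 4 samples, 4 attributes; attribute 1 is never positive.
labels = np.array([
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
])

# A large threshold is used here only so the toy data shows an exclusion.
selected = select_attributes(labels, threshold=0.25)

# Train on all columns of `labels`, but evaluate only on the selected ones.
eval_labels = labels[:, selected]
```

With this split, the model still receives a training signal from the rare attributes, while the reported metrics are not dominated by attributes that have almost no positive samples.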

Dataset Info

- PETA: Pedestrian Attribute Recognition At Far Distance [Paper][Project]
- PA100K [Paper][Github]
- RAP: A Richly Annotated Dataset for Pedestrian Attribute Recognition
    - v1 [Paper][Project]
    - v2 [Paper][Project]
- PETAzs & RAPzs: Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting [Paper][Project]

Performance

Pedestrian Attribute Recognition

| Datasets | Models | mA | Acc | Prec | Rec | F1 |
|---|---|---|---|---|---|---|
| PA100k | resnet50 | 80.21 | 79.15 | 87.79 | 87.01 | 87.40 |
| -- | resnet50* | 79.85 | 79.13 | 89.45 | 85.40 | 87.38 |
| -- | resnet50 + EMA | 81.97 | 80.20 | 88.06 | 88.17 | 88.11 |
| -- | bninception | 79.13 | 78.19 | 87.42 | 86.21 | 86.81 |
| -- | TresnetM | 74.46 | 68.72 | 79.82 | 80.71 | 80.26 |
| -- | swin_s | 82.19 | 80.35 | 87.85 | 88.51 | 88.18 |
| -- | vit_s | 79.40 | 77.61 | 86.41 | 86.22 | 86.32 |
| -- | vit_b | 81.01 | 79.38 | 87.60 | 87.49 | 87.55 |
| PETA | resnet50 | 83.96 | 78.65 | 87.08 | 85.62 | 86.35 |
| PETAzs | resnet50 | 71.43 | 58.69 | 74.41 | 69.82 | 72.04 |
| RAPv1 | resnet50 | 79.27 | 67.98 | 80.19 | 79.71 | 79.95 |
| RAPv2 | resnet50 | 78.52 | 66.09 | 77.20 | 80.23 | 78.68 |
| RAPzs | resnet50 | 71.76 | 64.83 | 78.75 | 76.60 | 77.66 |

- The resnet50* model is trained using the weighting function proposed by Tan et al. at AAAI 2020.
- Performance on PETAzs and RAPzs is based on the first version of PETAzs and RAPzs, as described in the paper.
- Experiments are conducted with an input size of (256, 192), so there may be minor differences from the results in the paper.
- The reported performance can be achieved at the first drop of the learning rate. We also take this model as the best model.
- Pretrained models are now available on Google Drive.
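
For readers unfamiliar with the five columns of the table above, the sketch below computes them with the definitions commonly used in the pedestrian attribute recognition literature: a label-based mean accuracy, and instance-based accuracy, precision, recall, and F1. It assumes, without confirmation from this README, that the reported numbers follow these definitions; the function name `par_metrics` and the toy arrays are hypothetical. The table reports percentages, so the returned values would be multiplied by 100.

```python
import numpy as np


def par_metrics(gt: np.ndarray, pred: np.ndarray, eps: float = 1e-20) -> dict:
    """Compute mA, Acc, Prec, Rec and F1 from binary matrices.

    Both inputs have shape (num_samples, num_attributes).
    """
    gt = np.asarray(gt).astype(bool)
    pred = np.asarray(pred).astype(bool)

    # Label-based mean accuracy: average of the true positive rate and
    # the true negative rate per attribute, then averaged over attributes.
    tp = (gt & pred).sum(axis=0)
    tn = (~gt & ~pred).sum(axis=0)
    pos = gt.sum(axis=0)
    neg = (~gt).sum(axis=0)
    ma = ((tp / (pos + eps) + tn / (neg + eps)) / 2).mean()

    # Instance-based metrics: computed per sample, then averaged.
    inter = (gt & pred).sum(axis=1)
    union = (gt | pred).sum(axis=1)
    acc = (inter / (union + eps)).mean()
    prec = (inter / (pred.sum(axis=1) + eps)).mean()
    rec = (inter / (gt.sum(axis=1) + eps)).mean()
    f1 = 2 * prec * rec / (prec + rec + eps)

    return {"mA": ma, "Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}


# Toy example: 4 samples, 2 attributes.
gt = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
pred = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
metrics = par_metrics(gt, pred)
```

The small `eps` only guards against division by zero, for example for a sample with no positive ground-truth attribute.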

Multi-label Classification

| Datasets | Models | mAP | CP | CR | CF1 | OP | OR | OF1 |
|---|---|---|---|---|---|---|---|---|
| COCO | resnet101 | 82.75 | 84.17 | 72.07 | 77.65 | 85.16 | 75.47 | 80.02 |
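
As a rough guide to the per-class (C) and overall (O) columns, the sketch below uses the definitions that are standard in multi-label classification: per-class precision and recall averaged over classes, and overall precision and recall pooled over all classes. It is an assumption that the reported numbers follow exactly these definitions; the function name `multilabel_prf` and the toy data are hypothetical, and mAP is left out because it needs ranked scores rather than binarized predictions.

```python
import numpy as np


def multilabel_prf(gt: np.ndarray, pred: np.ndarray, eps: float = 1e-20) -> dict:
    """Compute CP, CR, CF1, OP, OR and OF1 from binary matrices.

    Both inputs have shape (num_samples, num_classes).
    """
    gt = np.asarray(gt).astype(bool)
    pred = np.asarray(pred).astype(bool)

    tp = (gt & pred).sum(axis=0).astype(float)  # true positives per class
    n_pred = pred.sum(axis=0)                   # predicted positives per class
    n_gt = gt.sum(axis=0)                       # ground-truth positives per class

    # Per-class metrics: compute per class, then average over classes.
    cp = (tp / (n_pred + eps)).mean()
    cr = (tp / (n_gt + eps)).mean()
    cf1 = 2 * cp * cr / (cp + cr + eps)

    # Overall metrics: pool the counts over all classes first.
    op = tp.sum() / (n_pred.sum() + eps)
    or_ = tp.sum() / (n_gt.sum() + eps)
    of1 = 2 * op * or_ / (op + or_ + eps)

    return {"CP": cp, "CR": cr, "CF1": cf1, "OP": op, "OR": or_, "OF1": of1}


# Toy example: 4 samples, 3 classes.
gt = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 0], [1, 0, 0]])
pred = np.array([[1, 0, 0], [1, 1, 1], [0, 1, 0], [0, 0, 0]])
scores = multilabel_prf(gt, pred)
```

The per-class averages weight every class equally, while the overall values are dominated by frequent classes, which is why the two groups of columns can differ noticeably.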

Pretrained Models

Dependencies

python 3.7
pytorch 1.7.0
torchvision 0.8.2
cuda 10.1

Get Started

Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
Create a directory to download the above datasets into.
```
cd Rethinking_of_PAR
mkdir data
```

Prepare the datasets to have the following structure:

${project_dir}/data
    PETA
        images/
        PETA.mat
        dataset_all.pkl
        dataset_zs_run0.pkl
    PA100k
        data/
        dataset_all.pkl
    RAP
        RAP_dataset/
        RAP_annotation/
        dataset_all.pkl
    RAP2
        RAP_dataset/
        RAP_annotation/
        dataset_zs_run0.pkl
    COCO14
        train2014/
        val2014/
        ml_anno/
            category.json
            coco14_train_anno.pkl
            coco14_val_anno.pkl
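
Before training, it can help to check that the layout above is in place. The following is a small, hypothetical helper (not part of the repository) that mirrors the tree shown above and lists any missing entries; the names `EXPECTED` and `missing_entries` are made up for this sketch.

```python
from pathlib import Path

# Expected entries per dataset directory, copied from the tree above.
EXPECTED = {
    "PETA": ["images", "PETA.mat", "dataset_all.pkl", "dataset_zs_run0.pkl"],
    "PA100k": ["data", "dataset_all.pkl"],
    "RAP": ["RAP_dataset", "RAP_annotation", "dataset_all.pkl"],
    "RAP2": ["RAP_dataset", "RAP_annotation", "dataset_zs_run0.pkl"],
    "COCO14": [
        "train2014",
        "val2014",
        "ml_anno/category.json",
        "ml_anno/coco14_train_anno.pkl",
        "ml_anno/coco14_val_anno.pkl",
    ],
}


def missing_entries(data_root, datasets=None):
    """Return the expected paths (relative to `data_root`) that do not exist.

    `datasets` optionally restricts the check to some dataset names,
    since not every experiment needs every dataset.
    """
    root = Path(data_root)
    names = list(EXPECTED) if datasets is None else list(datasets)
    missing = []
    for name in names:
        for rel in EXPECTED[name]:
            if not (root / name / rel).exists():
                missing.append((Path(name) / rel).as_posix())
    return missing
```

For example, calling `missing_entries("data", ["PA100k"])` from the project directory returns an empty list once the PA100k files are in place.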

Train the baseline based on resnet50:
```
sh train.sh
```

Acknowledgements

The code is based on the repositories of Dangwei Li and Houjing Huang. Thanks to them for releasing their code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}

