The official code for the ICCV2021 Oral presentation "Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework"

Last update: Dec 26, 2022

Overview

P2PNet (ICCV2021 Oral Presentation)

This repository contains the official PyTorch implementation of P2PNet, as described in "Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework".

A brief introduction to P2PNet can be found at 机器之心 (almosthuman).

The code is tested with PyTorch 1.5.0 and may not run with other versions.

Visualized demos for P2PNet

The network

The overall architecture of P2PNet. Built upon VGG16, it first introduces an upsampling path to obtain a fine-grained feature map. It then uses two branches to simultaneously predict a set of point proposals and their confidence scores.
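
To make the two-branch output concrete, the sketch below shows one way predicted proposals could be turned into a crowd count: keep the proposals whose confidence exceeds a threshold and count them. The function name `select_points` and the 0.5 threshold are assumptions for illustration, not values stated in this README.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def select_points(proposals: List[Point], scores: List[float],
                  threshold: float = 0.5) -> List[Point]:
    """Keep proposals whose confidence exceeds `threshold` (hypothetical default)."""
    if len(proposals) != len(scores):
        raise ValueError("proposals and scores must have the same length")
    return [p for p, s in zip(proposals, scores) if s > threshold]


# Toy example: three (x, y) proposals with their confidence scores.
proposals = [(10.0, 20.0), (35.5, 42.0), (80.0, 15.0)]
scores = [0.91, 0.12, 0.67]
kept = select_points(proposals, scores)
predicted_count = len(kept)  # the count is the number of surviving points
```

In this sketch the same set of kept points serves both tasks: its size is the count, and its coordinates are the localization result.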

Comparison with state-of-the-art methods

P2PNet achieves state-of-the-art performance on several challenging datasets with various densities.

Methods	Venue	SHTechPartA MAE/MSE	SHTechPartB MAE/MSE	UCF_CC_50 MAE/MSE	UCF_QNRF MAE/MSE
CAN	CVPR'19	62.3/100.0	7.8/12.2	212.2/243.7	107.0/183.0
Bayesian+	ICCV'19	62.8/101.8	7.7/12.7	229.3/308.2	88.7/154.8
S-DCNet	ICCV'19	58.3/95.0	6.7/10.7	204.2/301.3	104.4/176.1
SANet+SPANet	ICCV'19	59.4/92.5	6.5/9.9	232.6/311.7	-/-
DUBNet	AAAI'20	64.6/106.8	7.7/12.5	243.8/329.3	105.6/180.5
SDANet	AAAI'20	63.6/101.8	7.8/10.2	227.6/316.4	-/-
ADSCNet	CVPR'20	55.4/97.7	6.4/11.3	198.4/267.3	71.3/132.5
ASNet	CVPR'20	57.78/90.13	-/-	174.84/251.63	91.59/159.71
AMRNet	ECCV'20	61.59/98.36	7.02/11.00	184.0/265.8	86.6/152.2
AMSNet	ECCV'20	56.7/93.4	6.7/10.2	208.4/297.3	101.8/163.2
DM-Count	NeurIPS'20	59.7/95.7	7.4/11.8	211.0/291.5	85.6/148.3
Ours	-	52.74/85.06	6.25/9.9	172.72/256.18	85.32/154.5
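
For reference, the MAE and MSE columns can be computed from per-image predicted and ground-truth counts as sketched below. This assumes the convention common in crowd-counting papers, where "MSE" denotes the square root of the mean squared error. The function name `counting_errors` and the toy counts are illustrative and not part of this repository.

```python
import math
from typing import Sequence, Tuple


def counting_errors(pred: Sequence[float], gt: Sequence[float]) -> Tuple[float, float]:
    """Return (MAE, MSE) over a set of images; MSE is the root mean squared error."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(pred)
    mae = sum(abs(p - g) for p, g in zip(pred, gt)) / n
    mse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n)
    return mae, mse


# Toy example with three images (made-up counts).
mae, mse = counting_errors([105, 48, 310], [100, 50, 300])
```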

Comparison on the NWPU-Crowd dataset.

Methods	MAE[O]	MSE[O]	MAE[L]	MAE[S]
MCNN	232.5	714.6	220.9	1171.9
SANet	190.6	491.4	153.8	716.3
CSRNet	121.3	387.8	112.0	522.7
PCC-Net	112.3	457.0	111.0	777.6
CANNet	110.0	495.3	102.3	718.3
Bayesian+	105.4	454.2	115.8	750.5
S-DCNet	90.2	370.5	82.9	567.8
DM-Count	88.4	388.6	88.0	498.0
Ours	77.44	362	83.28	553.92

The overall performance for both counting and localization.

nAP$_{\delta}$	SHTechPartA	SHTechPartB	UCF_CC_50	UCF_QNRF	NWPU_Crowd
$\delta=0.05$	10.9%	23.8%	5.0%	5.9%	12.9%
$\delta=0.25$	70.3%	84.2%	54.5%	55.4%	71.3%
$\delta=0.50$	90.1%	94.1%	88.1%	83.2%	89.1%
$\delta=\{0.05:0.05:0.50\}$	64.4%	76.3%	54.3%	53.1%	65.0%

Comparison for the localization performance in terms of F1-Measure on NWPU.

Method	F1-Measure	Precision	Recall
FasterRCNN	0.068	0.958	0.035
TinyFaces	0.567	0.529	0.611
RAZ	0.599	0.666	0.543
Crowd-SDNet	0.637	0.651	0.624
PDRNet	0.653	0.675	0.633
TopoCount	0.692	0.683	0.701
D2CNet	0.700	0.741	0.662
Ours	0.712	0.729	0.695
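
The F1-Measure column is the harmonic mean of the Precision and Recall columns. The short check below recomputes it for two rows of the table; the helper name `f1_measure` is illustrative.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_ours = f1_measure(0.729, 0.695)    # "Ours" row
f1_frcnn = f1_measure(0.958, 0.035)   # "FasterRCNN" row
print(round(f1_ours, 3), round(f1_frcnn, 3))  # 0.712 0.068
```

The FasterRCNN row shows why F1 is reported alongside precision: a very high precision cannot compensate for a very low recall.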

Installation

Clone this repo into a directory named P2PNET_ROOT.
Organize your datasets as required.
Install the Python dependencies. We use Python 3.6.5 and PyTorch 1.5.0:

pip install -r requirements.txt

Organize the counting dataset

We use a list file to collect all the images and their ground-truth annotations in a counting dataset. When your dataset is organized as recommended below, each line of the list file pairs an image path with its annotation path:

train/scene01/img01.jpg train/scene01/img01.txt
train/scene01/img02.jpg train/scene01/img02.txt
...
train/scene02/img01.jpg train/scene02/img01.txt

Dataset structures:

DATA_ROOT/
        |->train/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->test/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->train.list
        |->test.list

DATA_ROOT is your path containing the counting datasets.
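
A list file in this format could be generated with a small script such as the one below. It is a minimal sketch that assumes each annotation file sits next to its image and shares its file stem, as in the example lines above. The helper names `build_list` and `write_list` and the set of image extensions are assumptions, not part of this repository.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def build_list(data_root: Path, split: str) -> List[str]:
    """Return 'image annotation' lines for one split, relative to data_root."""
    lines = []
    for img in sorted((data_root / split).rglob("*")):
        if img.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        ann = img.with_suffix(".txt")
        if ann.exists():  # skip images that have no annotation file
            lines.append(f"{img.relative_to(data_root).as_posix()} "
                         f"{ann.relative_to(data_root).as_posix()}")
    return lines


def write_list(data_root: Path, split: str) -> Path:
    """Write <split>.list into data_root and return its path."""
    out = data_root / f"{split}.list"
    out.write_text("\n".join(build_list(data_root, split)) + "\n")
    return out
```

Calling `write_list(Path(DATA_ROOT), "train")` and `write_list(Path(DATA_ROOT), "test")` would then produce the two list files shown in the tree above.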

Annotations format

Each image has a single txt file containing one annotated point per line. Note that pixel coordinates are zero-indexed. The expected format of each line is:

x1 y1
x2 y2
...
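
An annotation file in this format can be read with a few lines of Python. The sketch below assumes whitespace-separated coordinates, as shown above; the function name `load_points` is hypothetical and not taken from this repository.

```python
from pathlib import Path
from typing import List, Tuple


def load_points(txt_path: Path) -> List[Tuple[float, float]]:
    """Read one (x, y) point per line from an annotation file."""
    points = []
    for line in Path(txt_path).read_text().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        points.append((float(parts[0]), float(parts[1])))
    return points
```

The ground-truth count of an image is then simply `len(load_points(path))`.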

Training

The network can be trained using the train.py script. To train on SHTechPartA, use:

CUDA_VISIBLE_DEVICES=0 python train.py --data_root $DATA_ROOT \
    --dataset_file SHHA \
    --epochs 3500 \
    --lr_drop 3500 \
    --output_dir ./logs \
    --checkpoints_dir ./weights \
    --tensorboard_dir ./logs \
    --lr 0.0001 \
    --lr_backbone 0.00001 \
    --batch_size 8 \
    --eval_freq 1 \
    --gpu_id 0

By default, a periodic evaluation will be conducted on the validation set.

Testing

A trained model for SHTechPartA (with an MAE of 51.96) is available in "./weights". Run the following command to launch a visualization demo:

CUDA_VISIBLE_DEVICES=0 python run_test.py --weight_path ./weights/SHTechA.pth --output_dir ./logs/

Acknowledgements

Parts of the code are borrowed from the C^3 Framework.
We refer to DETR to implement our matching strategy.

Citing P2PNet

If you find P2PNet useful in your project, please consider citing us:

@inproceedings{song2021rethinking,
  title={Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework},
  author={Song, Qingyu and Wang, Changan and Jiang, Zhengkai and Wang, Yabiao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Wu, Yang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Related works from Tencent Youtu Lab

[AAAI2021] To Choose or to Fuse? Scale Selection for Crowd Counting. (paper link & codes)
[ICCV2021] Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting. (paper link & codes)

Owner

Tencent YouTu Research
