PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 24, 2022

News

Python3 implementations of PSENet [1], PAN [2] and PAN++ [3] are released at https://github.com/whai362/pan_pp.pytorch.

[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.
[2] W. Wang, E. Xie, X. Song, Y. Zang, W. Wang, T. Lu, G. Yu, and C. Shen. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proc. IEEE Int. Conf. Comp. Vis., pages 8440–8449, 2019.
[3] Paper is in preparation.

Shape Robust Text Detection with Progressive Scale Expansion Network

Requirements

Python 2.7
PyTorch v0.4.1+
pyclipper
Polygon2
OpenCV 3.4 (for the C++ version of PSE)
opencv-python 3.4

Introduction

Progressive Scale Expansion Network (PSENet) is a text detector that performs well on arbitrary-shaped text in natural scene images.
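
This README does not spell out the expansion step itself. As described in the paper [1], the network predicts several text "kernels" per image, from smallest to largest. Post-processing labels the connected components of the smallest kernel, then grows each label breadth-first into the next larger kernel, with contested pixels going to whichever label reaches them first. The following is a minimal pure-Python sketch of that idea, not the repository's implementation. The function names `label_components` and `progressive_scale_expansion`, and the list-of-lists mask format, are assumptions made for illustration.

```python
from collections import deque

# 4-connected neighbourhood used for both labelling and expansion.
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected components of a binary grid (hypothetical helper)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Expand labels from the smallest kernel through each larger kernel.

    `kernels` is a list of binary masks ordered from smallest to largest.
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel labelled so far.
        queue = deque((y, x) for y in range(h) for x in range(w)
                      if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    # First label to arrive keeps the pixel.
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels


# Two instances that touch in the largest kernel but not in the smallest.
smallest = [[1, 0, 0, 0, 1]]
largest = [[1, 1, 1, 1, 1]]
print(progressive_scale_expansion([smallest, largest]))  # [[1, 1, 1, 2, 2]]
```

Because the labels come from the smallest kernel, two text instances that merge in the largest kernel still end up as separate regions.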

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train_ic15.py

Testing

CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]

Evaluation scripts for ICDAR 2015 and SCUT-CTW1500

cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh

Performance (new version of the paper)

ICDAR 2015

Method	Extra Data	Precision (%)	Recall (%)	F-measure (%)	FPS (1080Ti)	Model
PSENet-1s (ResNet50)	-	81.49	79.68	80.57	1.6	baiduyun(extract code: rxti); OneDrive
PSENet-1s (ResNet50)	pretrain on IC17 MLT	86.92	84.5	85.69	1.6	baiduyun(extract code: aieo); OneDrive
PSENet-4s (ResNet50)	pretrain on IC17 MLT	86.1	83.77	84.92	3.8	baiduyun(extract code: aieo); OneDrive
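
The F-measure column is the harmonic mean of precision and recall. As a sanity check, the short sketch below recomputes it for the three ICDAR 2015 rows above. The helper name `f_measure` is illustrative and not part of the repository.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the ICDAR 2015 table.
rows = [
    (81.49, 79.68, 80.57),
    (86.92, 84.50, 85.69),
    (86.10, 83.77, 84.92),
]

for p, r, reported in rows:
    print("P={:.2f} R={:.2f} F={:.2f} (reported {:.2f})".format(
        p, r, f_measure(p, r), reported))
```

Each recomputed value matches the reported one to two decimal places.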

SCUT-CTW1500

Method	Extra Data	Precision (%)	Recall (%)	F-measure (%)	FPS (1080Ti)	Model
PSENet-1s (ResNet50)	-	80.57	75.55	78.0	3.9	baiduyun(extract code: ksv7); OneDrive
PSENet-1s (ResNet50)	pretrain on IC17 MLT	84.84	79.73	82.2	3.9	baiduyun(extract code: z7ac); OneDrive
PSENet-4s (ResNet50)	pretrain on IC17 MLT	82.09	77.84	79.9	8.4	baiduyun(extract code: z7ac); OneDrive

Performance (old version of the paper)

ICDAR 2015 (training with ICDAR 2017 MLT)

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	87.98	83.87	85.88
PSENet-2s (ResNet152)	89.30	85.22	87.21
PSENet-1s (ResNet152)	88.71	85.51	87.08

ICDAR 2017 MLT

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	75.98	67.56	71.52
PSENet-2s (ResNet152)	76.97	68.35	72.40
PSENet-1s (ResNet152)	77.01	68.40	72.45

SCUT-CTW1500

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	80.49	78.13	79.29
PSENet-2s (ResNet152)	81.95	79.30	80.60
PSENet-1s (ResNet152)	82.50	79.89	81.17

ICPR MTWI 2018 Challenge 2

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-1s (ResNet152)	78.5	72.1	75.2

Results

Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500

Paper Link

[new version paper] https://arxiv.org/abs/1903.12473

[old version paper] https://arxiv.org/abs/1806.02559

Other Implementations

[tensorflow version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet

Citation

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection With Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}
