SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Introduction

This is a PyTorch implementation of "SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training".

The paper proposes a novel text detection system termed SelfText Beyond Polygon (SBP), built on Bounding Box Supervision (BBS) and Dynamic Self-Training (DST), which trains a polygon-based text detector with only a limited set of upright bounding box annotations. As shown in the figure, SBP achieves performance comparable to strong supervision while greatly reducing data annotation costs.

For more details, please refer to our arXiv paper.

Environments

python 3
torch = 1.1.0
torchvision
Pillow
numpy

ToDo List

Dataset

Supported:

Model Zoo

Supported text detection:

Bounding Box Supervision(BBS)

Train

The training strategy includes three steps: (1) training SASN with synthetic data; (2) generating pseudo labels on real data from the bounding box annotations with SASN; (3) training the detectors (EAST and PSENet) with the pseudo labels.
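Step (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: `generate_pseudo_mask` and `dummy_segmenter` are hypothetical names, the `segment_crop` callable merely stands in for a trained SASN, and the 0.5 threshold is an arbitrary choice. Each upright box is cropped, segmented, and pasted back into a full-image mask.

```python
import numpy as np


def generate_pseudo_mask(image, boxes, segment_crop, threshold=0.5):
    """Build a full-image binary text mask from upright boxes.

    image: H x W x C array.
    boxes: iterable of (x_min, y_min, x_max, y_max) in pixel coordinates.
    segment_crop: hypothetical callable mapping an h x w x C crop to an
        h x w array of text probabilities in [0, 1] (stand-in for SASN).
    """
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=np.uint8)
    for x_min, y_min, x_max, y_max in boxes:
        # Clip the box to the image so slicing stays in bounds.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 <= x0 or y1 <= y0:
            continue  # skip degenerate boxes
        probs = segment_crop(image[y0:y1, x0:x1])
        region = (probs >= threshold).astype(np.uint8)
        # Merge with earlier boxes so overlapping boxes do not erase each other.
        mask[y0:y1, x0:x1] = np.maximum(mask[y0:y1, x0:x1], region)
    return mask


def dummy_segmenter(crop):
    """Placeholder model: marks the central horizontal band of a crop as text."""
    h, w = crop.shape[:2]
    probs = np.zeros((h, w), dtype=np.float32)
    probs[h // 4 : h - h // 4, :] = 0.9
    return probs


image = np.zeros((20, 30, 3), dtype=np.uint8)
mask = generate_pseudo_mask(image, [(5, 4, 15, 12)], dummy_segmenter)
print(mask.shape, int(mask.sum()))
```

The resulting mask could then be converted to polygons (for EAST or PSENet training targets) with a contour-extraction step, which is omitted here.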

Training SASN with SynthText or Curved SynthText

(TBD)

Generating pseudo labels on real data with SASN

(TBD)

Training EAST or PSENet with the pseudo labels

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Dynamic Self Training

Train

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Experiments

Bounding Box Supervision

The performance of EAST on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	ICDAR15	-	65.8	63.8	64.8
EAST	ICDAR15	-	76.9	77.1	77.0
EAST_pseudo(SynthText)	ICDAR15	-	77.8	78.2	78.0
EAST_box	ICDAR15	SynthText	70.8	72.0	71.4
EAST	ICDAR15	SynthText	82.0	82.4	82.2
EAST_pseudo(SynthText)	ICDAR15	SynthText	81.3	82.2	81.8

The performance of EAST on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	MSRA-TD500	-	40.49	31.05	35.15
EAST	MSRA-TD500	-	71.76	69.05	70.38
EAST_pseudo(SynthText)	MSRA-TD500	-	71.27	67.54	69.36
EAST_box	MSRA-TD500	SynthText	48.34	42.37	45.16
EAST	MSRA-TD500	SynthText	77.91	76.45	77.17
EAST_pseudo(SynthText)	MSRA-TD500	SynthText	77.42	73.85	75.59

The performance of PSENet on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	ICDAR15	-	70.17	69.09	69.63
PSENet	ICDAR15	-	81.6	79.5	80.5
PSENet_pseudo(SynthText)	ICDAR15	-	82.9	77.6	80.2
PSENet_box	ICDAR15	SynthText	72.65	74.29	73.46
PSENet	ICDAR15	SynthText	86.42	83.54	84.96
PSENet_pseudo(SynthText)	ICDAR15	SynthText	86.77	83.34	85.02

The performance of PSENet on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	MSRA-TD500	-	47.17	36.90	41.41
PSENet	MSRA-TD500	-	80.86	77.72	79.13
PSENet_pseudo(SynthText)	MSRA-TD500	-	80.32	77.26	78.86
PSENet_box	MSRA-TD500	SynthText	47.45	39.49	43.11
PSENet	MSRA-TD500	SynthText	84.11	84.97	84.54
PSENet_pseudo(SynthText)	MSRA-TD500	SynthText	84.03	84.03	84.03

The performance of PSENet on Total Text

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	Total Text	-	46.5	43.6	45.0
PSENet	Total Text	-	80.4	76.5	78.4
PSENet_pseudo(SynthText)	Total Text	-	80.33	73.54	76.78
PSENet_pseudo(Curved SynthText)	Total Text	-	81.68	74.61	78.0
PSENet_box	Total Text	SynthText	51.94	47.45	49.59
PSENet	Total Text	SynthText	83.4	78.1	80.7
PSENet_pseudo(SynthText)	Total Text	SynthText	81.57	75.54	78.44
PSENet_pseudo(Curved SynthText)	Total Text	SynthText	82.51	77.57	80.0
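
The f-score values in the tables above are consistent with the harmonic mean of precision and recall. A minimal sketch for recomputing them (the function name `f_score` is illustrative and not part of this repository):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# EAST_box on ICDAR15 without pretraining: precision 65.8, recall 63.8
print(round(f_score(65.8, 63.8), 1))  # prints 64.8
```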

The visualization of bounding-box annotation and the pseudo labels generated by BBS on Total-Text

Links

https://github.com/SakuraRiven/EAST

https://github.com/WenmuZhou/PSENet.pytorch

License

For academic use, this project is licensed under the Apache License - see the LICENSE file for details. For commercial use, please contact the authors.

Citations

Please consider citing our paper in your publications if the project helps your research.

Email: [email protected]


Owner

weijiawu
