Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Overview

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Introduction

This is a PyTorch implementation of "SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training".

The paper proposes a novel text detection system termed SelfText Beyond Polygon (SBP), which combines Bounding Box Supervision (BBS) and Dynamic Self-Training (DST) to train a polygon-based text detector using only a limited set of upright bounding box annotations. As shown in the figure, SBP achieves performance comparable to strong (polygon-level) supervision while greatly reducing data annotation cost.

For more details, please refer to our arXiv paper.
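
To make "upright bounding box" concrete: it is the axis-aligned rectangle that encloses a text instance. The sketch below shows how such a box could be derived from a polygon annotation. The function name `polygon_to_upright_box` is hypothetical and is not part of the released code.

```python
def polygon_to_upright_box(polygon):
    """Return (x_min, y_min, x_max, y_max) enclosing a list of (x, y) points."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# A curved text instance described by six polygon vertices (made-up values).
curved_text = [(10, 40), (60, 20), (120, 35), (115, 60), (58, 48), (12, 65)]
print(polygon_to_upright_box(curved_text))  # (10, 20, 120, 65)
```

The box keeps only four numbers per instance, which is why it is much cheaper to annotate than a polygon, at the cost of including background pixels around curved or rotated text.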

Environments

  • python 3
  • torch = 1.1.0
  • torchvision
  • Pillow
  • numpy
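
A quick way to confirm that the packages listed above are importable is a small check like the following. This is only a sketch: the helper name `check_environment` is hypothetical, and it assumes the usual import names (Pillow is imported as `PIL`). It does not verify version numbers.

```python
import importlib.util
import sys

# Import names for the packages listed above (Pillow is imported as "PIL").
REQUIRED_MODULES = ["torch", "torchvision", "PIL", "numpy"]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```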

ToDo List

  • Release code(BBS)
  • Release code(DST)
  • Document for Installation
  • Document for testing and training
  • Evaluation
  • Demo script
  • Re-organize and clean up the parameters

Dataset

Supported:

  • ICDAR15
  • ICDAR17 MLT
  • SynthText 800K
  • TotalText
  • MSRA-TD500
  • CTW1500

Model Zoo

Supported text detection:

Bounding Box Supervision(BBS)

Train

Training proceeds in three steps: (1) train SASN on synthetic data; (2) use SASN to generate pseudo labels on real data from the bounding-box annotations; (3) train the detectors (EAST and PSENet) on the pseudo labels.
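
Since the training scripts are not yet documented, the following is a minimal sketch of what step (2) could look like, not the authors' implementation. It assumes a trained SASN can be wrapped as a callable that maps an image crop to a per-pixel text-probability map. The names `generate_pseudo_mask`, `segment_fn`, `dummy_segmenter`, and the `threshold` value are all hypothetical. Converting the resulting mask into polygons (for example by contour tracing) is left out.

```python
import numpy as np


def generate_pseudo_mask(image, boxes, segment_fn, threshold=0.5):
    """Build a full-image binary text mask from upright boxes.

    image:      H x W x C array.
    boxes:      iterable of (x_min, y_min, x_max, y_max) in pixels.
    segment_fn: hypothetical stand-in for a trained SASN; maps a crop
                (h x w x C) to a text-probability map (h x w) in [0, 1].
    """
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=np.uint8)
    for x_min, y_min, x_max, y_max in boxes:
        # Clip the box to the image so slicing stays valid.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 <= x0 or y1 <= y0:
            continue  # skip empty or fully out-of-image boxes
        prob = segment_fn(image[y0:y1, x0:x1])
        # Pixels outside every box stay background; inside a box, keep
        # only the pixels the segmenter marks as text.
        region = mask[y0:y1, x0:x1]  # view into mask, edited in place
        region[prob >= threshold] = 1
    return mask


def dummy_segmenter(crop):
    """Placeholder for SASN: marks the central band of each crop as text."""
    h, w = crop.shape[:2]
    prob = np.zeros((h, w), dtype=np.float32)
    prob[h // 4 : h - h // 4, :] = 1.0
    return prob


image = np.zeros((100, 200, 3), dtype=np.uint8)
mask = generate_pseudo_mask(image, [(10, 20, 110, 60)], dummy_segmenter)
print(mask.shape, int(mask.sum()))  # (100, 200) 2000
```

Restricting the segmenter to each annotated box is the design choice that matters here: the box guarantees where text is, so the network only has to decide which pixels inside the box belong to it.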

Training SASN with SynthText or Curved SynthText

(TBD)

Generating pseudo labels on real data with SASN

(TBD)

Training EAST or PSENet with the pseudo labels

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Dynamic Self Training

Train

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Experiments

Bounding Box Supervision

The performance of EAST on ICDAR15

| Method | Dataset | Pretrain | Precision | Recall | F-score |
|---|---|---|---|---|---|
| EAST_box | ICDAR15 | - | 65.8 | 63.8 | 64.8 |
| EAST | ICDAR15 | - | 76.9 | 77.1 | 77.0 |
| EAST_pseudo(SynthText) | ICDAR15 | - | 77.8 | 78.2 | 78.0 |
| EAST_box | ICDAR15 | SynthText | 70.8 | 72.0 | 71.4 |
| EAST | ICDAR15 | SynthText | 82.0 | 82.4 | 82.2 |
| EAST_pseudo(SynthText) | ICDAR15 | SynthText | 81.3 | 82.2 | 81.8 |
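
The f-score column in these tables appears to be the standard F1 measure, i.e. the harmonic mean of precision and recall. That is an assumption, though the reported numbers are consistent with it. A small sketch for recomputing it (the function name `f_score` is illustrative only):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# EAST_box on ICDAR15 without pretraining: precision 65.8, recall 63.8.
print(round(f_score(65.8, 63.8), 1))  # 64.8
```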

The performance of EAST on MSRA-TD500

| Method | Dataset | Pretrain | Precision | Recall | F-score |
|---|---|---|---|---|---|
| EAST_box | MSRA-TD500 | - | 40.49 | 31.05 | 35.15 |
| EAST | MSRA-TD500 | - | 71.76 | 69.05 | 70.38 |
| EAST_pseudo(SynthText) | MSRA-TD500 | - | 71.27 | 67.54 | 69.36 |
| EAST_box | MSRA-TD500 | SynthText | 48.34 | 42.37 | 45.16 |
| EAST | MSRA-TD500 | SynthText | 77.91 | 76.45 | 77.17 |
| EAST_pseudo(SynthText) | MSRA-TD500 | SynthText | 77.42 | 73.85 | 75.59 |

The performance of PSENet on ICDAR15

| Method | Dataset | Pretrain | Precision | Recall | F-score |
|---|---|---|---|---|---|
| PSENet_box | ICDAR15 | - | 70.17 | 69.09 | 69.63 |
| PSENet | ICDAR15 | - | 81.6 | 79.5 | 80.5 |
| PSENet_pseudo(SynthText) | ICDAR15 | - | 82.9 | 77.6 | 80.2 |
| PSENet_box | ICDAR15 | SynthText | 72.65 | 74.29 | 73.46 |
| PSENet | ICDAR15 | SynthText | 86.42 | 83.54 | 84.96 |
| PSENet_pseudo(SynthText) | ICDAR15 | SynthText | 86.77 | 83.34 | 85.02 |

The performance of PSENet on MSRA-TD500

| Method | Dataset | Pretrain | Precision | Recall | F-score |
|---|---|---|---|---|---|
| PSENet_box | MSRA-TD500 | - | 47.17 | 36.90 | 41.41 |
| PSENet | MSRA-TD500 | - | 80.86 | 77.72 | 79.13 |
| PSENet_pseudo(SynthText) | MSRA-TD500 | - | 80.32 | 77.26 | 78.86 |
| PSENet_box | MSRA-TD500 | SynthText | 47.45 | 39.49 | 43.11 |
| PSENet | MSRA-TD500 | SynthText | 84.11 | 84.97 | 84.54 |
| PSENet_pseudo(SynthText) | MSRA-TD500 | SynthText | 84.03 | 84.03 | 84.03 |

The performance of PSENet on Total Text

| Method | Dataset | Pretrain | Precision | Recall | F-score |
|---|---|---|---|---|---|
| PSENet_box | Total Text | - | 46.5 | 43.6 | 45.0 |
| PSENet | Total Text | - | 80.4 | 76.5 | 78.4 |
| PSENet_pseudo(SynthText) | Total Text | - | 80.33 | 73.54 | 76.78 |
| PSENet_pseudo(Curved SynthText) | Total Text | - | 81.68 | 74.61 | 78.0 |
| PSENet_box | Total Text | SynthText | 51.94 | 47.45 | 49.59 |
| PSENet | Total Text | SynthText | 83.4 | 78.1 | 80.7 |
| PSENet_pseudo(SynthText) | Total Text | SynthText | 81.57 | 75.54 | 78.44 |
| PSENet_pseudo(Curved SynthText) | Total Text | SynthText | 82.51 | 77.57 | 80.0 |

The visualization of bounding-box annotation and the pseudo labels generated by BBS on Total-Text.

Links

https://github.com/SakuraRiven/EAST

https://github.com/WenmuZhou/PSENet.pytorch

License

For academic use, this project is licensed under the Apache License - see the LICENSE file for details. For commercial use, please contact the authors.

Citations

Please consider citing our paper in your publications if the project helps your research.

Email: [email protected]
