This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks to the author (@whai362) for the awesome work!

Installation

Any TensorFlow version above 1.0 should work.
Either Python 2 or Python 3 is fine.

Download

A model trained on the ICDAR 2015 training set plus the ICDAR 2017 MLT training set:

Baidu Yun (extraction code: pffd)

Google Drive

This model does not reach the accuracy reported in the paper; treat it as a reference only. You can fine-tune it or use this code as a starting point for further optimization.

Dataset	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015 (val)	74.61	80.93	77.64
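
As a quick sanity check on the table, the F-measure is the harmonic mean of precision and recall. The short sketch below recomputes it from the two reported numbers; the function name `f_measure` is only illustrative and is not part of this repository.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same unit for both, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the ICDAR 2015 (val) row above.
print(round(f_measure(74.61, 80.93), 2))  # 77.64
```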

Train

To train the model, provide the path to your dataset. Inside that directory, each image needs its own ground-truth (gt) text file, and the gt file must have the same name as its image.

Then run train.py, for example:
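
Before training, it can help to confirm that every image really has a matching gt file. The following is a minimal sketch under two assumptions that are not stated in this README: "same name" is taken to mean the same base name with a `.txt` extension, and images are assumed to be `.jpg`, `.jpeg`, or `.png`. The helper `find_unlabelled_images` is hypothetical and not part of the repository.

```python
import os

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unlabelled_images(data_dir):
    """Return image file names in data_dir that have no same-named .txt file."""
    names = os.listdir(data_dir)
    # Base names of all gt text files, e.g. "img_1" for "img_1.txt".
    stems_with_gt = {
        os.path.splitext(name)[0]
        for name in names
        if name.lower().endswith(".txt")
    }
    missing = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTENSIONS and stem not in stems_with_gt:
            missing.append(name)
    return missing
```

Calling `find_unlabelled_images("./data/ocr/icdar2015/")` on the training directory should return an empty list when every image is labelled.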

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one GPU, you can pass several GPU ids to gpu_list (for example --gpu_list=0,1,2,3).
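
To illustrate how the two flags interact, here is a small sketch that parses a gpu_list string and computes the total batch size. It assumes, based only on the flag name, that batch_size_per_gpu applies to each listed GPU; the helper names are illustrative and do not come from train.py.

```python
def parse_gpu_list(gpu_list):
    """Split a string such as "0,1,2,3" into a list of integer GPU ids."""
    return [int(part) for part in gpu_list.split(",") if part.strip()]


def effective_batch_size(gpu_list, batch_size_per_gpu):
    """Total images per step, assuming one per-GPU batch on every listed GPU."""
    return len(parse_gpu_list(gpu_list)) * batch_size_per_gpu


print(parse_gpu_list("0,1,2,3"))           # [0, 1, 2, 3]
print(effective_batch_size("0,1,2,3", 8))  # 32
```

Under that assumption, moving from one GPU to four with --batch_size_per_gpu=8 raises the total batch from 8 to 32 images per step.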

Note:

Right now, only the ICDAR 2017 data format is supported as input, e.g. (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon-format input.
Polygon shrinking is already supported via the pyclipper module.
This re-implementation is just for fun, but I'll continue to improve the code.
The PSE algorithm is re-implemented in C++ (with Python 2 it runs as-is; with Python 3, replace python-config with python3-config in the makefile).
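
The two data-related notes above can be sketched in a few lines of Python 3. The first function parses one gt line in the format shown in the example (eight coordinates followed by a transcription, where "###" conventionally marks unreadable text in ICDAR annotations). The second computes the shrink offset d = Area × (1 − r²) / Perimeter described in the PSENet paper. This is an illustration only: the function names are hypothetical, and the code in data_provider.py may differ.

```python
import math


def parse_gt_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,text' into (points, text)."""
    # Limit the split so commas inside the transcription are preserved.
    fields = line.strip().lstrip("\ufeff").split(",", 8)
    if len(fields) < 9:
        raise ValueError("expected 8 coordinates and a transcription: %r" % line)
    coords = [int(float(value)) for value in fields[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    text = fields[8].strip().strip('"')
    return points, text


def polygon_area(points):
    """Absolute polygon area from the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed polygon."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def shrink_offset(points, ratio):
    """Offset in pixels for shrinking a polygon to the given scale ratio."""
    return polygon_area(points) * (1.0 - ratio ** 2) / polygon_perimeter(points)


points, text = parse_gt_line('116,1179,206,1179,206,1207,116,1207,"###"')
print(points, text)
print(shrink_offset(points, 0.5))
```

The sketch stays within the standard library. In practice the resulting offset would be handed to pyclipper, for instance by adding the polygon to a `pyclipper.PyclipperOffset` with `AddPath` and calling `Execute` with the negated offset to obtain the shrunk polygon.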

Test

Run eval.py, for example:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

A text file and a result image will then be written to the output directory.

Examples

About issues

If you run into a problem, check the existing issues first; if none of them covers it, open a new issue.

Reference

Acknowledgements

@rkshuai found a bug in the feature concatenation in model.py.

If this repository helps you, please star it. Thanks.

Owner

Michael liu
