Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

Overview

EAST_ICPR: EAST for ICPR MTWI 2018 CHALLENGE

Introduction

This is a repository forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE.
Origin Repository: argman/EAST - A tensorflow implementation of EAST text detector
Origin Author: argman

Author: Haozheng Li
Email: [email protected] or [email protected]

Contents

  1. Transform
  2. Models
  3. Demo
  4. Train
  5. Test
  6. Results

Transform

Some data in the dataset is abnormal, just like ICPR MTWI 2018[BaiduYun]. Abnormal means that the ground true labels are anticlockwise, or the images are not in 3 channels. Then errors like 'poly in wrong direction' will occur while using argman/EAST.

So I wrote a matlab program to check and transform the dataset. The program named <transform.m> is in the folder 'data_transform/' and its parameters are descripted as bellow:

icpr_img_folder = 'image_9000\';                   %origin images
icpr_txt_folder = 'txt_9000\';                     %origin ground true labels
icdar_img_folder = 'ICPR2018\';                    %transformed images
icdar_gt_folder = 'ICPR2018\';                     %transformed ground true labels
icdar_img_abnormal_folder = 'ICPR2018_abnormal\';  %images not in 3 channels, which give errors in argman/EAST
icdar_gt_abnormal_folder = 'ICPR2018_abnormal\';   %transformed ground true labels

%images and ground true labels files must be renamed as <img_1>, <img_2>, ..., <img_xxx> while using argman/EAST
first_index =  0;                                  %first index of the dataset
transform_list_name = 'transform_list.txt';        %file name of the rename list

Note: For abnormal images not in 3 channels, please transform them to normal ones through other tools like Format Factory. Then add the right data to the <icdar_img_folder> and <icdar_gt_folder>, so finally you get a whole normal dataset which has been checked and transformed.

Models

  1. Resnet_V1_50 Models trained on ICPR MTWI 2018 (train): [100k steps] [500k steps] [1035k steps]
  2. Resnet_V1_101 Models trained on ICPR MTWI 2018 (train) + ICDAR 2017 MLT (train + val) + RCTW-17 (train): [100k steps]
  3. Resnet_V1_101 Models pre-trained on Models-2, then trained on just ICPR MTWI 2018 (train): [987k steps]
  4. (In argman/EAST) Resnet_V1_50 Models trained on ICDAR 2013 (train) + ICDAR 2015 (train): [50k steps]
  5. (In argman/EAST) Resnet_V1_50 Models provided by tensorflow slim: [slim_resnet_v1_50]

Demo

Download the pre-trained models and run:

python run_demo_server.py --checkpoint-path models/east_icpr2018_resnet_v1_50_rbox_1035k/

Then Open http://localhost:8769 for the web demo server, or get the results in 'static/results/'.
Note: See argman/EAST#demo for more details.

Train

Prepare the training set and run:

python multigpu_train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 \
--checkpoint_path=models/east_icpr2018_resnet_v1_50_rbox/ \
--text_scale=512 --training_data_path=data/ICPR2018/ --geometry=RBOX \
--learning_rate=0.0001 --num_readers=18 --max_steps=50000

Note 1: Images and ground true labels files must be renamed as <img_1>, <img_2>, ..., <img_xxx> while using argman/EAST. Please see the examples in the folder 'training_samples/'.
Note 2: If --restore=True, training will restore from checkpoint and ignore the --pretrained_model_path. If --restore=False, training will delete checkpoint and initialize with the --pretrained_model_path (if exists).
Note 3: See argman/EAST#train for more details.

Test

Names of the images in ICPR MTWI 2018 are abnormal. Like <LB1gXi2JVXXXXXUXFXXXXXXXXXX.jpg> but not <img_10001.jpg>. Then errors will occur while using argman/EAST#test.

So I wrote two matlab programs to rename and inversely rename the dataset. Before evaluating, run the program named <rename.m> to make names of the images normal. This program is in the folder 'data_transform/' and its parameters are descripted as bellow:

icpr_img_folder = 'image_10000\';                      %origin images
icdar_img_folder = 'ICPR2018_test\';                   %transformed images
icdar_img_abnormal_folder = 'ICPR2018_test_abnormal\'; %images not in 3 channels, which give errors in argman/EAST

icpr_count =  10000;                                   %first index of the dataset
rename_list_name = 'rename_list.txt';                  %file name of the rename list

Note: Just like <transform.m>, please transform abnormal images through other tools like Format Factory.

After you have prepared the test set, run:

python eval.py --test_data_path=data/ICPR2018/ --gpu_list=0 \
--checkpoint_path=models/east_icpr2018_resnet_v1_50_rbox_1035k/ --output_dir=results/1035k/

Then get the results in 'results/'.

Finally inversely rename the result labels files from <img_10001.txt> to <LB1gXi2JVXXXXXUXFXXXXXXXXXX.txt> according to the rename list generated by <rename.m>. Run the program named <rename_inverse.m> which is in the folder 'data_transform/' and its parameters are descripted as bellow:

rename_list_name = 'rename_list.txt';  %file name of the rename list
icpr_img_folder = 'image_10000\';      %origin images
icpr_txt_folder = 'results\';          %result labels files generated by argman/EAST
icdar_gt_folder = 'txt_10000\';        %inversely renamed result labels files

Then zip the results in 'txt_10000/' and submit it to the ICPR MTWI 2018 CHALLENGE.

Results

Finally our model <east_icpr2018_resnet_v1_50_rbox_1035k> rank 31 in the ICPR MTWI 2018 CHALLENGE:
image

Here are some results on ICPR MTWI 2018:
image
image
image
image
image

Have fun!! :)

Owner
Haozheng Li
Computer Vision, Reinforcement Learning
Haozheng Li
Just a script for detecting the lanes in any car game (not just gta 5) with specific resolution and road design ( very basic and limited )

GTA-5-Lane-detection Just a script for detecting the lanes in any car game (not just gta 5) with specific resolution and road design ( very basic and

Danciu Georgian 4 Aug 01, 2021
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
Motion Detection Squid Game with OpenCV Python

*Motion Detection Squid Game with OpenCV Python i am newbie in python. In this project I made a simple game to follow the trend about the red light gr

Nayan 17 Nov 22, 2022
Fusion 360 Add-in that creates a pair of toothed curves that can be used to split a body and create two pieces that slide and lock together.

Fusion-360-Add-In-PuzzleSpline Fusion 360 Add-in that creates a pair of toothed curves that can be used to split a body and create two pieces that sli

Michiel van Wessem 1 Nov 15, 2021
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
PAGE XML format collection for document image page content and more

PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri

PRImA Research Lab 46 Nov 14, 2022
A list of hyperspectral image super-solution resources collected by Junjun Jiang

A list of hyperspectral image super-resolution resources collected by Junjun Jiang. If you find that important resources are not included, please feel free to contact me.

Junjun Jiang 301 Jan 05, 2023
Repository for Scene Text Detection with Supervised Pyramid Context Network with tensorflow.

Scene-Text-Detection-with-SPCNET Unofficial repository for [Scene Text Detection with Supervised Pyramid Context Network][https://arxiv.org/abs/1811.0

121 Oct 15, 2021
This repo contains several opencv projects done while learning opencv in python.

opencv-projects-python This repo contains both several opencv projects done while learning opencv by python and opencv learning resources [Basic conce

Fatin Shadab 2 Nov 03, 2022
Toolbox for OCR post-correction

Ochre Ochre is a toolbox for OCR post-correction. Please note that this software is experimental and very much a work in progress! Overview of OCR pos

National Library of the Netherlands / Research 117 Nov 10, 2022
利用Paddle框架复现CRAFT

CRAFT-Paddle 利用Paddle框架复现CRAFT CRAFT 本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: CRAFT: Character-Region Awarenes

QuanHao Guo 2 Mar 07, 2022
Run tesseract with the tesserocr bindings with @OCR-D's interfaces

ocrd_tesserocr Crop, deskew, segment into regions / tables / lines / words, or recognize with tesserocr Introduction This package offers OCR-D complia

OCR-D 38 Oct 14, 2022
"Very simple but works well" Computer Vision based ID verification solution provided by LibraX.

ID Verification by LibraX.ai This is the first free Identity verification in the market. LibraX.ai is an identity verification platform for developers

LibraX.ai 46 Dec 06, 2022
Generates a message from the infamous Jerma Impostor image

Generate your very own jerma sus imposter message. Modes: Default Mode: Only supports the characters " ", !, a, b, c, d, e, h, i, m, n, o, p, q, r, s,

Giorno420 1 Oct 27, 2022
The code for “Oriented RepPoints for Aerail Object Detection”

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints”, Under review. (arXiv preprint) Introduction Or

WentongLi 207 Dec 24, 2022
Lightning Fast Language Prediction 🚀

whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req

Indix 152 Oct 16, 2022
An Implementation of the alogrithm in paper IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

InceptText-Tensorflow An Implementation of the alogrithm in paper IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Orien

GeorgeJoe 115 Dec 12, 2022
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti

TAHRI Ahmed R. 332 Dec 31, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 03, 2023
Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"

TableNet Unofficial implementation of ICDAR 2019 paper : TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from

Jainam Shah 243 Dec 30, 2022