Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Last update: Dec 21, 2022

This material is the supplementary code for the paper accepted at ICDAR 2021.

We highly recommend using the Docker image, because our model contains custom operations that depend on the framework and CUDA versions.
We provide trained models for ICDAR 2017 and ICDAR 2013 in final_checkpoint_ch8, and for ICDAR 2015 in final_checkpoint_ch4.
This code is mainly focused on inference. Training our model requires a training-class GPU such as the V100. Please see our paper for details.

REQUIREMENT

Nvidia-docker
Tensorflow 1.14
Minimum GPU requirement: NVIDIA GTX 1080 Ti

INSTALLATION

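Build the Docker image and create the container (replace /rotated-box-is-back-path with the path to your local copy of this repository)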

docker build --tag rbimage ./dockerfile
docker run --runtime=nvidia --name rbcontainer -v /rotated-box-is-back-path:/rotated-box-is-back -i -t rbimage /bin/bash

Build the custom operations inside the container

cd /rotated-box-is-back/nms 
cmake ./
make
./shell.sh

SAMPLE IMAGE INFERENCE

cd /rotated-box-is-back/
python viz.py --test_data_path=./sample --checkpoint_path=./final_checkpoint_ch8 --output_dir=./sample_result  --thres 0.6 --min_size=1600 --max_size=2000
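
This README does not document how --min_size and --max_size are applied. A common convention in detection code is to scale the image so that its shorter side equals min_size, unless that would push the longer side past max_size, in which case the longer side is capped instead. The sketch below illustrates that convention only; it is an assumption, not taken from viz.py, and target_size is a hypothetical helper name. Check viz.py before relying on it.

```python
def target_size(height, width, min_size, max_size):
    """Resize rule ASSUMED for illustration (not confirmed against viz.py):
    scale the shorter side to min_size, but never let the longer side
    exceed max_size. Returns (new_height, new_width, scale)."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = min_size / short_side
    if long_side * scale > max_size:
        # The longer side would overshoot, so cap it at max_size instead.
        scale = max_size / long_side
    return round(height * scale), round(width * scale), scale


if __name__ == "__main__":
    # A 720x1280 image with the sample settings (min 1600, max 2000):
    # the longer side is the binding limit here.
    print(target_size(720, 1280, 1600, 2000))  # (1125, 2000, 1.5625)
```

Under this assumed rule, larger size limits make small text instances bigger at the cost of GPU memory and run time.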

ICDAR 2017 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2017 test set folder.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic17  --thres 0.6 --min_size=1600 --max_size=2000

ICDAR 2015 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2015 test set folder.
To convert the result text files to the evaluation format, run text_postprocessing.py after inference, as shown below.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch4 --output_dir=./ic15  --thres 0.7 --min_size=1100 --max_size=2000
python text_postprocessing.py -i=./ic15/ -o=./ic15_format/ -e True

ICDAR 2013 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2013 test set folder.
To convert the result text files to the evaluation format, run text_postprocessing.py after inference, as shown below.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic13  --thres 0.55 --min_size=700 --max_size=900
python text_postprocessing.py -i=./ic13/ -o=./ic13_format/ -e True -m rec
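
The three benchmark runs above differ only in checkpoint, threshold, and size limits. As a convenience, those settings can be kept in one place. The sketch below copies the values from the commands in this README; PRESETS and build_viz_command are hypothetical names for illustration and are not part of the repository.

```python
import shlex

# Values copied from the viz.py commands in this README.
PRESETS = {
    "ic17": {"checkpoint": "./final_checkpoint_ch8", "thres": 0.6,
             "min_size": 1600, "max_size": 2000},
    "ic15": {"checkpoint": "./final_checkpoint_ch4", "thres": 0.7,
             "min_size": 1100, "max_size": 2000},
    "ic13": {"checkpoint": "./final_checkpoint_ch8", "thres": 0.55,
             "min_size": 700, "max_size": 900},
}


def build_viz_command(dataset, test_data_path):
    """Hypothetical helper: return the viz.py argument list for one benchmark."""
    cfg = PRESETS[dataset]
    return [
        "python", "viz.py",
        "--test_data_path=" + test_data_path,
        "--checkpoint_path=" + cfg["checkpoint"],
        "--output_dir=./" + dataset,
        "--thres", str(cfg["thres"]),
        "--min_size=" + str(cfg["min_size"]),
        "--max_size=" + str(cfg["max_size"]),
    ]


if __name__ == "__main__":
    # Print the commands; icdar_testset_path is the README's placeholder.
    for name in PRESETS:
        cmd = build_viz_command(name, "icdar_testset_path")
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

To run a command instead of printing it, the list can be passed to subprocess.run(cmd, check=True) from /rotated-box-is-back/ inside the container. The text_postprocessing.py step for ICDAR 2015 and 2013 still has to be run afterwards.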

EVALUATION TABLE

Dataset	P	R	F
IC13	95.9	89.1	92.4
IC15	89.7	84.2	86.9
IC17	83.4	68.2	75.0
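
The table does not define its columns. Assuming P, R, and F are precision, recall, and F-measure in percent, and that F is the usual harmonic mean of P and R, the reported values can be cross-checked with a few lines. The function name f_measure is chosen for illustration.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (ASSUMED definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) copied from the evaluation table.
reported = {
    "IC13": (95.9, 89.1, 92.4),
    "IC15": (89.7, 84.2, 86.9),
    "IC17": (83.4, 68.2, 75.0),
}

if __name__ == "__main__":
    for name, (p, r, f) in reported.items():
        # Each recomputed value matches the reported F to one decimal place.
        print(name, round(f_measure(p, r), 1), f)
```

All three rows agree with the harmonic-mean definition after rounding to one decimal place.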

TRAINING

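The model can be trained with the command below. Replace your-image-path and your-gt-path with the paths to your training images and ground-truth files.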

python train_refine_estimator.py --input_size=1024 --batch_size=2 --checkpoint_path=./finetuning --training_data_path=your-image-path --training_gt_path=your-gt-path  --learning_rate=0.00001 --max_epochs=500  --save_summary_steps=1000 --warmup_path=./final_checkpoint_ch8

ACKNOWLEDGEMENT

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 1711125972, Audio-Visual Perception for Autonomous Rescue Drones).

CITATION

If you find this work helpful for your research, please cite:

Lee J., Lee J., Yang C., Lee Y., Lee J. (2021) Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection. In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_4


Owner

NCSOFT
