This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Last update: Dec 22, 2022

Related tags

Overview

Gated Recurrent Convolution Neural Network for OCR

This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: https://papers.nips.cc/paper/6637-gated-recurrent-convolution-neural-network-for-ocr.pdf

Update

The journal version of GRCNN has been accepted by T-PAMI 2021, and the code is available at:

https://github.com/Jianf-Wang/GRCNN

Build

The GRCNN is built upon the CRNN. The requirements are:

Ubuntu 14.04
CUDA 7.5
CUDNN 5

For the convenience of compiling, we provide the dependencies from here: https://pan.baidu.com/s/1c21zl1e#list/path=%2F

It is more convenient if you use nivdia-docker image (@rremani supplied) : https://hub.docker.com/r/rremani/cuda_crnn_torch/

After installing the dependencies, go to src/ and execute build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Inference

We provide the pretrained model from here. Put the downloaded model file into directory model/GRCL/. Moreover, we provide the IC03 dataset in the "./data/IC03" directory. You need to change the directories listed in the "test.txt". The "test_label.txt" is the ground truth of each image. The "lexicon_50.txt" is the lexicon of IC03.

"src/evaluation.lua": Lexicon-free evaluation

"src/evaluation_lex.lua" Lexicon-based evaluation

The evaluation code will output the recognition accuracy.

Train a new model

Follow the following steps to train a new model on your own dataset.

Create a new LMDB dataset.src/create_own_dataset.py(need to pip install lmdb first).
You can modify the configuration in model/GRCL/GRCL_LSTM_pretrain.lua
Go to src/ and execute th main_train.lua ../model/GRCL/ ../model/saved_model. Model snapshots will be saved into ../model/saved_model.

Visualization

We visualize the RCNN , DenseNet and GRCNN to verify the dynamic receptive fields in GRCNN for OCR. There are clearly gaps among different characters, and for each character, the unrelated parts do not provide strong signal.

Citation

@inproceedings{jianfeng2017deep,
 author    = {Wang, Jianfeng and Hu, Xiaolin},
 title     = {Gated Recurrent Convolution Neural Network for OCR},
 booktitle = {Advances in Neural Information Processing Systems},
 year      = {2017}
}

This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Related tags

Overview

Gated Recurrent Convolution Neural Network for OCR

Update

Build

Inference

Train a new model

Visualization

Citation

Owner

A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports

A simple Security Camera created using Opencv in Python where images gets saved in realtime in your Dropbox account at every 5 seconds

Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

The Open Source Framework for Machine Vision

Automatically fishes for you while you are afk :)

Controlling Volume by Hand Gestures

Detect handwritten words in a text-line (classic image processing method).

第一届西安交通大学人工智能实践大赛（2018AI实践大赛--图片文字识别）第一名；仅采用densenet识别图中文字

✌️Using this you can control your PC/Laptop volume by Hand Gestures created with Python.

This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

Controlling the computer volume with your hands // OpenCV

Code related to "Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity" paper

A tool to make dumpy among us GIFS

Natural language detection

Extract tables from scanned image PDFs using Optical Character Recognition.

Generate a list of papers with publicly available source code in the daily arxiv

Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125