This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Overview

Gated Recurrent Convolution Neural Network for OCR

This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: https://papers.nips.cc/paper/6637-gated-recurrent-convolution-neural-network-for-ocr.pdf

Update

The journal version of GRCNN has been accepted by T-PAMI 2021, and the code is available at:

https://github.com/Jianf-Wang/GRCNN

Build

The GRCNN is built upon the CRNN. The requirements are:

  1. Ubuntu 14.04
  2. CUDA 7.5
  3. CUDNN 5

For the convenience of compiling, we provide the dependencies from here: https://pan.baidu.com/s/1c21zl1e#list/path=%2F

It is more convenient if you use nivdia-docker image (@rremani supplied) : https://hub.docker.com/r/rremani/cuda_crnn_torch/

After installing the dependencies, go to src/ and execute build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Inference

We provide the pretrained model from here. Put the downloaded model file into directory model/GRCL/. Moreover, we provide the IC03 dataset in the "./data/IC03" directory. You need to change the directories listed in the "test.txt". The "test_label.txt" is the ground truth of each image. The "lexicon_50.txt" is the lexicon of IC03.

"src/evaluation.lua": Lexicon-free evaluation

"src/evaluation_lex.lua" Lexicon-based evaluation

The evaluation code will output the recognition accuracy.

Train a new model

Follow the following steps to train a new model on your own dataset.

  1. Create a new LMDB dataset.src/create_own_dataset.py(need to pip install lmdb first).
  2. You can modify the configuration in model/GRCL/GRCL_LSTM_pretrain.lua
  3. Go to src/ and execute th main_train.lua ../model/GRCL/ ../model/saved_model. Model snapshots will be saved into ../model/saved_model.

Visualization

We visualize the RCNN , DenseNet and GRCNN to verify the dynamic receptive fields in GRCNN for OCR. There are clearly gaps among different characters, and for each character, the unrelated parts do not provide strong signal.

Citation

@inproceedings{jianfeng2017deep,
 author    = {Wang, Jianfeng and Hu, Xiaolin},
 title     = {Gated Recurrent Convolution Neural Network for OCR},
 booktitle = {Advances in Neural Information Processing Systems},
 year      = {2017}
}
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
Satoshi is a discord bot template in python using discord.py that allow you to track some live crypto prices with your own discord bot.

Satoshi ~ DiscordCryptoBot Satoshi is a simple python discord bot using discord.py that allow you to track your favorites cryptos prices with your own

Théo 2 Sep 15, 2022
Controlling Volume by Hand Gestures

This program allows the user to control the volume of their device with specific hand gestures involving their thumb and index finger!

Riddhi Bajaj 1 Nov 11, 2021
Provides OCR (Optical Character Recognition) services through web applications

OCR4all As suggested by the name one of the main goals of OCR4all is to allow basically any given user to independently perform OCR on a wide variety

174 Dec 31, 2022
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks? Artifact Detection/Correction - Offcial PyTorch Implementation This rep

CHOI HWAN IL 23 Dec 20, 2022
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

Christian Bartz 496 Jan 05, 2023
Automatic Number Plate Recognition (ANPR) is a highly accurate system capable of reading vehicle number plates without human intervention

ANPR ANPR is therefore the underlying technology used to find a vehicle license/number plate and it, in turn, supplies this information to a next stag

Melih Emin Kılıçoğlu 1 Jan 09, 2022
This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Gated Recurrent Convolution Neural Network for OCR This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: htt

90 Dec 22, 2022
天池2021"全球人工智能技术创新大赛"【赛道一】:医学影像报告异常检测 - 第三名解决方案

天池2021"全球人工智能技术创新大赛"【赛道一】:医学影像报告异常检测 比赛链接 个人博客记录 目录结构 ├── final------------------------------------决赛方案PPT ├── preliminary_contest--------------------

19 Aug 17, 2022
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex

Joe Sutherland 1.3k Jan 04, 2023
A python program to block out your face

Readme This is a small program I threw together in about 6 hours to block out your face. It probably doesn't work very well, so be warned. By default,

1 Oct 17, 2021
This project is basically to draw lines with your hand, using python, opencv, mediapipe.

Paint Opencv 📷 This project is basically to draw lines with your hand, using python, opencv, mediapipe. Screenshoots 📱 Tools ⚙️ Python Opencv Mediap

Williams Ismael Bobadilla Torres 3 Nov 17, 2021
Tool which allow you to detect and translate text.

Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr

Damian Panek 176 Nov 28, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 03, 2023
Generate a list of papers with publicly available source code in the daily arxiv

2021-06-08 paper code optimal network slicing for service-oriented networks with flexible routing and guaranteed e2e latency networkslicing multi-moda

79 Jan 03, 2023
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
Creating of virtual elements of the graphical interface using opencv and mediapipe.

Virtual GUI Creating of virtual elements of the graphical interface using opencv and mediapipe. Element GUI Output Description Button By default the b

Aleksei 4 Jun 16, 2022
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
An Implementation of the FOTS: Fast Oriented Text Spotting with a Unified Network

FOTS: Fast Oriented Text Spotting with a Unified Network Introduction This is a pytorch re-implementation of FOTS: Fast Oriented Text Spotting with a

GeorgeJoe 171 Aug 04, 2022
Code for the paper: Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution

Fusformer Code for the paper: "Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution" Plateform Python 3.8.5 + Pytor

Jin-Fan Hu (胡锦帆) 11 Dec 12, 2022