Generate text images for training deep learning OCR models

Last update: Jan 04, 2023

text_renderer

Text Renderer

Generates text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.

Setup

Ubuntu 16.04
Python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Demo

By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.
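
To inspect the generated labels, a small parser like the one below may help. It is only a sketch: it assumes each line of labels.txt holds an image name, a single space, and then the label text. That layout is an assumption, so check your own output file before relying on it. The function name parse_labels is hypothetical and not part of text_renderer.

```python
from typing import Dict, Iterable


def parse_labels(lines: Iterable[str]) -> Dict[str, str]:
    """Map image name -> label text.

    Assumes (not confirmed by the README) that each line looks like
    "<image_name> <text>" and splits on the first space only, so the
    label text itself may contain spaces.
    """
    labels = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        name, _, text = line.partition(" ")
        labels[name] = text
    return labels
```

Any iterable of lines works as input, for example an open file handle for output/default/labels.txt opened with encoding="utf-8".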

Use your own data to generate image

Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folders.
Configure the text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option). Here are some examples:

Effect name
Original (font size 25)
Perspective transform
Random crop
Curve
Light border
Dark border
Random char space (big)
Random char space (small)
Middle line
Table line
Underline
Emboss
Reverse color
Blur
Text color
Line color

Run main.py.
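
As a rough illustration of what a per-effect fraction means, the sketch below treats it as the probability that an effect is applied to a given image. The dictionary layout (an enable flag and a fraction value per effect) and the helpers should_apply and pick_effects are assumptions made for this example; consult configs/default.yaml for the actual keys.

```python
import random
from typing import Dict, List, Optional

# Hypothetical, simplified stand-in for the effect settings in a config file.
EXAMPLE_CONFIG: Dict[str, Dict[str, object]] = {
    "curve": {"enable": True, "fraction": 0.3},
    "blur": {"enable": True, "fraction": 0.1},
    "emboss": {"enable": False, "fraction": 0.5},
}


def should_apply(effect: Dict[str, object], rng: Optional[random.Random] = None) -> bool:
    """Return True with probability `fraction` if the effect is enabled."""
    rng = rng or random  # fall back to the module-level generator
    if not effect.get("enable", False):
        return False
    return rng.random() < float(effect.get("fraction", 0.0))


def pick_effects(config: Dict[str, Dict[str, object]],
                 rng: Optional[random.Random] = None) -> List[str]:
    """List the names of the effects selected for one image."""
    return [name for name, eff in config.items() if should_apply(eff, rng)]
```

Under this reading, a fraction of 0.3 means that roughly 30% of the generated images receive the effect, and each effect is drawn independently.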

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of characters. In this case, the generated images will contain characters that the font cannot render correctly.

Selecting fonts that support every character in --chars_file is tedious. If you run main.py with the --strict option, the renderer keeps drawing new text from the corpus during generation until all of its characters are supported by the chosen font.
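
The retry idea behind --strict can be sketched as follows. This is not the project's actual code: the corpus is modeled as a list of strings, the font as a set of supported characters, and sample_supported_text and max_tries are hypothetical names introduced for illustration.

```python
import random
from typing import Optional, Sequence, Set


def sample_supported_text(
    corpus: Sequence[str],
    font_chars: Set[str],
    rng: random.Random,
    max_tries: int = 100,
) -> Optional[str]:
    """Draw texts from the corpus until one is fully covered by the font.

    Returns None if no suitable text is found within max_tries draws.
    """
    for _ in range(max_tries):
        text = rng.choice(corpus)
        if all(ch in font_chars for ch in text):
            return text
    return None
```

Capping the number of retries is a design choice of this sketch; it avoids looping forever when the font covers none of the corpus.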

Tools

You can use the check_font.py script to check how many characters in --chars_file your fonts do not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
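
A coverage check of this kind can be approximated in a few lines. The sketch below is not tools/check_font.py itself; it assumes the third-party fontTools package is installed for reading a font's character map, and both function names are hypothetical.

```python
from typing import Iterable, List, Set


def unsupported_chars(chars: Iterable[str], supported_codepoints: Set[int]) -> List[str]:
    """Return the characters whose code points the font does not map."""
    return [ch for ch in chars if ord(ch) not in supported_codepoints]


def font_codepoints(font_path: str) -> Set[int]:
    """Read the code points a font maps, using fontTools (assumed installed)."""
    from fontTools.ttLib import TTFont  # imported lazily; third-party dependency

    font = TTFont(font_path)
    try:
        cmap = font.getBestCmap() or {}  # None when the font has no Unicode cmap
    finally:
        font.close()
    return set(cmap)
```

To use it, pass the characters read from your --chars_file together with font_codepoints applied to a font path; the length of the returned list corresponds to the "chars not supported" count in the output above.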

Generate image using GPU

To speed up image generation with a GPU, first compile OpenCV with CUDA support (see the guide "Compiling OpenCV with CUDA support").

Then build the Cython part and add the --gpu option when running main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug saves images with extra information, so you can see how perspectiveTransform works and inspect all bounding and rotated boxes.
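
For readers unfamiliar with what the debug images visualize, the sketch below shows in plain Python how a 3x3 perspective (homography) matrix maps the corners of a text box, and how an axis-aligned bounding box is derived from the result. It mirrors the math behind OpenCV's perspectiveTransform but is not text_renderer's code; the matrix values and function names are made up for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def transform_points(matrix: Sequence[Sequence[float]],
                     points: Sequence[Point]) -> List[Point]:
    """Apply a 3x3 homography to 2D points, including the homogeneous divide."""
    out = []
    for x, y in points:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        out.append((xh / w, yh / w))
    return out


def bounding_box(points: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Corners of a 100x32 text box and an illustrative (made-up) homography.
corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 32.0), (0.0, 32.0)]
H = [[1.0, 0.2, 5.0],
     [0.0, 1.0, 3.0],
     [0.001, 0.0, 1.0]]
warped = transform_points(H, corners)
print(warped)                # the four transformed corners
print(bounding_box(warped))  # axis-aligned box enclosing them
```

The transformed corners generally form an arbitrary quadrilateral, which is why a separate bounding box has to be computed around the warped text.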

Todo

See https://github.com/Sanster/text_renderer/projects/1

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author =       {weiqing.chu},
  title =        {text_renderer},
  howpublished = {\url{https://github.com/Sanster/text_renderer}},
  year =         {2021}
}

New version release: https://github.com/oh-my-ocr/text_renderer

Owner

Qing
