A synthetic data generator for text recognition

Overview

TextRecognitionDataGenerator

What is it for?

Generating text image samples to train OCR software. Non-Latin text is now supported! For a more thorough tutorial, see the official documentation.

What do I need to make it work?

Install the pypi package

pip install trdg

Afterwards, you can use trdg from the CLI. I recommend using a virtualenv instead of installing with sudo.

If you want to add another language, you can clone the repository instead. Then simply run pip install -r requirements.txt.

Docker image

If you would rather not have to install anything to use TextRecognitionDataGenerator, you can pull the docker image.

docker pull belval/trdg:latest

docker run -v /output/path/:/app/out/ -t belval/trdg:latest trdg [args]

The path (/output/path/) must be absolute.

New

  • Add --stroke_width argument to set the width of the text stroke (Thank you @SunHaozhe)
  • Add --stroke_fill argument to set the color of the text contour if stroke > 0 (Thank you @SunHaozhe)
  • Add --word_split argument to split on word instead of per-character. This is useful for ligature-based languages
  • Add --dict argument to specify a custom dictionary (Thank you @luh0907)
  • Add --font_dir argument to specify the fonts to use
  • Add --output_mask to output character-level mask for each image
  • Add --character_spacing to control space between characters (in pixels)
  • Add python module
  • Add --font to use only one font for all the generated images (Thank you @JulienCoutault!)
  • Add --fit and --margins for finer layout control
  • Change the text orientation using the -or parameter
  • Specify text color range using -tc '#000000,#FFFFFF'. Please note that the quotes are necessary
  • Add support for Simplified and Traditional Chinese
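
To illustrate the -tc option from the list above: one plausible reading of a value such as '#000000,#FFFFFF' is that each channel of the text color is drawn between the two endpoints. The following is a minimal sketch under that assumption, not the tool's actual implementation; parse_hex_color and random_color_in_range are hypothetical helper names.

```python
import random


def parse_hex_color(value):
    """Convert a '#RRGGBB' string into an (r, g, b) tuple of ints."""
    value = value.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def random_color_in_range(color_range, rng=random):
    """Pick one color whose channels lie between the two endpoints.

    Assumption: the range is written 'start,end', like the -tc argument.
    """
    start, end = (parse_hex_color(part) for part in color_range.split(","))
    return tuple(
        rng.randint(min(lo, hi), max(lo, hi)) for lo, hi in zip(start, end)
    )


if __name__ == "__main__":
    print(random_color_in_range("#000000,#FFFFFF"))
```

Sampling each channel independently keeps the sketch short; a narrower range such as '#101010,#202020' restricts the text to dark grays.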

How does it work?

Words will be randomly chosen from a dictionary of a specific language. Then an image of those words will be generated by using font, background, and modifications (skewing, blurring, etc.) as specified.
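
The word-selection step described above can be sketched in a few lines. This is only an illustration under simple assumptions (the dictionary is held as a Python list and words are drawn with repetition); pick_words and make_samples are hypothetical names, not part of trdg, and rendering the text onto an image is left out.

```python
import random


def pick_words(dictionary, word_count, rng=random):
    """Return `word_count` words drawn at random, with repetition."""
    return [rng.choice(dictionary) for _ in range(word_count)]


def make_samples(dictionary, sample_count, word_count, seed=None):
    """Build the text of each sample; drawing it is left to an image library."""
    rng = random.Random(seed)
    return [
        " ".join(pick_words(dictionary, word_count, rng))
        for _ in range(sample_count)
    ]


if __name__ == "__main__":
    words = ["alpha", "beta", "gamma", "delta"]
    for text in make_samples(words, sample_count=3, word_count=5, seed=0):
        print(text)
```

Passing a seed makes the generated texts reproducible, which can help when comparing training runs.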

Basic (Python module)

Using trdg as a Python module is very similar to using the CLI, but it is more flexible if you want to include it directly in your training pipeline, and it consumes less disk space and memory. Four generators are available.

from trdg.generators import (
    GeneratorFromDict,
    GeneratorFromRandom,
    GeneratorFromStrings,
    GeneratorFromWikipedia,
)

# The generators use the same arguments as the CLI, only as parameters
generator = GeneratorFromStrings(
    ['Test1', 'Test2', 'Test3'],
    blur=2,
    random_blur=True
)

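for img, lbl in generator:
    # Do something with the Pillow image (img) and its label (lbl) here.
    # The generator may be unbounded, so stop explicitly when you are done.
    break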

The full class definition is available in the repository.

Basic (CLI)

trdg -c 1000 -w 5 -f 64

You get 1,000 randomly generated images with random text on them like:

[Sample images 1 to 5]

By default, they will be generated to out/ in the current working directory.
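
If you launch the CLI from a script, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README (-c, -w, -f); build_trdg_command is a hypothetical helper, and the parameter names are my own labels for those flags.

```python
import shlex


def build_trdg_command(count, length, fmt, extra_args=()):
    """Assemble a trdg command line as a list of arguments.

    count  -> -c (number of images)
    length -> -w (number of words per image)
    fmt    -> -f (value passed to -f, 64 in the example above)
    """
    args = ["trdg", "-c", str(count), "-w", str(length), "-f", str(fmt)]
    args.extend(extra_args)
    return args


if __name__ == "__main__":
    cmd = build_trdg_command(1000, 5, 64)
    print(shlex.join(cmd))
    # To actually run it (requires trdg to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell-quoting problems when an argument contains spaces or characters such as '#'.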

Text skewing

What if you want random skewing? Add -k and -rk (trdg -c 1000 -w 5 -f 64 -k 5 -rk)

[Sample images 6 to 10]

Text distortion

You can also add distortion to the generated text with -d and -do.

[Sample images 23 to 25]

Text blurring

But scanned documents usually aren't that clear, are they? Add -bl and -rbl to apply a Gaussian blur with a user-defined radius to the generated image (here 0, 1, 2, 4):

[Sample images 11 to 14]

Background

Maybe you want another background? Add -b to select one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2), or image (3).

[Sample images 15, 16, 17, and 23]

When using an image background (3), an image from the images/ folder is randomly selected and the text is written on it.
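
The random selection of a background picture can be pictured as follows. This is a sketch of the idea, not the tool's own code: pick_background is a hypothetical helper, and the set of accepted file extensions is an assumption.

```python
import random
from pathlib import Path

# Assumed set of accepted image formats; adjust to your own files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pick_background(images_dir, rng=random):
    """Return the path of one randomly chosen image file in `images_dir`."""
    candidates = sorted(
        p for p in Path(images_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not candidates:
        raise FileNotFoundError(f"no background images found in {images_dir}")
    return rng.choice(candidates)


if __name__ == "__main__":
    folder = Path("images")
    if folder.is_dir():
        print(pick_background(folder))
```

Sorting the candidates first makes the choice reproducible when a seeded random.Random instance is passed as rng.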

Handwritten

Or maybe you are working on an OCR for handwritten text? Add -hw! (Experimental)

[Sample images 18 to 22]

It uses a TensorFlow model trained using this excellent project by Grzego.

The project does not require TensorFlow to run if you aren't using this feature.

Dictionary

The text is chosen at random from a dictionary file (found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg.
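
Because the label is stored in the file name, it can be recovered when loading the images for training. A minimal sketch, assuming the [text]_[index].jpg pattern described above; parse_sample_name is a hypothetical helper, not part of trdg.

```python
from pathlib import Path


def parse_sample_name(path):
    """Split a '[text]_[index].jpg' file name into (text, index).

    The split happens at the last underscore, so the text itself may
    contain underscores.
    """
    stem = Path(path).stem
    text, _, index = stem.rpartition("_")
    if not text or not index.isdigit():
        raise ValueError(f"unexpected file name: {path}")
    return text, int(index)


if __name__ == "__main__":
    print(parse_sample_name("out/hello world_12.jpg"))
```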

There are a lot of parameters that you can tune to get the results you want, so I recommend checking out trdg -h for more information.

Create images with Chinese text

It is simple! Just do trdg -l cn -c 1000 -w 5!

Generated texts come in both Simplified and Traditional Chinese scripts.

Traditional:

[Sample image 27]

Simplified:

[Sample image 28]

Add new fonts

The script picks a font at random from the fonts directory.

Directory      Languages
fonts/latin    English, French, Spanish, German
fonts/cn       Chinese
fonts/ko       Korean

Simply add/remove fonts until you get the desired output.

If you want to add a new non-Latin language, the amount of work is minimal.

  1. Create a new folder named with your language's two-letter code
  2. Add a .ttf font in it
  3. Edit run.py to add an if statement in load_fonts()
  4. Add a text file in dicts with the same two-letter code
  5. Run the tool as you normally would, but add -l with your two-letter code

It only supports .ttf for now.
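
Before running the tool with a new language, the file layout from the steps above can be checked with a short script. This is a sketch under assumptions: the font folder is taken to live under fonts/ (as in the table above), the dictionary is assumed to be named <code>.txt, and check_language_setup is a hypothetical helper. It does not verify the load_fonts() edit from step 3.

```python
from pathlib import Path


def check_language_setup(root, code):
    """Return a list of problems found for language `code` under `root`."""
    root = Path(root)
    problems = []

    # Steps 1 and 2: a folder named after the language code with a .ttf font.
    font_dir = root / "fonts" / code
    if not font_dir.is_dir():
        problems.append(f"missing font folder: {font_dir}")
    elif not any(p.suffix.lower() == ".ttf" for p in font_dir.iterdir()):
        problems.append(f"no .ttf font in {font_dir}")

    # Step 4: a dictionary file with the same code (file name is assumed).
    dict_file = root / "dicts" / f"{code}.txt"
    if not dict_file.is_file():
        problems.append(f"missing dictionary: {dict_file}")

    return problems


if __name__ == "__main__":
    for problem in check_language_setup(".", "xx"):
        print(problem)
```

An empty list means the files are where this sketch expects them.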

Benchmarks

Number of images generated per second.

  • Intel Core i7-4710HQ @ 2.50GHz + SSD (-c 1000 -w 1)
    • -t 1 : 363 img/s
    • -t 2 : 694 img/s
    • -t 4 : 1300 img/s
    • -t 8 : 1500 img/s
  • AMD Ryzen 7 1700 @ 4.0GHz + SSD (-c 1000 -w 1)
    • -t 1 : 558 img/s
    • -t 2 : 1045 img/s
    • -t 4 : 2107 img/s
    • -t 8 : 3297 img/s
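
To read the figures above as scaling numbers, divide each rate by the single-thread rate (speedup) and then by the thread count (parallel efficiency). The short script below does that arithmetic on the values listed; scaling is a hypothetical helper name. With 8 threads it gives a speedup of roughly 4.1x on the Intel machine and roughly 5.9x on the AMD machine.

```python
# Images per second, copied from the benchmark list above.
BENCHMARKS = {
    "Intel Core i7-4710HQ": {1: 363, 2: 694, 4: 1300, 8: 1500},
    "AMD Ryzen 7 1700": {1: 558, 2: 1045, 4: 2107, 8: 3297},
}


def scaling(rates):
    """Map thread count -> (speedup vs. one thread, parallel efficiency)."""
    base = rates[1]
    return {t: (r / base, r / base / t) for t, r in rates.items()}


if __name__ == "__main__":
    for cpu, rates in BENCHMARKS.items():
        print(cpu)
        for threads, (speedup, efficiency) in scaling(rates).items():
            print(f"  -t {threads}: {speedup:.2f}x speedup, "
                  f"{efficiency:.0%} efficiency")
```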

Contributing

  1. Create an issue describing the feature you'll be working on
  2. Code said feature
  3. Create a pull request

Feature request & issues

If anything is missing, unclear, or simply not working, open an issue on the repository.

What is left to do?

  • Better background generation
  • Better handwritten text generation
  • More customization parameters (mostly regarding background)
Comments
  • I got this error, can anyone help me?

    1. Error:

    /data/20180809/TextRecognitionDataGenerator-master/TextRecognitionDataGenerator# python run.py -i "texts/subtitle.txt" -c 100 -w 5 -e png -b 3
    Missing modules for handwritten text generation.
     31%|#####################################2 | 31/100 [00:00<00:01, 68.48it/s]
    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
        result = (True, func(*args, **kwds))
      File "/data/20180809/TextRecognitionDataGenerator-master/TextRecognitionDataGenerator/data_generator.py", line 22, in generate_from_tuple
        cls.generate(*t)
      File "/data/20180809/TextRecognitionDataGenerator-master/TextRecognitionDataGenerator/data_generator.py", line 34, in generate
        image = ComputerTextGenerator.generate(text, font, text_color)
      File "/data/20180809/TextRecognitionDataGenerator-master/TextRecognitionDataGenerator/computer_text_generator.py", line 12, in generate
        image_font = ImageFont.truetype(font=font, size=32)
      File "/data/20180809/TextRecognitionDataGenerator-master/py3env/lib/python3.4/site-packages/PIL/ImageFont.py", line 261, in truetype
        return FreeTypeFont(font, size, index, encoding, layout_engine)
      File "/data/20180809/TextRecognitionDataGenerator-master/py3env/lib/python3.4/site-packages/PIL/ImageFont.py", line 144, in __init__
        self.font = core.getfont(font, size, index, encoding, layout_engine=layout_engine)
    OSError: unknown file format
    """

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "run.py", line 290, in <module>
        main()
      File "run.py", line 278, in main
        ), total=args.count):
      File "/data/20180809/TextRecognitionDataGenerator-master/py3env/lib/python3.4/site-packages/tqdm/_tqdm.py", line 930, in __iter__
        for obj in iterable:
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 689, in next
        raise value
    OSError: unknown file format

    2. When I use my own background picture, I get a blurry image, but I want a clear one. [two images] Why do they have different widths? (PS: I use my own texts.) Looking forward to hearing from you, @Belval.

    opened by xieyufei1993 13
  • Generating Images similar to Oxford Synthetic Word Dataset

    Hi, I am trying to generate images containing single words, similar to those in the Oxford Synthetic Word Dataset. The words will also contain symbols such as colons, percent signs, etc. The process used to create the Oxford dataset is described in the image below. [image: process]

    I am unsure how to generate such words along with symbols, as the images I get (below) are very different from the ones in the Oxford dataset. [images: newsynth1, newsynth4]

    Images from the Oxford dataset are given below. [images: synth1, synth2]

    opened by kevgeo 12
  • How to generate vertical images

    For example, we usually read words and characters from left to right, but in Chinese, characters are sometimes arranged from top to bottom. So I wonder: can this code generate Chinese sentences laid out from top to bottom?

    opened by savort 12
  • Add -obb parameter to output bounding boxes

    Follow-up to #107, when using the -obb 1 parameter, the CLI tool will output a txt file with one line per character with the matching bounding boxes.

    The current format is x1 y1 x2 y2, but it could be changed to fit @yyyash8's use case (or better yet, let's make a parameter for it).

    Still need to add tests.

    [images: TEST1_0, out]

    4 12 16 27
    17 12 26 27
    28 12 38 27
    38 12 50 27
    51 12 57 27
    

    Code used to render the bounding boxes:

    import sys
    from PIL import Image, ImageDraw

    def main(argv):
        # argv[1]: generated image, argv[2]: bounding-box text file
        img = Image.open(argv[1])
        bboxes = []
        with open(argv[2], "r") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                # Each line holds "x1 y1 x2 y2" for one character
                bboxes.append([int(coord) for coord in line.split()])

        d = ImageDraw.Draw(img)

        # Outline every character box in green
        for bbox in bboxes:
            d.rectangle(bbox, outline="green")

        img.save("out.png")

    if __name__ == '__main__':
        main(sys.argv)
    
    opened by Belval 10
  • Generate training pictures for Tibetan ocr

    The result of generating OCR training pictures for Tibetan is shown below, with extra spaces between the characters. There should be no spaces in the text of the picture! I have set --space_width=0, so what is the reason for this? [image: 9]

    opened by hsyy673150343 10
  • Arabic text generator

    Hi,

    File names generated by the Arabic version of the repo are correct, as the letters of each word are connected. However, the text in the images has disconnected letters, and the words run from left to right. The text in an image should run from right to left, and the letters must be connected. Any suggestions on how to correct these issues?

    Thanks

    opened by niddal-imam 10
  • Chinese font problem

    Some Chinese fonts cannot generate good samples (for example, some words could not be generated). Do you have any suggestions for solving the problem? Thank you in advance.

    opened by DLUTfangping 10
  • Get custom color font with black border

    Hi

    I am trying to get this kind of font with white solid colour and black border. Link: https://drive.google.com/file/d/1dwyxDBzRDmQcjU6PL8zc924l4kFDdlKB/view?usp=sharing

    I tried many fonts and what I always get is Link: https://drive.google.com/file/d/1cJXyxB7qS6bOuPpmh3A-UDMShBLLQIG6/view?usp=sharing

    Please tell me how I can do this.

    Thanks

    opened by iknoorjobs 8
  • Regex gen, images_dir, image augmentation and more....

    • generate strings with regular expressions
    • specify images dir for background generation
    • augment images with imgaug library
    • grayscale param for image grayscaling
    • minl, maxl of strings
    • fonts folder

    Everything should be in readme.

    opened by Cospel 7
  • I got this error, can anyone help me, please?

    Here is the error:

    python run.py -w 5 -f 64 -l am
      0%| | 0/1000 [00:00<?, ?it/s]
    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/multiprocessing/pool.py", line 119, in worker
        result = (True, func(*args, **kwds))
      File "/home/test/Documents/direse/scene/TextRecognitionDataGenerator/TextRecognitionDataGenerator/data_generator.py", line 22, in generate_from_tuple
        cls.generate(*t)
      File "/home/test/Documents/direse/scene/TextRecognitionDataGenerator/TextRecognitionDataGenerator/data_generator.py", line 36, in generate
        image = ComputerTextGenerator.generate(text, font, text_color, size, orientation, space_width)
      File "/home/test/Documents/direse/scene/TextRecognitionDataGenerator/TextRecognitionDataGenerator/computer_text_generator.py", line 9, in generate
        return cls.__generate_horizontal_text(text, font, text_color, font_size, space_width)
      File "/home/test/Documents/direse/scene/TextRecognitionDataGenerator/TextRecognitionDataGenerator/computer_text_generator.py", line 17, in __generate_horizontal_text
        image_font = ImageFont.truetype(font=font, size=font_size)
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/PIL/ImageFont.py", line 261, in truetype
        return FreeTypeFont(font, size, index, encoding, layout_engine)
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/PIL/ImageFont.py", line 144, in __init__
        self.font = core.getfont(font, size, index, encoding, layout_engine=layout_engine)
    OSError: unknown file format
    """

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "run.py", line 342, in <module>
        main()
      File "run.py", line 330, in main
        ), total=args.count):
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/tqdm/_tqdm.py", line 930, in __iter__
        for obj in iterable:
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/multiprocessing/pool.py", line 731, in next
        raise value
    OSError: unknown file format
    Exception ignored in: <bound method tqdm.__del__ of 0%| | 0/1000 [00:00<?, ?it/s]>
    Traceback (most recent call last):
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/tqdm/_tqdm.py", line 882, in __del__
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/tqdm/_tqdm.py", line 1087, in close
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/site-packages/tqdm/_tqdm.py", line 439, in _decr_instances
      File "/home/test/Anaconda3/envs/py35/lib/python3.5/_weakrefset.py", line 109, in remove
    KeyError: <weakref at 0x7f686b4df598; to 'tqdm' at 0x7f686b53cd30>

    opened by direselign 7
  • Handle `piece_width` for specific language (TH)

    Some languages (in this case Thai) contain vowels and tone marks that are not placed horizontally; they are placed above or below consonants. This pull request handles those characters.

    This is an example of before and after results.

    Before: [image: before]

    After: [image: after]

    opened by luangtatipsy 6
  • unable to generate korean handwritten samples

    Hi @Belval I am working on generating korean handwritten samples for training dataset. Still, I am unable to achieve it.

    Does anyone have an idea?

    trdg -l ko -c 1000 -w 5 -hw
    python3 run.py -i ko.txt -l ko -c 1000 -w 5 -fd fonts/ko -hw
    
    opened by khawar-islam 0
  • fixed utf encoding issue

    This fixes an issue when generating data for other languages (like Arabic).

    Reproduce the issue with:

    trdg --count 1000 \
        --length 10 \
        --format 64 \
        --output_dir trdg_data/ids \
        --language ar --font_dir TextRecognitionDataGenerator/trdg/fonts/ar/ \
        --skew_angle 15 --random_skew \
        --random \
        --thread_count 8 \
        --distorsion 3 --distorsion_orientation 2 \
        --background 2 \
        --random_blur \
        --output_bboxes 2
    

    This is now fixed with the encoding='utf8' lines.

    opened by FarisHijazi 0
  • Arabic Alphabet shapes

    First of all, I would like to thank you for this awesome project. When arabic_reshaper is used to fix the Arabic letter shapes while generating text, it introduces another issue: arabic_reshaper produces new shapes by combining characters. For example, the two characters (لا), once reshaped, become the single character (ﻻ). They look the same on screen, but the first, correct form is two characters, while the reshaped form is a single Unicode character, so they are not actually the same. I suggest solving this small issue by reshaping only the Arabic text that is drawn on the image, and leaving the text written to the labels.txt file unreshaped.

    opened by Mahmuod1 5
  • Unicode symbols and invalid images

    Colab code:

    !mkdir /usr/local/lib/python3.7/dist-packages/trdg/fonts/kzdigits
    !cp /content/fonts/. /usr/local/lib/python3.7/dist-packages/trdg/fonts/kzdigits/

    e.g. Verdana.ttf

    generator = GeneratorFromRandom(
        blur=0,
        random_blur=False,
        random_skew=True,
        language='kzdigits'
    )

    print(generator.generator.fonts) #Verdana.ttf

    Sometimes the generator produces a valid image: [image]

    Sometimes not: [image]

    The ₸ sign is not displayed.

    bug 
    opened by avber 2
Releases(v1.6.0)
Owner
Edouard Belval
Software engineering student & machine learning enthusiast, I mostly work with Python, C#, and C/C++.