Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Overview

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop)

(Pronounced as "strog")

Paper

Arxiv

Why it matters?

Scene Text Recognition (STR) requires data augmentation functions that are different from object recognition. STRAug is data augmentation designed for STR. It offers 36 data augmentation functions that are sorted into 8 groups. Each function supports 3 levels or magnitudes of severity or intensity.

Given a source image:

it can be transformed as follows:

  1. warp.py - to generate Curve, Distort, Stretch (or Elastic) deformations
Curve Distort Stretch
  1. geometry.py - to generate Perspective, Rotation, Shrink deformations
Perspective Rotation Shrink
  1. pattern.py - to create different grids: Grid, VGrid, HGrid, RectGrid, EllipseGrid
Grid VGrid HGrid RectGrid EllipseGrid
  1. blur.py - to generate synthetic blur: GaussianBlur, DefocusBlur, MotionBlur, GlassBlur, ZoomBlur
GaussianBlur DefocusBlur MotionBlur GlassBlur ZoomBlur
  1. noise.py - to add noise: GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise
GaussianNoise ShotNoise ImpulseNoise SpeckleNoise
  1. weather.py - to simulate certain weather conditions: Fog, Snow, Frost, Rain, Shadow
Fog Snow Frost Rain Shadow
  1. camera.py - to simulate camera sensor tuning and image compression/resizing: Contrast, Brightness, JpegCompression, Pixelate
Contrast Brightness JpegCompression Pixelate
  1. process.py - all other image processing issues: Posterize, Solarize, Invert, Equalize, AutoContrast, Sharpness, Color
Posterize Solarize Invert Equalize
AutoContrast Sharpness Color

Pip install

pip3 install straug

How to use

Command line (e.g. input image is nokia.png):

>>> from straug.warp import Curve
>>> from PIL import Image
>>> img = Image.open("nokia.png")
>>> img = Curve()(img, mag=3)
>>> img.save("curved_nokia.png")

Python script (see test.py):

python3 test.py --image=<target image>

For example:

python3 test.py --image=images/telekom.png

The corrupted images are in results directory.

Reference

  • Image corruptions (eg blur, noise, camera effects, fog, frost, etc) are based on the work of Hendrycks et al.

Citation

If you find this work useful, please cite:

@inproceedings{atienza2021data,
  title={Data Augmentation for Scene Text Recognition},
  author={Atienza, Rowel},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pubstate={published},
  tppubtype={inproceedings}
}
Owner
Rowel Atienza
Rowel Atienza
[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

SurRoL IROS 2021 SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning Features dVRK compati

<a href=[email protected]"> 55 Jan 03, 2023
An Efficient Implementation of Analytic Mesh Algorithm for 3D Iso-surface Extraction from Neural Networks

AnalyticMesh Analytic Marching is an exact meshing solution from neural networks. Compared to standard methods, it completely avoids geometric and top

Karbo 45 Dec 21, 2022
Tutorial: Introduction to Graph Machine Learning, with Jupyter notebooks

GraphMLTutorialNLDL22 Tutorial NLDL22: Introduction to Graph Machine Learning, with Jupyter notebooks This tutorial takes place during the conference

UiT Machine Learning Group 3 Jan 10, 2022
TextureGAN in Pytorch

TextureGAN This code is our PyTorch implementation of TextureGAN [Project] [Arxiv] TextureGAN is a generative adversarial network conditioned on sketc

Patsorn 147 Dec 14, 2022
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

DM2 Lab @ ND 61 Dec 30, 2022
Add-on for importing and auto setup of character creator 3 character exports.

CC3 Blender Tools An add-on for importing and automatically setting up materials for Character Creator 3 character exports. Using Blender in the Chara

260 Jan 05, 2023
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
Data cleaning, missing value handle, EDA use in this project

Lending Club Case Study Project Brief Solving this assignment will give you an idea about how real business problems are solved using EDA. In this cas

Dhruvil Sheth 1 Jan 05, 2022
[ICCV'21] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery

PlaneTR: Structure-Guided Transformers for 3D Plane Recovery This is the official implementation of our ICCV 2021 paper News There maybe some bugs in

73 Nov 30, 2022
Preprossing-loan-data-with-NumPy - In this project, I have cleaned and pre-processed the loan data that belongs to an affiliate bank based in the United States.

Preprossing-loan-data-with-NumPy In this project, I have cleaned and pre-processed the loan data that belongs to an affiliate bank based in the United

Dhawal Chitnavis 2 Jan 03, 2022
Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Bidirectional Projection Network for Cross Dimension Scene Understanding CVPR 2021 (Oral) [ Project Webpage ] [ arXiv ] [ Video ] Existing segmentatio

Hu Wenbo 135 Dec 26, 2022
pytorch implementation for PointNet

PointNet.pytorch This repo is implementation for PointNet in pytorch. The model is in pointnet/model.py. It is teste

Fei Xia 1.7k Dec 30, 2022
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

STARS Laboratory 8 Sep 14, 2022
ICRA 2021 - Robust Place Recognition using an Imaging Lidar

Robust Place Recognition using an Imaging Lidar A place recognition package using high-resolution imaging lidar. For best performance, a lidar equippe

Tixiao Shan 293 Dec 27, 2022
Seq2seq - Sequence to Sequence Learning with Keras

Seq2seq Sequence to Sequence Learning with Keras Hi! You have just found Seq2Seq. Seq2Seq is a sequence to sequence learning add-on for the python dee

Fariz Rahman 3.1k Dec 18, 2022
RRL: Resnet as representation for Reinforcement Learning

Resnet as representation for Reinforcement Learning (RRL) is a simple yet effective approach for training behaviors directly from visual inputs. We demonstrate that features learned by standard image

Meta Research 21 Dec 07, 2022
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

47 Dec 28, 2022
Template repository for managing machine learning research projects built with PyTorch-Lightning

Tutorial Repository with a minimal example for showing how to deploy training across various compute infrastructure.

Sidd Karamcheti 3 Feb 11, 2022
Do Neural Networks for Segmentation Understand Insideness?

This is part of the code to reproduce the results of the paper Do Neural Networks for Segmentation Understand Insideness? [pdf] by K. Villalobos (*),

biolins 0 Mar 20, 2021