Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Overview

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop)

(Pronounced as "strog")

Paper

Arxiv

Why it matters?

Scene Text Recognition (STR) requires data augmentation functions that are different from object recognition. STRAug is data augmentation designed for STR. It offers 36 data augmentation functions that are sorted into 8 groups. Each function supports 3 levels or magnitudes of severity or intensity.

Given a source image:

it can be transformed as follows:

  1. warp.py - to generate Curve, Distort, Stretch (or Elastic) deformations
Curve Distort Stretch
  1. geometry.py - to generate Perspective, Rotation, Shrink deformations
Perspective Rotation Shrink
  1. pattern.py - to create different grids: Grid, VGrid, HGrid, RectGrid, EllipseGrid
Grid VGrid HGrid RectGrid EllipseGrid
  1. blur.py - to generate synthetic blur: GaussianBlur, DefocusBlur, MotionBlur, GlassBlur, ZoomBlur
GaussianBlur DefocusBlur MotionBlur GlassBlur ZoomBlur
  1. noise.py - to add noise: GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise
GaussianNoise ShotNoise ImpulseNoise SpeckleNoise
  1. weather.py - to simulate certain weather conditions: Fog, Snow, Frost, Rain, Shadow
Fog Snow Frost Rain Shadow
  1. camera.py - to simulate camera sensor tuning and image compression/resizing: Contrast, Brightness, JpegCompression, Pixelate
Contrast Brightness JpegCompression Pixelate
  1. process.py - all other image processing issues: Posterize, Solarize, Invert, Equalize, AutoContrast, Sharpness, Color
Posterize Solarize Invert Equalize
AutoContrast Sharpness Color

Pip install

pip3 install straug

How to use

Command line (e.g. input image is nokia.png):

>>> from straug.warp import Curve
>>> from PIL import Image
>>> img = Image.open("nokia.png")
>>> img = Curve()(img, mag=3)
>>> img.save("curved_nokia.png")

Python script (see test.py):

python3 test.py --image=<target image>

For example:

python3 test.py --image=images/telekom.png

The corrupted images are in results directory.

Reference

  • Image corruptions (eg blur, noise, camera effects, fog, frost, etc) are based on the work of Hendrycks et al.

Citation

If you find this work useful, please cite:

@inproceedings{atienza2021data,
  title={Data Augmentation for Scene Text Recognition},
  author={Atienza, Rowel},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pubstate={published},
  tppubtype={inproceedings}
}
Owner
Rowel Atienza
Rowel Atienza
Code for our ALiBi method for transformer language models.

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation This repository contains the code and models for our paper Tra

Ofir Press 211 Dec 31, 2022
TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1).

M1-tensorflow-benchmark TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1). I was initially testing if Tens

particle 2 Jan 05, 2022
Code for the RA-L (ICRA) 2021 paper "SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition"

SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition [ArXiv+Supplementary] [IEEE Xplore RA-L 2021] [ICRA 2021 YouTube Video]

Sourav Garg 63 Dec 12, 2022
Build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF

Semantic-NeRF: Semantic Neural Radiance Fields Project Page | Video | Paper | Data In-Place Scene Labelling and Understanding with Implicit Scene Repr

Shuaifeng Zhi 243 Jan 07, 2023
🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series

🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series (optical and radar) The PASTIS Dataset Dataset presentation PASTIS is a benchmark dataset for

86 Jan 04, 2023
Implementation of the final project of the course DDA6309 Probabilistic Graphical Model

Task-aware Joint CWS and POS (TCwsPos) This is the implementation of the final project of the course DDA6309 Probabilistic Graphical Models, The Chine

Peng 1 Dec 26, 2021
CRNN With PyTorch

CRNN-PyTorch Implementation of https://arxiv.org/abs/1507.05717

Vadim 4 Sep 01, 2022
Related resources for our EMNLP 2021 paper

Plan-then-Generate: Controlled Data-to-Text Generation via Planning Authors: Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier Code

Yixuan Su 61 Jan 03, 2023
T-LOAM: Truncated Least Squares Lidar-only Odometry and Mapping in Real-Time

T-LOAM: Truncated Least Squares Lidar-only Odometry and Mapping in Real-Time The first Lidar-only odometry framework with high performance based on tr

Pengwei Zhou 183 Dec 01, 2022
A user-friendly research and development tool built to standardize RL competency assessment for custom agents and environments.

Built with ❤️ by Sam Showalter Contents Overview Installation Dependencies Usage Scripts Standard Execution Environment Development Environment Benchm

SRI-AIC 1 Nov 18, 2021
Jingju baseline - A baseline model of our project of Beijing opera script generation

Jingju Baseline It is a baseline of our project about Beijing opera script gener

midon 1 Jan 14, 2022
Neural Architecture Search Powered by Swarm Intelligence 🐜

Neural Architecture Search Powered by Swarm Intelligence 🐜 DeepSwarm DeepSwarm is an open-source library which uses Ant Colony Optimization to tackle

288 Oct 28, 2022
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

Wenxuan Zhou 146 Nov 29, 2022
DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment

DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment This repository is related to the paper DEEPAGÉ: Answering Questions in Por

0 Dec 10, 2021
Parameter Efficient Deep Probabilistic Forecasting

PEDPF Parameter Efficient Deep Probabilistic Forecasting (PEDPF) is a repository containing code to run experiments for several deep learning based pr

Olivier Sprangers 10 Jun 13, 2022
Cours d'Algorithmique Appliquée avec Python pour BTS SIO SISR

Course: Introduction to Applied Algorithms with Python (in French) This is the source code of the website for the Applied Algorithms with Python cours

Loic Yvonnet 0 Jan 27, 2022
This repository provides an efficient PyTorch-based library for training deep models.

s3sec Test AWS S3 buckets for read/write/delete access This tool was developed to quickly test a list of s3 buckets for public read, write and delete

Bytedance Inc. 123 Jan 05, 2023
Tensorflow implementation of "BEGAN: Boundary Equilibrium Generative Adversarial Networks"

BEGAN in Tensorflow Tensorflow implementation of BEGAN: Boundary Equilibrium Generative Adversarial Networks. Requirements Python 2.7 or 3.x Pillow tq

Taehoon Kim 922 Dec 21, 2022
pytorch implementation of trDesign

trdesign-pytorch This repository is a PyTorch implementation of the trDesign paper based on the official TensorFlow implementation. The initial port o

Learn Ventures Inc. 41 Dec 29, 2022
Manipulation OpenAI Gym environments to simulate robots at the STARS lab

Manipulator Learning This repository contains a set of manipulation environments that are compatible with OpenAI Gym and simulated in pybullet. In par

STARS Laboratory 5 Dec 08, 2022