Links to awesome OCR projects

Overview

Awesome OCR

Awesome

This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR).

Contributions are welcome, as is feedback.

Software

OCR engines

  • tesseract - The definitive Open Source OCR engine Apache 2.0
  • EasyOCR - OCR engine built on PyTorch by JaidedAI, Apache 2.0
  • ocropus - OCR engine based on LSTM, Apache 2.0
  • ocropus 0.4 - Older v0.4 state of Ocropus, with tesseract 2.04 and iulib, C++
  • kraken - Ocropus fork with sane defaults
  • gocr - OCR engine under the GNU Public License led by Joerg Schulenburg.
  • Ocrad - The GNU OCR. GPL
  • ocular - Machine-learning OCR for historic documents
  • SwiftOCR - fast and simple OCR library written in Swift
  • attention-ocr - OCR engine using visual attention mechanisms
  • RWTH-OCR - The RWTH Aachen University Optical Character Recognition System
  • simple-ocr-opencv and its fork - A simple pythonic OCR engine using opencv and numpy
  • Calamari - OCR Engine based on OCRopy and Kraken

Older and possibly abandoned OCR engines

  • Clara OCR - Open source OCR in C GPL
  • Cuneiform - CuneiForm OCR was developed by Cognitive Technologies
  • Eye - an experimental Java OCR (image-to-text) application
  • kognition - An omnifont OCR software for KDE
  • OCRchie - Modular Optical Character Recognition Software
  • ocre - o.c.r. easy
  • xplab - A GTK 2 tool for pattern matching
  • hebOCR - Hebrew character recognition library (previously named hocr, see Wikipedia article) GPL

OCR file formats

hOCR

  • hocr-tools - Tools for doing various useful things with hOCR files, Apache 2.0
  • hocr-spec - hOCR 1.2 specification
  • ocr-transform - CLI tool to convert between hOCR and ALTO, MIT
  • hocr-parser - hOCR Specification Python Parser
  • hOCRTools - hOCR to ALTO conversion XSLT

ALTO XML

TEI

  • TEI-OCR - TEI customization for OCR generated layout and content information
  • TEI SIG on Libraries - Best Practices for TEI in Libraries
  • GDZ - METS/TEI-based GDZ document format

PAGE XML

  • PAGE-XML Schema - XML schema of the PAGE XML format along with documentation and examples
  • omni:us Pages Format (OPF) - XML schema very similar to PAGE XML that has some additional features.
  • py-pagexml - Python library for handling PAGE XML and OPF files.

OCR CLI

  • OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
  • Pdf2PdfOCR - A tool to OCR a PDF (or supported images) and add a text "layer" (a "pdf sandwich") in the original file making it a searchable PDF. GUI included. Tesseract and cuneiform supported.
  • Ocrocis - Project manager interface for Ocropy, see also external project homepage
  • tesseract-recognize - Tesseract-based tool that outputs result in Page XML format (docker image).

OCR GUI

  • moz-hocr-editor - Firefox Addon for editing hOCR files Discontinued
  • qt-box-editor - QT4 editor of tesseract-ocr box files.
  • ocr-gt-tools - Client-Server application for editing OCR ground truth.
  • Paperwork - Using scanners and OCR to grep paper documents the easy way.
  • Paperless - Scan, index, and archive all of your paper documents.
  • gImageReader - gImageReader is a simple Gtk/Qt front-end to tesseract-ocr.
  • VietOCR - A Java/.NET GUI frontend for Tesseract OCR engine, including jTessBoxEditor a graphical Tesseract box data editor
  • PoCoTo - Fast interactive batch corrections of complete OCR error series in OCR'ed historical documents.
  • OCRFeeder - GTK graphical user interface that allows the users to correct characters or bounding boxes, ODT export and more.
  • PRImA PAGE Viewer - Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR.
  • LAREX - A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
  • archiscribe - Web application for transcribing OCR ground truth from Archive.org. Deployed instance available at https://archiscribe.jbaiter.de/, results are available in @jbaiter/archiscribe-corpus.
  • nw-page-editor - Simple app for visual editing of Page XML files. Provides desktop and server docker-based versions.

OCR Preprocessing

OCR as a Service

OCR evaluation

OCR libraries by programming language

Go

  • gosseract - Golang OCR library, wrapping Tesseract-ocr.

Java

  • Tess4J - Java Native Access bindings to Tesseract.
  • tess-two - Tools for compiling Tesseract on Android and Java API.

.Net

Object Pascal

PHP

Python

  • pytesseract - A Python wrapper for Google Tesseract.
  • pyocr - A Python wrapper for Tesseract and Cuneiform.
  • ocrodjvu - A library and standalone tool for doing OCR on DjVu documents, wrapping Cuneiform, gocr, ocrad, ocropus and tesseract
  • tesserocr - A Python wrapper for the tesseract-ocr API

Javascript

  • ocracy - pure javascript lstm rnn implementation based on ocropus
  • gocr.js - Javascript port (emscripten) of gocr
  • ocrad.js - Javascript port (emscripten) of ocrad
  • tesseract.js - Javascript port (emscripten) of Tesseract
  • node-tesseract-ocr - A simple wrapper for the Tesseract OCR package.
  • node-tesseract-native - C++ module for node providing OCR with tesseract and leptonica.

Ruby

  • rtesseract - Ruby library wrapping the tesseract and imagemagick executables.
  • ruby-tesseract - Native Tesseract bindings for Ruby MRI and JRuby
  • ocr_space - API wrapper for free ocr service ocr.space. Includes CLI

Rust

  • tesseract.rs - Rust bindings for tesseract OCR.
  • leptess - Productive and safe Rust bindings/wrappers for tesseract and leptonica.

R

Swift

  • Tesseract OCR iOS - Swift and Objective-C wrapper for Tesseract OCR.
  • SwiftOCR - Fast and simple OCR library written in Swift. Optimized for recognizing short, one line long alphanumeric codes.

OCR training tools

  • glyph-miner - A system for extracting glyphs from early typeset prints
  • ocrodeg - Document image degradation for OCR data augmentation

Datasets

Ground Truth

  • Rescribe - Transcriptions of Caroline Minuscule Manuscripts PDM 1.0

Literature

OCR-related publication and link lists

Blog Posts and Tutorials

OCR Showcases

  • abbyy-finereader-ocr-senate - Using OCR to parse scanned Senate Financial Disclosure forms.
  • cvOCR - An OCR system for recognizing resume or cv text, implemented in Python and C and based on tesseract
  • MathOCR - A printed scientific document recognition system, pre-alpha

Academic articles

2011 and before

2012

2013

2014

2015

2016

2017

2018

Owner
Konstantin Baierer
Ⓐ ಥ_ಥ (╯°□°)╯︵ ┻━┻ ★。・:*¯\_(ツ)_/¯*:・゚★
Konstantin Baierer
Pytorch implementation of PSEnet with Pyramid Attention Network as feature extractor

Scene Text-Spotting based on PSEnet+CRNN Pytorch implementation of an end to end Text-Spotter with a PSEnet text detector and CRNN text recognizer. We

azhar shaikh 62 Oct 10, 2022
Dirty, ugly, and hopefully useful OCR of Facebook Papers docs released by Gizmodo

Quick and Dirty OCR of Facebook Papers Gizmodo has been working through the Facebook Papers and releasing the docs that they process and review. As lu

Bill Fitzgerald 2 Oct 28, 2021
Using python libraries to track hands

Python-HandTracking Using python libraries to track hands on a camera Uses cv2 and mediapipe libraries custom hand tracking module PyCharm IDE Final E

Martin Matsudaira 1 Dec 17, 2021
Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

EAST_ICPR: EAST for ICPR MTWI 2018 CHALLENGE Introduction This is a repository forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE. Origin Reposi

Haozheng Li 157 Aug 23, 2022
This is a GUI program which consist of 4 OpenCV projects

Tkinter-OpenCV Project Using Tkinter, Opencv, Mediapipe This is a python GUI program using Tkinter which consist of 4 OpenCV projects 1. Finger Counte

Arya Bagde 3 Feb 22, 2022
This can be use to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

Ashutosh Mahapatra 2 Feb 06, 2022
Document Layout Analysis Projects

Layout_Analysis Introduction This is an implementation of RLSA and X-Y Cut with OpenCV Dependencies OpenCV 3.0+ How to use Compile with g++ : g++ -std

22 Dec 08, 2022
Some codes from PyImageSearch course's and external projects.

👨‍💻 Some codes and projects 👨‍💻 💡 Technologies 📜 Projects 📍 Chrome Dinosaur Controller 📦 Script 📍 Coins Counter 📦 Script 🤓 Author Lucas Biv

Lucas Bivar 25 Oct 24, 2021
This tool will help you convert your text to handwriting xD

So your teacher asked you to upload written assignments? Hate writing assigments? This tool will help you convert your text to handwriting xD

Saurabh Daware 4.2k Jan 07, 2023
A real-time dolly zoom camera effect

Dolly-Zoom I've always been amazed by the gradual perspective change of dolly zoom, and I have some experience in python and OpenCV, so I decided to c

Dylan Kai Lau 52 Dec 08, 2022
Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

LayoutAnalysisEvaluator Layout Analysis Evaluator for: ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records ICD

17 Dec 08, 2022
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

Ankush Gupta 1.8k Dec 28, 2022
This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis"

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image

Meta Research 840 Dec 26, 2022
color detection using python

colordetection color detection using python In this color detection Python project, we are going to build an application through which you can automat

Ruchith Kumar 1 Nov 04, 2021
Controlling Volume by Hand Gestures

This program allows the user to control the volume of their device with specific hand gestures involving their thumb and index finger!

Riddhi Bajaj 1 Nov 11, 2021
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics 21 Dec 08, 2021
Toolbox for OCR post-correction

Ochre Ochre is a toolbox for OCR post-correction. Please note that this software is experimental and very much a work in progress! Overview of OCR pos

National Library of the Netherlands / Research 117 Nov 10, 2022
End-to-end pipeline for real-time scene text detection and recognition.

Real-time-Scene-Text-Detection-and-Recognition-System End-to-end pipeline for real-time scene text detection and recognition. The detection model use

Fangneng Zhan 89 Aug 04, 2022
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
Official code for :rocket: Unsupervised Change Detection of Extreme Events Using ML On-Board :rocket:

RaVAEn The RaVÆn system We introduce the RaVÆn system, a lightweight, unsupervised approach for change detection in satellite data based on Variationa

SpaceML 35 Jan 05, 2023