An interactive document scanner built in Python using OpenCV

Last update: Feb 12, 2022

Document Scanner

The scanner takes a poorly scanned image, finds the corners of the document, applies a perspective transformation to get a top-down view of the document, sharpens the result, and applies an adaptive color threshold to clean it up.
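To make the perspective step more concrete, here is a minimal pure-Python sketch of the bookkeeping that usually precedes the warp: putting four detected corners into a consistent order and choosing the size of the top-down output. The function names `order_points` and `output_size` and the sample coordinates are assumptions for illustration, not necessarily what this project does. In an OpenCV pipeline, the ordered corners and the output size would typically be passed to `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math


def order_points(pts):
    """Return corners as [top-left, top-right, bottom-right, bottom-left]."""
    # The top-left corner has the smallest x + y, the bottom-right the largest.
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    # The top-right corner has the smallest y - x, the bottom-left the largest.
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def output_size(ordered):
    """Pick the output width and height from the longer of each pair of edges."""
    tl, tr, br, bl = ordered
    width = max(distance(br, bl), distance(tr, tl))
    height = max(distance(tr, br), distance(tl, bl))
    return int(width), int(height)


# Hypothetical corners of a slightly skewed page, in arbitrary order.
corners = [(310, 40), (30, 60), (20, 420), (330, 400)]
ordered = order_points(corners)
width, height = output_size(ordered)
print(ordered, width, height)
```

The sum and difference trick assumes the page is not rotated close to 45 degrees; a more robust ordering would be needed for strongly rotated documents.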

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project makes use of the transform and imutils modules from pyimagesearch. The UI code for the interactive mode is adapted from poly_editor.py.

You can manually click and drag the corners of the document to be perspective transformed.

The scanner can also process an entire directory of images automatically and save the output in an output directory.

Here are some examples of images before and after scanning:

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]
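The usage line above describes two mutually exclusive inputs plus an optional flag. As an illustration only, a command-line interface with that shape could be declared with the standard `argparse` module as follows. The `build_parser` helper, the help strings, and the `interactive` attribute name are assumptions, not the project's actual code.

```python
import argparse


def build_parser():
    """Build a parser that accepts exactly one of --images / --image, plus -i."""
    parser = argparse.ArgumentParser(prog="scan.py")
    # Exactly one of the two input options must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--images", help="directory of images to scan")
    group.add_argument("--image", help="path to a single image to scan")
    # -i switches on interactive corner adjustment (attribute name is assumed).
    parser.add_argument("-i", action="store_true", dest="interactive",
                        help="prompt to click and drag the document corners")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["--image", "sample_images/desk.JPG", "-i"])
print(args.image, args.images, args.interactive)
```

With a required mutually exclusive group, passing both `--images` and `--image`, or neither, makes `argparse` exit with a usage error.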

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images
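For the directory mode, the sketch below shows one way a batch run might pair each input image with an output path. It is a hedged illustration: the `plan_batch` helper, the set of accepted extensions, and the default `output` directory name are assumptions, and the actual scanning step is left out.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; the real project may accept others.
VALID_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def plan_batch(image_dir, output_dir="output"):
    """Return (source, destination) path pairs for every image in image_dir."""
    out = Path(output_dir)
    jobs = []
    for path in sorted(Path(image_dir).iterdir()):
        # Skip subdirectories and files that do not look like images.
        if path.is_file() and path.suffix.lower() in VALID_EXTENSIONS:
            jobs.append((path, out / path.name))
    return jobs


# Demonstrate on a temporary directory holding two images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("desk.JPG", "notes.txt", "receipt.png"):
        (Path(tmp) / name).touch()
    jobs = plan_batch(tmp)
    planned = [(src.name, str(dst.parent)) for src, dst in jobs]

print(planned)
```

Matching on the lowercased suffix keeps files such as `desk.JPG` in the batch while non-image files are ignored.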

Owner

Kushal Shingote
