Deskew

by Marek Mauder
https://galfar.vevb.net/deskew
https://github.com/galfar/deskew

v1.30 2019-06-07

Overview

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.
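
The voting idea behind a Hough transform can be illustrated with a short sketch. This is not Deskew's implementation (Deskew is written in Object Pascal); it is a minimal pure-Python, Hough-style vote over a synthetic set of "text line" points. The function names, the scoring rule (sum of squared accumulator counts) and the synthetic page are assumptions made for illustration only.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=10.0, step=0.1):
    """Return the angle in degrees at which the points line up best.

    For every candidate angle, each point votes for the distance
    rho = y*cos(theta) - x*sin(theta). When theta matches the skew,
    all points of one text line share the same rho, so the votes pile
    up in a few bins and the sum of squared counts is largest.
    """
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        votes = Counter(round(y * c - x * s) for x, y in points)
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def synthetic_page(angle_deg, lines=5, width=600, spacing=30):
    """Hypothetical test data: points on parallel lines skewed by angle_deg."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, 100 + k * spacing + x * slope)
            for k in range(lines) for x in range(width)]


if __name__ == "__main__":
    page = synthetic_page(3.0)
    print(f"estimated skew: {estimate_skew(page):.1f} degrees")
```

The default search range of 10 degrees in both directions mirrors the default of the -a option; a real page would first be thresholded so that only dark pixels vote.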

There are binaries built for these platforms (located in Bin folder): Win64 (deskew.exe), Win32 (deskew32.exe), Linux 64bit (deskew), macOS (deskew-mac), Linux ARMv7 (deskew-arm).

A GUI frontend for this CLI tool is available as well (Windows, Linux, and macOS).

License: MIT

Downloads And Releases

https://github.com/galfar/deskew/releases
https://galfar.vevb.net/deskew#downloads

Usage

Usage:
deskew [-o output] [-a angle] [-b color] [..] input
    input:         Input image file
  Options:
    -o output:     Output image file (default: out.png)
    -a angle:      Maximal expected skew angle (both directions) in degrees (default: 10)
    -b color:      Background color in hex format RRGGBB|LL|AARRGGBB (default: black)
  Ext. options:
    -q filter:     Resampling filter used for rotations (default: linear,
                   values: nearest|linear|cubic|lanczos)
    -t a|treshold: Auto threshold or value in 0..255 (default: a)
    -r rect:       Skew detection only in content rectangle (pixels):
                   left,top,right,bottom (default: whole page)
    -f format:     Force output pixel format (values: b1|g8|rgb24|rgba32)
    -l angle:      Skip deskewing step if skew angle is smaller (default: 0.01)
    -g flags:      Operational flags (any combination of):
                   c - auto crop, d - detect only (no output to file)
    -s info:       Info dump (any combination of):
                   s - skew detection stats, p - program parameters, t - timings
    -c specs:      Output compression specs for some file formats. Several specs
                   can be defined - delimited by commas. Supported specs:
                   jXX - JPEG compression quality, XX is in range [1,100(best)]
                   tSCHEME - TIFF compression scheme: none|lzw|rle|deflate|jpeg|g4

  Supported file formats
    Input:  BMP, JPG, PNG, JNG, GIF, DDS, TGA, PBM, PGM, PPM, PAM, PFM, TIF, PSD
    Output: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PGM, PPM, PAM, PFM, TIF, PSD
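
When deskew is called from a script, it can help to assemble the argument list in one place. The helper below is a sketch under stated assumptions: the function name, its parameters and the file names are hypothetical, only options listed in the usage text above are emitted, and combined -g flags are assumed to be written as adjacent letters (for example "cd").

```python
import shlex

RESAMPLING_FILTERS = ("nearest", "linear", "cubic", "lanczos")


def build_deskew_command(input_path, output=None, max_angle=None,
                         background=None, resampling=None,
                         auto_crop=False, detect_only=False,
                         executable="deskew"):
    """Return an argument list for deskew; unset options are left out."""
    args = [executable]
    if output is not None:
        args += ["-o", output]
    if max_angle is not None:
        args += ["-a", str(max_angle)]
    if background is not None:
        args += ["-b", background]          # hex: RRGGBB, LL or AARRGGBB
    if resampling is not None:
        if resampling not in RESAMPLING_FILTERS:
            raise ValueError(f"unknown resampling filter: {resampling}")
        args += ["-q", resampling]
    # Assumption: flags are combined by concatenating their letters.
    flags = ("c" if auto_crop else "") + ("d" if detect_only else "")
    if flags:
        args += ["-g", flags]
    args.append(input_path)                 # the input file comes last
    return args


if __name__ == "__main__":
    cmd = build_deskew_command("scan.png", output="scan-fixed.png",
                               background="FFFFFF", resampling="lanczos")
    print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with file names that contain spaces.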

Notes

For TIFF support in Linux and macOS you need to have libtiff 4.x installed (package is usually called libtiff5).

For macOS you can download prebuilt libtiff binaries here: https://galfar.github.io/store/TiffLibBins-macOS.zip. Just put the files from the archive into the same folder as the deskew-mac executable.

You can find some test images in the TestImages folder and scripts to run tests (RunTests.bat and runtests.sh) in Bin. By default the scripts call the deskew command, but you can pass a different one as a parameter (e.g. runtests.sh deskew-arm).

Bugs, Issues, Proposals

File them here:
https://github.com/galfar/deskew/issues

Version History

v1.30 2019-06-07:

  • fix #15: Better image quality after rotation - better default and also selectable nearest|linear|cubic|lanczos filtering
  • fix #5: Detect skew angle only (no rotation done) - optionally only skew detection
  • fix #17: Optional auto-crop after rotation
  • fix #3: Command line option to set output compression - now for TIFF and JPEG
  • fix #12: Bad behavior when an output is given and no deskewing is needed
  • libtiff in macOS is now picked up also when binaries are put directly in the directory with deskew
  • text output is flushed after every write (Linux/Unix): it used to be flushed only when writing to device but not file/pipe.

v1.25 2018-05-19:

  • fix #6: Preserve DPI measurement system (TIFF)
  • fix #4: Output image not saved in requested format (when deskewing is skipped)
  • dynamic loading of libtiff library - adds TIFF support in macOS when libtiff is installed

v1.21 2017-11-01:

  • fix #8: Cannot compile in Free Pascal 3.0+ (Windows) - Fails to link precompiled LibTiff library
  • fix #7: Windows FPC build fails with Access violation exception when loading certain TIFFs (especially those saved by Windows Photo Viewer etc.)

v1.20 2016-09-01:

  • much faster rotation, especially when background color is set (>2x faster, 2x less memory)
  • can skip deskewing step if detected skew angle is lower than parameter
  • new option for timing of individual steps
  • fix: crash when last row of page is classified as text
  • misc: default back color is now opaque black, new forced output format "rgb24", background color can define also alpha channel, nicer formatting of text output

v1.10 2014-03-04:

  • TIFF support for Win64 and 32/64bit Linux
  • forced output formats
  • fix: output file names were always lowercase
  • fix: preserves resolution metadata (e.g. 300dpi) of input when writing output

v1.00 2012-06-04:

  • background color
  • "area of interest" content rectangle
  • 64bit and Mac OSX support
  • PSD and TIFF (win32) support
  • show skew detection stats and program parameters

v0.95 2010-12-28:

  • Added auto thresholding

v0.90 2010-02-12:

  • Initial version

Compiling Deskew

Deskew is written in Object Pascal. You need Free Pascal or Delphi to recompile it.

Tested Compilers

There are project files for these IDEs:

  1. Lazarus 2.0.10 (deskew.lpi)
  2. Delphi XE + 10.3 (deskew.dproj)

Additionally, there are compile shell/batch scripts for the standalone FPC compiler in the Scripts folder.

Supported/Tested Platforms

Deskew is precompiled and was tested on these platforms: Win32, Win64, Linux 64bit, macOS 64bit, Linux ARMv7

Source Code

Latest source code can be found here:
https://github.com/galfar/deskew

Dependencies

Vampyre Imaging Library is needed for compilation and it's included in Deskew's repo in Imaging folder.

Comments
  • Detect skew angle only (no rotation done)

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    As requested on blog:

    Can you explain how to simply find the angle but not rotate using this tool? Since I’m dealing with archival TIFFs I need to keep the DPI and embedded metadata in place, so I’m thinking I would use ImageMagick to rotate once I have the angle. Thanks.

    Answer:

    For now you could use the -l parameter (-l angle: Skip deskewing step if skew angle is smaller) with a large threshold, so that rotation will always be skipped.

    $ deskew -l 80 Sken003.png
    ...
    Preparing input image (Sken003.png) ...
    Calculating skew angle...
    Skew angle found: 0.23
    Skipping deskewing step, skew angle lower than threshold of 80.00
    Done!
    

    For next version I plan to modify this: angle is optional and if omitted rotation is always skipped.

    major DeskewCmdLine proposal 
    opened by galfar 7
  • Please clarify licensing situation

    README says that the license is MIT, but the source files all seem to claim MPL/LGPL. What is the actual license of this code? Can you please distribute a license file?

    This came up at AUR: https://aur.archlinux.org/packages/deskew-git/

    opened by ctrlcctrlv 5
  • Parameters on DeskewGui v0.90

    Hello,

    After doing some minor testing with the default and the lanczos filters, I agree that the quality under lanczos is noticeably better, and on my old computer it doesn't take much longer to process (I would say a couple of seconds per page).

    In conclusion, I'd like to use lanczos from now on, but DeskewGui doesn't allow me, to the best of my knowledge, to include parameters when I call deskew.exe.

    Is there a way I can tell the programme that I want to use lanczos?

    Thanks!

    DeskewGui 
    opened by vivadavid 5
  • EImagingError: Error while loading images from file.... Exception message: Access violation

    Original report by Anonymous.


    When trying to deskew landscape images on Windows, the error message:

    EImagingError: Error while loading images from file "name of file" <format: tif> Exception message: Access violation

    appears.

    bug major 
    opened by galfar 4
  • Better image quality after rotation

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    Received several "complaints" that images are a bit blurry after the rotation step (compared to doing the rotation in ImageMagick etc.).

    Current "speed over quality" rotation algorithm used in Deskew must be replaced/supplemented with another one.

    enhancement DeskewCmdLine critical 
    opened by galfar 3
  • Simple GUI frontend

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    I get many requests for "process all files in folder" etc. from people not comfortable with command line, shell scripts etc.

    Simple GUI frontend (to cmd. line Deskew) with batch processing capability would be nice for many people.

    Binaries for Windows, macOS, and Linux required.

    major proposal DeskewGui 
    opened by galfar 3
  • GUI: Option to select sampling filter

    The GUI does not currently have an option for this, and I don't know any other way of doing batch image processing with the CLI tool. The default linear filter blurs the images too much, so it would be nice to be able to change it in the GUI.

    opened by NebulaOnion 2
  • Support for multipage pdf files

    Hi, thanks for this program. I think it would be useful to support deskewing a whole PDF file with multiple pages. I normally scan books or documents directly into a multipage PDF file, since it is more manageable to have only one file rather than one per page. Do you think that this might be within the scope of this program?

    Cheers

    opened by cristobaltapia 2
  • DeskewGui: default window is too big for some common screens

    Original report by Anonymous.


    The default window for DeskewGui is a little too big. On some screens, in particular laptop screens with 1366x800 pixels, it is somewhat difficult to resize the window because the window title bar is left off screen. The maximum height should be no bigger than about 600 pixels. Thank you for your nice work.

    bug minor DeskewGui 
    opened by galfar 2
  • Compiling the GUI on Linux?

    Hello @galfar, I'm your AUR maintainer

    I noticed you have a GUI now, but in Scripts there only seems to be a script to compile it on Mac.

    I get:

    deskewgui.lpr(10,3) Fatal: Can't find unit Interfaces used by deskewgui
    

    Is this usable on Linux?

    opened by ctrlcctrlv 1
  • GUI: allow passing extra parameters to CLI

    As the GUI will always lag behind a bit and won't provide all the options the CLI can handle, it may be useful to allow passing extra parameters directly from the GUI to the CLI.

    A new text edit in "Advanced options" should take care of it.

    enhancement DeskewGui 
    opened by galfar 1
  • feature request: release for linux arm64

    It would be awesome to have more releases: a) Linux ARM64 and b) macOS ARM64. I would like to use this CLI in a Linux Docker container running on an Apple M1 computer.

    opened by amitm02 2
  • Auto detect content rectangle

    Content rectangle can be auto detected by scanning the image (after thresholding) from the sides and looking where non-white pixels start. If it's fast enough it could be a default setting. If custom content rectangle is passed as a parameter by the user let's not use the detection.

    enhancement DeskewCmdLine 
    opened by galfar 0
  • Options for Specifying Skew Angle

    I am looking for a command line option to specify the skew angle directly instead of having it calculated, but I was not able to find one in the given set of options. I apply different preprocessing to the same image, which gives me several preprocessed versions of it. When I run deskew on these versions I get different skew angles. I want to use the same angle for all versions of the same image.

    opened by hamxahbhatti 2
  • Take the background color from the input image

    A request came in to extend the "-b" background color parameter to take its value from the input image (edge/corner). https://galfar.vevb.net/wp/projects/deskew/comment-page-2/#comment-184847

    One request if at all possible is can there be an option for -b that auto samples the rgb value of an edge. I currently run an auto-crop script after they are deskewed and some pages have a different background color. This sometimes trips up the auto crop into thinking the added background is an edge. I’m hoping for something a little more dynamic that allows me to use the tool without sorting the material first.

    Hi, you mean something like “look at pixel [0,0] of input image and use it for output background”?

    Yes. “look at pixel [0,0] of input image and use it for output background” would be a great additional feature.

    enhancement 
    opened by galfar 8
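
The detect-only workflow from the first issue above (find the angle with deskew, rotate with another tool) needs the angle extracted from the text output. The sketch below parses the "Skew angle found:" line shown in that issue. The function names are hypothetical, and the handling of negative angles is an assumption, since the source only shows a positive value.

```python
import re
import subprocess

# Matches e.g. "Skew angle found: 0.23"; the optional minus sign is an
# assumption about how negative angles are printed.
SKEW_LINE = re.compile(r"Skew angle found:\s*(-?\d+(?:\.\d+)?)")


def parse_skew_angle(output_text):
    """Return the reported skew angle as a float, or None if absent."""
    match = SKEW_LINE.search(output_text)
    return float(match.group(1)) if match else None


def detect_skew(image_path, executable="deskew"):
    """Run deskew with a large -l threshold so that rotation is skipped."""
    result = subprocess.run([executable, "-l", "80", image_path],
                            capture_output=True, text=True, check=True)
    return parse_skew_angle(result.stdout)


# Output copied from the issue above.
SAMPLE_OUTPUT = """Preparing input image (Sken003.png) ...
Calculating skew angle...
Skew angle found: 0.23
Skipping deskewing step, skew angle lower than threshold of 80.00
Done!
"""

if __name__ == "__main__":
    print(parse_skew_angle(SAMPLE_OUTPUT))
```

Only parse_skew_angle is exercised here; detect_skew requires the deskew executable on the PATH. Since v1.30 the usage text also lists a detect-only flag (-g d), which may be a cleaner choice than the large -l threshold.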
Releases(v1.30)
  • v1.30(Jun 18, 2019)

    Command line tool for deskewing scanned documents. Binaries for several platforms and test images included.

    Deskew-1.30.zip(4.29 MB)
  • gui-v0.90(Jan 4, 2019)

    GUI Frontend for Deskew Command Line Tool

    Now it’s easier to process many files without writing shell scripts. It needs the command line tool, which is called for each input file. You can set the basic and most of the advanced options for deskewing in the GUI.

    Prebuilt executables for Windows and Linux are available in the download – you just place them in the same folder as the command line tool. The version for macOS is a bit more convenient – it’s a self-contained app bundle with the CLI tool already inside, all placed in a DMG image. You can also set the explicit path to the command line tool in the program itself.

    DeskewGui-0.90.zip(4.11 MB)
  • v1.25(Jan 4, 2019)

    Command line tool for deskewing scanned documents. Binaries for several platforms and test images included.

    deskew-125.zip(4.31 MB)