docstrum

Last update: Dec 13, 2022

Docstrum Algorithm

Getting Started

This repository implements the Docstrum algorithm presented by O’Gorman (1993).

Disclaimer

This source code builds on the work of Chadoliver. The original code is available at https://github.com/chadoliver/cosc428-structor.

Objective

This project aims to segment a document image into meaningful components. The target domain is historical document images, both machine-printed and handwritten.

Dependencies

Python 2.7
Packages:
- numpy
- cv2 (the OpenCV Python bindings)

Process

Pre-processing (optional for vertical-line removal)
- Blurring (bilateral filtering)
- Otsu's thresholding
- Morphological erosion & dilation
- Smoothing (Averaging)
- Static thresholding
Nearest-Neighbor Clustering and Docstrum Plot
Spacing and Orientation Estimation
Determination of Text-lines
Structural Block Determination
Post-processing
- TBD
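
The "Nearest-Neighbor Clustering and Docstrum Plot" and "Spacing and Orientation Estimation" steps above can be illustrated with a minimal NumPy sketch. It is not the code from this repository: the function names, the default `k`, the histogram bin width, and the angle tolerance are all assumptions chosen for illustration, and the median is used as a simple stand-in for a histogram peak when estimating spacing. The input is assumed to be an array of connected-component centroids.

```python
import numpy as np


def nearest_neighbor_pairs(centroids, k=5):
    """Distances and angles from each centroid to its k nearest neighbors.

    centroids: (n, 2) array of (x, y) component centers (hypothetical input).
    Returns two flat arrays; angles are in degrees, folded into [-90, 90).
    """
    pts = np.asarray(centroids, dtype=float)
    diff = pts[None, :, :] - pts[:, None, :]      # diff[i, j] = pts[j] - pts[i]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    np.fill_diagonal(dist, np.inf)                # a point is not its own neighbor
    k = min(k, len(pts) - 1)
    idx = np.argsort(dist, axis=1)[:, :k]         # k nearest neighbors per point
    rows = np.arange(len(pts))[:, None]
    d = dist[rows, idx]
    ang = np.degrees(np.arctan2(diff[rows, idx, 1], diff[rows, idx, 0]))
    ang = (ang + 90.0) % 180.0 - 90.0             # direction only, ignore sign
    return d.ravel(), ang.ravel()


def estimate_orientation(angles, bin_width=2.0):
    """Dominant text orientation: the peak of the angle histogram."""
    n_bins = int(round(180.0 / bin_width))
    # Bins are centered on multiples of bin_width so that 0 degrees is a center.
    edges = np.linspace(-90.0 - bin_width / 2.0,
                        90.0 + bin_width / 2.0, n_bins + 2)
    hist, edges = np.histogram(angles, bins=edges)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])


def estimate_spacing(distances, angles, orientation, tol=15.0):
    """Within-line spacing: median distance of pairs aligned with the text."""
    delta = (angles - orientation + 90.0) % 180.0 - 90.0
    aligned = np.abs(delta) <= tol
    if not aligned.any():
        return float("nan")
    return float(np.median(distances[aligned]))
```

Plotting the returned distances against the returned angles gives a Docstrum-style scatter plot. For a page of horizontal text, `estimate_orientation` should return a value near 0 degrees and `estimate_spacing` a value near the typical gap between neighboring characters.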

Evaluation

Citing Docstrum

O'Gorman, L., 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), pp. 1162-1173.

@article{o1993document,
  title={The document spectrum for page layout analysis},
  author={O'Gorman, Lawrence},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={15},
  number={11},
  pages={1162--1173},
  year={1993},
  publisher={IEEE}
}

Notes

How to remove .DS_Store

find . -name '.DS_Store' -type f -delete

Owner

Chulwoo Mike Pack
