Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Last update: Jan 01, 2023

Overview

DewarpNet

This repository contains the code for training DewarpNet.

Recent Updates

[May, 2020] Added evaluation images and an important note about Matlab SSIM.
[Dec, 2020] Added OCR evaluation details.

Training

Prepare Data: create train.txt and val.txt. Their contents should look like this:

1/824_8-cp_Page_0503-7Ns0001
1/824_1-cp_Page_0504-2Cw0001
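
As a rough illustration of the data preparation step, the sketch below writes train.txt and val.txt from a list of sample IDs, one ID per line, matching the two example entries above. The function name write_split_files, the val_fraction and seed parameters, and the random split itself are assumptions made for this example; the repository does not document how its own splits were produced.

```python
import random
from pathlib import Path


def write_split_files(sample_ids, out_dir, val_fraction=0.1, seed=0):
    """Shuffle sample IDs and write train.txt / val.txt, one ID per line.

    Hypothetical helper: the split strategy is an assumption, not the
    procedure used by the DewarpNet authors.
    """
    ids = sorted(set(sample_ids))        # de-duplicate, fixed starting order
    random.Random(seed).shuffle(ids)     # reproducible shuffle
    n_val = max(1, int(len(ids) * val_fraction)) if ids else 0
    val_ids, train_ids = ids[:n_val], ids[n_val:]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "train.txt").write_text("".join(f"{i}\n" for i in train_ids))
    (out_dir / "val.txt").write_text("".join(f"{i}\n" for i in val_ids))
    return train_ids, val_ids
```

Calling write_split_files(["1/824_8-cp_Page_0503-7Ns0001", "1/824_1-cp_Page_0504-2Cw0001"], "./splits") would place one ID in each file; with a real dataset the ID list would be collected from the downloaded doc3d folders.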

Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d

Inference:

Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper, use the images available here.
[Important note about Matlab version] Matlab 2020a uses a different SSIM implementation from Matlab 2018b, which we used for the paper, and it gives a better MS-SSIM score (0.5623). Please compare scores according to your Matlab version.

Evaluation (OCR Metrics):

The 25 images used for OCR evaluation are listed in /eval/ocr_eval/ocr_files.txt
The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
For the OCR errors reported in the paper, we used cv2.blur as pre-processing, which gives higher error in all cases. For convenience, we provide updated numbers (without blur) in the following table:

Method	ED	CER	ED (no blur)	CER (no blur)
DocUNet	1975.86	0.4656 (0.263)	1671.80	0.403 (0.256)
DocUNet on Doc3D	1684.34	0.3955 (0.272)	1296.00	0.294 (0.235)
DewarpNet	1288.60	0.3136 (0.248)	1007.28	0.249 (0.236)
DewarpNet (ref)	1114.40	0.2692 (0.234)	812.48	0.204 (0.228)

We used the default Tesseract (v4.1.0) configuration for evaluation, with PyTesseract (v0.2.6).
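
For readers who want to see how numbers like ED and CER are typically computed, here is a minimal sketch. It assumes ED is the Levenshtein edit distance between the OCR output and the ground-truth text, and CER is that distance divided by the ground-truth length; these are the common definitions, and the repository's own evaluation script may differ in details such as text normalization. The function names are illustrative only.

```python
def edit_distance(pred, gt):
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn pred into gt."""
    prev = list(range(len(gt) + 1))      # distances from the empty prefix of pred
    for i, p in enumerate(pred, start=1):
        curr = [i]                       # turning pred[:i] into "" takes i deletions
        for j, g in enumerate(gt, start=1):
            cost = 0 if p == g else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert g
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(pred, gt):
    """Edit distance normalized by ground-truth length (assumed CER definition)."""
    if not gt:
        raise ValueError("ground-truth text must be non-empty")
    return edit_distance(pred, gt) / len(gt)
```

In practice, pred would be the string returned by PyTesseract for an unwarped image and gt the matching entry from the ground-truth file; per-image values would then be averaged over the 25 evaluation images.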

Models:

Pre-trained models are available here. These checkpoints were saved before end-to-end training, so they will not reproduce the end-to-end results reported in Table 2 of the paper. To obtain the exact numbers in Table 2, use the images provided above.

Dataset:

The doc3D dataset can be downloaded using the scripts here.

More Stuff:

Citation:

If you use the dataset or this code, please consider citing our work:

@inproceedings{SagnikKeICCV2019, 
Author = {Sagnik Das* and Ke Ma* and Zhixin Shu and Dimitris Samaras and Roy Shilkrot}, 
Booktitle = {Proceedings of International Conference on Computer Vision}, 
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks}, 
Year = {2019}}

Acknowledgements:

The structure of this code is based heavily on pytorch-semseg.
