PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Last update: Sep 15, 2022

Related tags

Deep Learning QVR-SimpleDLM

Overview

Value Retrieval with Arbitrary Queries for Form-like Documents

Introduction

This repository provides a PyTorch implementation of the paper Value Retrieval with Arbitrary Queries for Form-like Documents.

Environment

CUDA="11.0"
CUDNN="8"
UBUNTU="18.04"

Install

bash install.sh
# Build and install NVIDIA apex with its C++ and CUDA extensions.
git clone https://github.com/NVIDIA/apex && cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
pip install .
# Then, from our project root folder, install this project.
pip install .

Data Preparation

Our model is pre-trained on the IIT-CDIP dataset, fine-tuned on the FUNSD train set, and evaluated on the FUNSD test set and the INV-CDIP test set.

Download our processed OCR results of IIT-CDIP with hocr_list_addr.txt and put them under PRETRAIN_DATA_FOLDER/.
Download our processed FUNSD and INV-CDIP datasets and put them under DATA_DIR/.
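Before moving on, it can help to confirm that the datasets are where the later commands expect them. The sketch below assumes DATA_DIR holds one subfolder per dataset, named FUNSD and INV-CDIP; this layout is inferred from the $DATA_DIR/FUNSD/ and $DATA_DIR/DATASET_NAME paths used by the commands in this README, not from the code itself. The helper name missing_datasets is hypothetical and is not part of this codebase.

```python
from pathlib import Path

# Assumed layout: one subfolder per dataset directly under DATA_DIR.
EXPECTED_DATASETS = ("FUNSD", "INV-CDIP")


def missing_datasets(data_dir):
    """Return the expected dataset subfolders that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]
```

Calling missing_datasets("/path/to/DATA_DIR") returns an empty list when both folders are present, and the names of the missing folders otherwise.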

Reproduce Our Results

Download our model fine-tuned on FUNSD here.
Run inference as follows:

# $MODEL_PATH is the path where you saved the fine-tuned model.
# Replace DATASET_NAME with FUNSD or INV-CDIP.
bash reproduce_results.sh $MODEL_PATH $DATA_DIR/DATASET_NAME
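To evaluate on both datasets in one go, the command above can be wrapped in a small driver. This is only a sketch: the function names reproduce_command and reproduce_all and the dry_run flag are hypothetical, and the only things taken from this README are the script name reproduce_results.sh, its two positional arguments, and the two dataset names.

```python
import subprocess

# Dataset names taken from the README's reproduce instructions.
DATASETS = ("FUNSD", "INV-CDIP")


def reproduce_command(model_path, data_dir, dataset_name):
    """Build the argument list for reproduce_results.sh on one dataset."""
    if dataset_name not in DATASETS:
        raise ValueError(f"dataset_name must be one of {DATASETS}, got {dataset_name!r}")
    dataset_path = f"{data_dir.rstrip('/')}/{dataset_name}"
    return ["bash", "reproduce_results.sh", model_path, dataset_path]


def reproduce_all(model_path, data_dir, dry_run=True):
    """Run (or, with dry_run=True, only print) the command for each dataset."""
    for name in DATASETS:
        cmd = reproduce_command(model_path, data_dir, name)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first dataset whose script fails.
            subprocess.run(cmd, check=True)
```

With dry_run=True the commands are only printed, which is a cheap way to check the paths before starting a real evaluation run from the project root.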

You should get the following results.

Datasets	Precision	Recall	F1
FUNSD	60.4	60.9	60.7
INV-CDIP	50.5	47.6	49.0
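As a quick sanity check on the table, the F1 column can be compared with the harmonic mean of precision and recall. This assumes F1 is computed in the standard way, F1 = 2PR / (P + R), which this README does not state explicitly. Because the table reports values rounded to one decimal, the recomputed number may differ from the reported one in the last digit. The function name f1_from is illustrative only.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table: (precision, recall, reported F1).
REPORTED = {
    "FUNSD": (60.4, 60.9, 60.7),
    "INV-CDIP": (50.5, 47.6, 49.0),
}

for name, (p, r, reported_f1) in REPORTED.items():
    recomputed = f1_from(p, r)
    # Inputs are rounded to one decimal, so allow a tolerance of 0.1.
    agrees = abs(recomputed - reported_f1) <= 0.1
    print(f"{name}: recomputed F1 = {recomputed:.2f}, reported = {reported_f1}, agrees = {agrees}")
```

If your own run produces precision and recall values close to the table but an F1 that is far from their harmonic mean, the metric is probably being aggregated differently from what this sketch assumes.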

Pre-training

You can skip the following steps by downloading our pre-trained SimpleDLM model here.
Otherwise, download layoutlm-base-uncased.
Then run pre-training as follows:

# $NUM_GPUS is the number of GPUs to use for pre-training. To reproduce the paper's results, we recommend using 8 GPUs.
# $MODEL_PATH is the path where you saved the LayoutLM model.
# $PRETRAIN_DATA_FOLDER is the folder containing the IIT-CDIP hOCR files.

python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS pretraining.py \
--model_name_or_path $MODEL_PATH  --data_dir $PRETRAIN_DATA_FOLDER \
--output_dir $OUTPUT_DIR

Fine-tuning

Run fine-tuning as follows:

# $MODEL_PATH is the path where you saved the pre-trained SimpleDLM model.

CUDA_VISIBLE_DEVICES=0 python run_query_value_retrieval.py --model_type simpledlm --model_name_or_path $MODEL_PATH \
--data_dir $DATA_DIR/FUNSD/ --output_dir $OUTPUT_DIR --do_train --evaluate_during_training

Citation

If you find this codebase useful, please cite our paper:

@article{gao2021value,
  title={Value Retrieval with Arbitrary Queries for Form-like Documents},
  author={Gao, Mingfei and Xue, Le and Ramaiah, Chetan and Xing, Chen and Xu, Ran and Xiong, Caiming},
  journal={arXiv preprint arXiv:2112.07820},
  year={2021}
}

Contact

Please send an email to [email protected] or [email protected] if you have questions.

Owner

Salesforce
