Key information extraction from invoice document with Graph Convolution Network

Last update: Dec 16, 2022

Overview

Key Information Extraction from Scanned Invoices

Related blog post from my Viblo account: https://viblo.asia/p/djeZ1yPGZWz

Models

Background subtraction: U2Net
Image alignment: based on the output of text detection, using OpenCV (cv2)
Text detection: CRAFT and an in-house text-detection model
Text recognition: VietOCR and an in-house text-recognition model
KIE: Graph Convolution
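
To make the KIE step more concrete, here is a minimal sketch of a single graph-convolution step in plain Python. It assumes each detected text box is a node, that edges link neighbouring boxes, and that the layer is the simple mean-aggregation form ReLU(D^-1 (A + I) H W). The names gcn_layer and matmul, the toy graph, and the feature values are all hypothetical; the layer used in this repository may differ (for example, a gated variant from the Benchmarking GNNs code base).

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One graph-convolution step: ReLU(D^-1 (A + I) H W). Illustrative only."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Row-normalise: each node averages over itself and its neighbours.
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    out = matmul(matmul(a_norm, feats), weight)
    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: three text boxes in a chain (0-1 and 1-2).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up 2-d node features
weight = [[1.0, 0.0], [0.0, 1.0]]             # identity weights for readability
hidden = gcn_layer(adj, feats, weight)
```

With identity weights, each row of hidden is simply the average of a node's own features and those of its neighbours, which is the mixing of context between nearby text boxes that a KIE classifier can then build on.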

Currently, I don't have an invoice-direction classifier model. However, you can develop a model that detects whether the image is rotated sideways or upside down and rotates it back.
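
As a rough illustration of the correction step such a classifier would feed, the sketch below undoes a rotation in quarter turns. The label names, UNDO_TURNS, rotate_image, and correct_orientation are hypothetical and not part of this repository; the image is modelled as a nested list to keep the example dependency-free.

```python
from typing import List

Image = List[List[int]]

# Hypothetical classifier labels mapped to the number of clockwise
# quarter turns needed to bring the image back upright.
UNDO_TURNS = {
    "upright": 0,
    "rotated_cw_90": 3,
    "upside_down": 2,
    "rotated_ccw_90": 1,
}


def rotate_image(pixels: Image, quarter_turns_clockwise: int) -> Image:
    """Rotate a row-major pixel grid clockwise in 90-degree steps."""
    out = [list(row) for row in pixels]
    for _ in range(quarter_turns_clockwise % 4):
        # Reverse the row order, then transpose: one clockwise quarter turn.
        out = [list(col) for col in zip(*out[::-1])]
    return out


def correct_orientation(pixels: Image, label: str) -> Image:
    """Undo the rotation described by a (hypothetical) classifier label."""
    return rotate_image(pixels, UNDO_TURNS[label])


upright = [[1, 2, 3],
           [4, 5, 6]]
sideways = rotate_image(upright, 1)   # simulate a scan rotated 90 degrees clockwise
restored = correct_orientation(sideways, "rotated_cw_90")
```

On real images, the same quarter-turn correction can be done with cv2.rotate and its ROTATE_90_CLOCKWISE, ROTATE_180, and ROTATE_90_COUNTERCLOCKWISE flags.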

Pretrained model

Google Drive

Data

MC-OCR, a Vietnamese receipts dataset: https://aihub.vn/competitions/1
Preprocessed data: Google Drive

Pipeline

TODO

Command

Create a virtual environment using conda or virtualenv:

# with virtualenv
virtualenv -p python3 invoice_env
# activate environment
source invoice_env/bin/activate
# install prerequisite libraries
pip install -r requirements.txt

# 1st command, run API
make serve
# 2nd command, run web-gui with streamlit
make runapp

Then access the local server at http://0.0.0.0:7778

Preview

TODO

Add a data preprocessing script

Reference

MC-OCR dataset: https://aihub.vn/competitions/1
U2Net: https://github.com/xuebinqin/U-2-Net
CRAFT: https://github.com/clovaai/CRAFT-pytorch
VietOCR: https://github.com/pbcquoc/vietocr
Benchmarking GNNs: https://github.com/graphdeeplearning/benchmarking-gnns
PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Owner

Phan Hoang
