Vietnamese Language Detection and Recognition

Overview
Table of Content
  1. Introduction (Khôi viết)
  2. Dataset (đổi link thui thành 3k5 ảnh mình)
  3. Getting Started (An Viết)
  4. Training & Evaluation (Tấn + Quỳnh viết)
  5. Acknowledgement (đổi link thui)

Dictionary-guided Scene Text Recognition

  • We propose a novel dictionary-guided sense text recognition approach that could be used to improve many state-of-the-art models.
architecture.png
Comparison between the traditional approach and our proposed approach.

Details of the dataset construction, model architecture, and experimental results can be found in our following paper:

@inproceedings{m_Nguyen-etal-CVPR21,
      author = {Nguyen Nguyen and Thu Nguyen and Vinh Tran and Triet Tran and Thanh Ngo and Thien Nguyen and Minh Hoai},
      title = {Dictionary-guided Scene Text Recognition},
      year = {2021},
      booktitle = {Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
    }

Please CITE our paper whenever our dataset or model implementation is used to help produce published results or incorporated into other software.


Dataset

We introduce a new VinText dataset.

By downloading this dataset, USER agrees:

  • to use this dataset for research or educational purposes only
  • to not distribute or part of this dataset in any original or modified form.
  • and to cite our paper whenever this dataset are employed to help produce published results.
Name #imgs #text instances Examples
VinText 2000 About 56000 example.png

Detail about VinText dataset can be found in our paper. Download Converted dataset to try with our model

Dataset variant Input format Link download
Original x1,y1,x2,y2,x3,y3,x4,y4,TRANSCRIPT Download here
Converted dataset COCO format Download here

VinText

Extract data and copy folder to folder datasets/

datasets
└───vintext
	└───test.json
		│train.json
		|train_images
		|test_images
└───evaluation
	└───gt_vintext.zip

Getting Started

Requirements
  • python=3.7
  • torch==1.4.0
  • detectron2==0.2
Installation
conda create -n dict-guided -y python=3.7
conda activate dict-guided
conda install -y pytorch torchvision cudatoolkit=10.0 -c pytorch
python -m pip install ninja yacs cython matplotlib tqdm opencv-python shapely scipy tensorboardX pyclipper Polygon3 weighted-levenshtein editdistance

# Install Detectron2
python -m pip install detectron2==0.2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu100/torch1.4/index.html

Check out the code and install:

git clone https://github.com/nguyennm1024/dict-guided.git
cd dict-guided
python setup.py build develop
Download vintext pre-trained model
Usage

Prepare folders

mkdir sample_input
mkdir sample_output

Copy your images to sample_input/. Output images would result in sample_output/

python demo/demo.py --config-file configs/BAText/VinText/attn_R_50.yaml --input sample_input/ --output sample_output/ --opts MODEL.WEIGHTS path-to-trained_model-checkpoint
qualitative results.png
Qualitative Results on VinText.

Training and Evaluation

Training

For training, we employed the pre-trained model tt_attn_R_50 from the ABCNet repository for initialization.

python tools/train_net.py --config-file configs/BAText/VinText/attn_R_50.yaml MODEL.WEIGHTS path_to_tt_attn_R_50_checkpoint

Example:

python tools/train_net.py --config-file configs/BAText/VinText/attn_R_50.yaml MODEL.WEIGHTS ./tt_attn_R_50.pth

Trained model output will be saved in the folder output/batext/vintext/ that is then used for evaluation

Evaluation

python tools/train_net.py --eval-only --config-file configs/BAText/VinText/attn_R_50.yaml MODEL.WEIGHTS path_to_trained_model_checkpoint

Example:

python tools/train_net.py --eval-only --config-file configs/BAText/VinText/attn_R_50.yaml MODEL.WEIGHTS ./output/batext/vintext/trained_model.pth

Acknowledgement

This repository is built based-on ABCNet

Handwriting Recognition System based on a deep Convolutional Recurrent Neural Network architecture

Handwriting Recognition System This repository is the Tensorflow implementation of the Handwriting Recognition System described in Handwriting Recogni

Edgard Chammas 346 Jan 07, 2023
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022
This Repository contain Opencv Projects in python

Python-Opencv OpenCV OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was

Yash Sakre 2 Nov 06, 2021
An application of high resolution GANs to dewarp images of perturbed documents

Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat

Thomas Huang 97 Dec 25, 2022
graph learning code for ogb

The final code for OGB Installation Requirements: ogb=1.3.1 torch=1.7.0 torch-geometric=1.7.0 torch-scatter=2.0.6 torch-sparse=0.6.9 Baseline models T

PierreHao 20 Nov 10, 2022
Simple SDF mesh generation in Python

Generate 3D meshes based on SDFs (signed distance functions) with a dirt simple Python API.

Michael Fogleman 1.1k Jan 08, 2023
End-to-end pipeline for real-time scene text detection and recognition.

Real-time-Scene-Text-Detection-and-Recognition-System End-to-end pipeline for real-time scene text detection and recognition. The detection model use

Fangneng Zhan 89 Aug 04, 2022
Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition

CRNN_Tensorflow This is a TensorFlow implementation of a Deep Neural Network for scene text recognition. It is mainly based on the paper "An End-to-En

MaybeShewill-CV 1000 Dec 27, 2022
Text to QR-CODE

QR CODE GENERATO USING PYTHON Author : RAFIK BOUDALIA. Installation Use the package manager pip to install foobar. pip install pyqrcode Usage from tki

Rafik Boudalia 2 Oct 13, 2021
A Python wrapper for Google Tesseract

Python Tesseract Python-tesseract is an optical character recognition (OCR) tool for python. That is, it will recognize and "read" the text embedded i

Matthias A Lee 4.6k Jan 06, 2023
CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering" official PyTorch implementation.

LED2-Net This is PyTorch implementation of our CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering". Y

Fu-En Wang 83 Jan 04, 2023
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
This is a implementation of CRAFT OCR method

This is a implementation of CRAFT OCR method

Esaka 0 Nov 01, 2021
Learn computer graphics by writing GPU shaders!

This repo contains a selection of projects designed to help you learn the basics of computer graphics. We'll be writing shaders to render interactive two-dimensional and three-dimensional scenes.

Eric Zhang 1.9k Jan 02, 2023
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks? Artifact Detection/Correction - Offcial PyTorch Implementation This rep

CHOI HWAN IL 23 Dec 20, 2022
ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

128 Sep 12, 2022
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval (arXiv) Repository to contain the code, models, data for end-to-end

225 Dec 25, 2022
A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database.

A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database. The structure, shape and proportions of the faces are comp

Pavankumar Khot 4 Mar 19, 2022
[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

Junhyeong Cho 18 Jul 19, 2022
Camera Intrinsic Calibration and Hand-Eye Calibration in Pybullet

This repository is mainly for camera intrinsic calibration and hand-eye calibration. Synthetic experiments are conducted in PyBullet simulator. 1. Tes

CAI Junhao 7 Oct 03, 2022