a micro OCR network with 0.07mb params.

Overview

MicroOCR

a micro OCR network with 0.07mb params.

    Layer (type)               Output Shape         Param #

        Conv2d-1            [-1, 64, 8, 32]           3,136
   BatchNorm2d-2            [-1, 64, 8, 32]             128
          GELU-3            [-1, 64, 8, 32]               0
     ConvBNACT-4            [-1, 64, 8, 32]               0
        Conv2d-5            [-1, 64, 8, 32]             640
   BatchNorm2d-6            [-1, 64, 8, 32]             128
          GELU-7            [-1, 64, 8, 32]               0
     ConvBNACT-8            [-1, 64, 8, 32]               0
        Conv2d-9            [-1, 64, 8, 32]           4,160
  BatchNorm2d-10            [-1, 64, 8, 32]             128
         GELU-11            [-1, 64, 8, 32]               0
    ConvBNACT-12            [-1, 64, 8, 32]               0
   MicroBlock-13            [-1, 64, 8, 32]               0
       Conv2d-14            [-1, 64, 8, 32]             640
  BatchNorm2d-15            [-1, 64, 8, 32]             128
         GELU-16            [-1, 64, 8, 32]               0
    ConvBNACT-17            [-1, 64, 8, 32]               0
       Conv2d-18            [-1, 64, 8, 32]           4,160
  BatchNorm2d-19            [-1, 64, 8, 32]             128
         GELU-20            [-1, 64, 8, 32]               0
    ConvBNACT-21            [-1, 64, 8, 32]               0
   MicroBlock-22            [-1, 64, 8, 32]               0
      Flatten-23              [-1, 64, 256]               0
AdaptiveAvgPool1d-24           [-1, 64, 30]               0
       Linear-25               [-1, 30, 60]           3,900

Total params: 17,276
Trainable params: 17,276
Non-trainable params: 0
Input size (MB): 0.05
Forward/backward pass size (MB): 2.90
Params size (MB): 0.07
Estimated Total Size (MB): 3.02

Script Description

MicroOCR
├── README.md                                   # Descriptions about MicroNet
├── collatefn.py                                # collatefn
├── ctc_label_converter.py                      # accuracy metric for MicroNet
├── dataset.py                                  # Data preprocessing for training and evaluation
├── demo.py                                     # demo
├── gen_image.py                                # generate image for train and eval
├── infer_tool.py                               # inference tool
├── keys.py                                     # character
├── loss.py                                     # Ctcloss definition
├── metric.py                                   # accuracy metric for MicroNet
├── model.py                                    # MicroNet
├── train.py                                    # train the model

Generate data for train and eval

python gen_image.py

Training

python train.py

Inference

python demo.py
Owner
william
william
Camera Intrinsic Calibration and Hand-Eye Calibration in Pybullet

This repository is mainly for camera intrinsic calibration and hand-eye calibration. Synthetic experiments are conducted in PyBullet simulator. 1. Tes

CAI Junhao 7 Oct 03, 2022
Awesome anomaly detection in medical images

A curated list of awesome anomaly detection works in medical imaging, inspired by the other awesome-* initiatives.

Kang Zhou 57 Dec 19, 2022
The virtual calculator will be above the live streaming from your camera

The virtual calculator is above the live streaming from my camera usb , the program first detect my hand and in each frame calculate the distance between two finger ,if the distance is lower than the

gasbaoui mohammed al amine 5 Jul 01, 2022
Run tesseract with the tesserocr bindings with @OCR-D's interfaces

ocrd_tesserocr Crop, deskew, segment into regions / tables / lines / words, or recognize with tesserocr Introduction This package offers OCR-D complia

OCR-D 38 Oct 14, 2022
A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes A PyTorch implement of TextSnake: A Flexible Representation for Detecting

Prince Wang 417 Dec 12, 2022
Fast style transfer

faststyle Faststyle aims to provide an easy and modular interface to Image to Image problems based on feature loss. Install Making sure you have a wor

Lucas Vazquez 21 Mar 11, 2022
Discord QR Scam Code Generator + Token grab mobile device.

A Python script that automatically generates a Nitro scam QR code and grabs the Discord token when scanned.

Visual 9 Nov 22, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
Introduction to Augmented Reality (AR) with Python 3 and OpenCV 4.2.

Introduction to Augmented Reality (AR) with Python 3 and OpenCV 4.2.

fernanda rodríguez 85 Jan 02, 2023
chineseocr/table_line 表格线检测模型pytorch版

table_line_pytorch chineseocr/table_detct 表格线检测模型table_line pytorch版 原项目github: https://github.com/chineseocr/table-detect 1、模型转换 下载原项目table_detect模型文

1 Oct 21, 2021
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l

Baoguang Shi 2k Dec 31, 2022
Motion Detection Squid Game with OpenCV Python

*Motion Detection Squid Game with OpenCV Python i am newbie in python. In this project I made a simple game to follow the trend about the red light gr

Nayan 17 Nov 22, 2022
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network Introduction This is a tensorflow re-implementation of PSENet: Shape Robu

Michael liu 498 Dec 30, 2022
Deep LearningImage Captcha 2

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 117 Dec 28, 2022
An application of high resolution GANs to dewarp images of perturbed documents

Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat

Thomas Huang 97 Dec 25, 2022
A curated list of papers, code and resources pertaining to image composition

A curated list of resources including papers, datasets, and relevant links pertaining to image composition.

BCMI 391 Dec 30, 2022
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI.

MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI. It is an open-source and easy-to-install ecosystem that can run locally on a machine with one

Project MONAI 344 Dec 23, 2022
Drowsiness Detection and Alert System

A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers, and people traveling long-distance suffer from lack of sleep.

Astitva Veer Garg 4 Aug 01, 2022