Research on open source OCR engine
2022-04-23 20:16:00 【_ Carpediem】
A survey of open-source OCR engines
One. OCR engine comparison
1. YouTu OCR (Tencent)
https://ai.qq.com/product/ocr.shtml#common
Free API: https://api.ai.qq.com/fcgi-bin/ocr/ocr_generalocr
The engine focuses on text-line recognition. Its stated recognition accuracy is as high as 98%, the error rate is low, and the output is plain text.
(A free public API is available, but I found no open-source project, so it appears to be an online service only.)
2. Tesseract OCR (Google)
https://github.com/tesseract-ocr/tesseract#about
Three sets of trained models are available:
tessdata_best: highest accuracy, slowest. https://github.com/tesseract-ocr/tessdata_best
tessdata: medium accuracy, medium speed. https://github.com/tesseract-ocr/tessdata
tessdata_fast: lowest accuracy, fastest. https://github.com/tesseract-ocr/tessdata_fast
Advantages: Open source, with a Chinese language pack. There are .NET and Java demos that are simple and easy to follow, so it works well as an introduction. Programming against it is simple, and it ships its own method for training on samples, so you can generate the recognition language data you need. It supports recognition of bills and receipts.
Disadvantages: Chinese text, digits, and English need to be recognized separately to keep accuracy high. Recognition of mixed Chinese, English, digits, and symbols is not ideal, special symbols have a high misrecognition rate, the output is prone to garbled characters, and recognition is slow. Recognition is run through command-line statements, there is no dedicated graphical interface, and the default output format is .txt.
Recommendation index:
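Since Tesseract is driven from the command line, a small wrapper that assembles the command can make the choice of model set and languages explicit. The following is a minimal sketch, not an official wrapper: the directory paths in `TESSDATA_DIRS` and the function name `build_tesseract_command` are assumptions made for illustration, while `-l` and `--tessdata-dir` are standard Tesseract options and `chi_sim` and `eng` are the language codes for Simplified Chinese and English.

```python
# Hypothetical local directories holding each downloaded model set (assumption).
TESSDATA_DIRS = {
    "best": "/opt/tessdata_best",
    "standard": "/opt/tessdata",
    "fast": "/opt/tessdata_fast",
}


def build_tesseract_command(image, output_base, langs=("chi_sim", "eng"), model_set="fast"):
    """Return the argv list for one Tesseract run.

    Tesseract writes the recognized text to <output_base>.txt by default.
    """
    if model_set not in TESSDATA_DIRS:
        raise ValueError(f"unknown model set: {model_set!r}")
    return [
        "tesseract",
        str(image),
        str(output_base),
        "-l", "+".join(langs),  # several languages are joined with '+'
        "--tessdata-dir", TESSDATA_DIRS[model_set],
    ]


if __name__ == "__main__":
    cmd = build_tesseract_command("invoice.png", "invoice", model_set="best")
    print(" ".join(cmd))
    # To actually run it (requires Tesseract and the model files to be installed):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Passing a single language, for example `langs=("eng",)`, is one way to follow the advice above about recognizing each kind of text separately.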
3. cnocr+cnstd
https://github.com/breezedeus/cnocr
https://github.com/breezedeus/cnstd
https://zhuanlan.zhihu.com/p/60767671
Advantages: A minimalist Chinese OCR Python package. The recognition model it currently uses is CRNN. It recognizes Chinese well, with a low error rate and a recognition accuracy of about 98.7%.
Recommendation index:
4. Treehole OCR (an evolution of Tianruo OCR)
https://github.com/AnyListen/tools-ocr
Advantages: Supports table recognition. Text recognition goes through the recognition APIs offered by various cloud platforms, so an Internet connection is required for normal use. It is developed with JavaFX, so a Java 8 runtime must be installed before use (the full version does not need a separate Java 8 install).
Disadvantages: Highly dependent on the environment.
5. calamari
https://github.com/Calamari-OCR/calamari
Paper: https://arxiv.org/abs/1807.02004
Advantages: Calamari is a new open-source OCR package that uses state-of-the-art deep neural networks (DNNs) implemented in TensorFlow. It provides pretrained models and multi-model voting. Its customizable network architecture, built from convolutional neural network (CNN) and long short-term memory (LSTM) layers, is trained with the Connectionist Temporal Classification (CTC) algorithm of Graves et al. Using a GPU greatly reduces computation time for both training and prediction. The paper compares Calamari with OCRopy, OCRopus3, and Tesseract 4 on two different datasets. Calamari reaches a character error rate (CER) of 0.11% on the UW3 dataset (modern English) and 0.18% on the DTA19 dataset (German), which is far better than the results of the existing open-source software listed above.
It uses the most advanced current OCR techniques: CNN + LSTM + CTC + voting.
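To make the CTC part of that pipeline more concrete, here is a minimal sketch of greedy (best-path) CTC decoding, the step that turns per-frame network outputs into a text line. It is an illustration of the general technique, not Calamari's own code, and it covers decoding only, not the CTC training loss. The function name and the use of `-` as the blank symbol are assumptions made for the example.

```python
def ctc_greedy_decode(frame_labels, blank="-"):
    """Collapse a per-frame label sequence into the final output.

    Rule: merge consecutive repeats, then drop blank symbols. A blank
    between two identical labels keeps them as two separate characters.
    """
    decoded = []
    previous = blank
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # One label per time step, as an argmax over the network output would give.
    frames = list("-hh-e-ll-lo-")
    print("".join(ctc_greedy_decode(frames)))
```

The blank between the two runs of `l` is what lets the decoder keep a double letter instead of merging it into one.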
The Calamari OCR engine is written in Python 3 and builds on OCRopy and Kraken. It is designed to be easy to run and modular enough to embed in other Python scripts.
Dependencies: Python 3, TensorFlow 1.8
Character error rate (CER) on English text: 0.11%
Character error rate on German text: 0.18%
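For reference, the character error rate is commonly computed as the edit (Levenshtein) distance between the recognized text and the ground truth, divided by the length of the ground truth. The sketch below shows that common definition; it is not the paper's evaluation script, and the function names are chosen for illustration.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of characters in the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    truth = "recognition"
    ocr_output = "recogmition"  # one substituted character
    print(f"CER = {character_error_rate(truth, ocr_output):.2%}")
```

Read this way, a CER of 0.11% corresponds to about 11 character errors per 10,000 characters of ground truth.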
Note: The library is mainly intended for recognizing printed historical books. The paper reports no experiments on natural scene images.
Recommendation index:
6. GOCR
https://github.com/SureChEMBL/gocr
A command-line tool. A JavaScript port exists, so it can be used in the front end.
7. xNN-OCR
xNN-OCR is a high-accuracy, high-efficiency text recognition engine developed specifically for mobile devices. It currently supports recognition of digits, English, Chinese characters, and special symbols in scene images. For mobile use, the team developed and optimized a deep-learning-based framework for text detection and text-line recognition. Combined with xNN's network compression and acceleration capabilities, the detection and recognition models can be compressed to a few hundred KB and run in real time on the CPU of mid-range and higher phones (up to 15 FPS), and can be paired with the "Scan" feature to give what-you-see-is-what-you-get recognition in a video stream. xNN-OCR currently recognizes scene digits, English, and some Chinese characters well on the device. Its model size, speed, and accuracy have all reached the level needed for industrial use, and it comprehensively outperforms end-to-end OCR applications based on traditional algorithms, which has been verified in many real application projects.
8. Microsoft OCR Library
Windows 8.1 and later versions include a built-in OCR engine that can be used on the desktop and on Windows Phone.
https://github.com/A9T9/Free-OCR-Software
Recommendation index:
9. ocropy
https://github.com/tmbarchive/ocropy
A training-based OCR engine. After training it can achieve higher accuracy than Tesseract. The project is younger than Tesseract and includes a layout analyzer called OCRopus.
Recommendation index:
10. ocrad
https://www.gnu.org/software/ocrad/
https://github.com/matiastucci/ionic-ocr-example
A command-line tool. A JavaScript port exists, so it can be used in the front end.
Recommendation index:
11. simple-ocr-opencv project
Simple but immature: a simple, Pythonic OCR engine built with OpenCV and NumPy.
https://github.com/goncalopp/simple-ocr-opencv
Recommendation index:
12. deep_ocr project
https://github.com/JinpengLI/deep_ocr
Advantages: Recognition is built on Caffe, and the code is much shorter than Tesseract's.
Disadvantages: Not very stable for now. Some semantic models still need to be added for optimization.
13. Free offline OCR: an offline Chinese text detection + recognition SDK
Advantages: Low error rate. It recognizes text correctly in most cases, preserves the layout of the original text, and supports vertical text recognition.
Recommendation index:
14. ocular
https://github.com/tberg12/ocular/
Advantages: Ocular is a state-of-the-art OCR system for historical documents.
Its main features are:
Unsupervised learning of unknown fonts: only document images and a text corpus are needed.
Ability to handle noisy documents: inconsistent inking, spacing, vertical alignment, and so on.
Support for multilingual documents, including documents with a large amount of word-level code-switching.
Unsupervised learning of orthographic variation patterns, including archaic spellings and printer shorthand.
Simultaneous, joint transcription into both diplomatic (literal) and normalized forms.
Two. Summary
In summary, among the existing open-source OCR engines, Calamari offers relatively high recognition accuracy, a relatively low error rate, fast recognition, and good overall recognition performance, so it is worth a try.
Copyright notice
This article was written by [_ Carpediem]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204210553546368.html