PaddleOCR image text extraction
2022-04-23 07:22:00 【Lin Jinpeng】
Requirement
This came up at work: I needed to extract the license plate number from a picture. As shown in the picture, the plate is at a fixed position in the image. I started with pytesseract, which handles Chinese particularly poorly; it was, after all, developed outside China. A colleague recommended PaddleOCR, which was developed in China and does noticeably better: recognition accuracy can exceed 90%. Both tools share some problems, though: white characters on a black background are not recognized, recognition is inaccurate when the text region is too small, and so on.
One. Crop the license plate region
import cv2
import numpy as np

# np.fromfile builds an array from the data in a text or binary file
# cv2.imdecode() decodes the data that was read into an image; it is mainly used to restore images from data received over a network
# cv2.IMREAD_UNCHANGED: read the image as is, including the alpha channel; -1 can be passed instead
img = cv2.imdecode(np.fromfile(imgSrc, dtype=np.uint8), cv2.IMREAD_UNCHANGED)
cropImg = img[y1:y2, x1:x2]  # rows (top:bottom) first, then columns (left:right)
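The slicing order is easy to get backwards, so here is a minimal sketch of the crop step as a helper. The function name `crop_region`, the clamping to the image bounds, and the blank demo image are assumptions added for illustration; they are not part of the original code.

```python
import numpy as np

def crop_region(img, x1, y1, x2, y2):
    """Return the sub-image between (x1, y1) and (x2, y2), clamped to the image."""
    height, width = img.shape[:2]
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    # NumPy images are indexed [row, column], i.e. [y, x]
    return img[y1:y2, x1:x2]

# Hypothetical 100x200 (height x width) BGR image standing in for the photo
demo = np.zeros((100, 200, 3), dtype=np.uint8)
plate = crop_region(demo, x1=50, y1=20, x2=150, y2=60)
print(plate.shape)  # (40, 100, 3)
```

Because the plate sits at a fixed position, the four coordinates can be measured once and reused for every picture.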
So I searched Baidu for a picture that happened to have black text on a white background, with the text in the middle, and it was recognized 100% correctly. After comparing the two, I concluded that white text could not be recognized, so I inverted the cropped image.
Two. Process the cropped license plate image
height, width, deep = cropImg.shape
gray = cv2.cvtColor(cropImg, cv2.COLOR_BGR2GRAY)  # cv2.COLOR_BGR2GRAY converts a BGR image to grayscale
dst = np.zeros((height, width, 1), np.uint8)  # create an all-black image
for i in range(0, height):  # invert: white-on-black becomes black-on-white
    for j in range(0, width):
        grayPixel = gray[i, j]
        dst[i, j] = 255 - grayPixel
# After this step the text is black on white, but the white background is not at full brightness
# Then binarize with cv2.threshold so that the dark parts become darker and the light parts whiter
ret, img = cv2.threshold(dst, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
The result of inverting the cropped image is shown below. Look closely and you will see that the white background is not pure white, so I binarized it with cv2.threshold (every pixel becomes either black or white). Compared with the inverted image, the binarized image separates the text from the background much more clearly.
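Both operations described here are per-pixel, so they can also be written as array expressions. The sketch below uses plain NumPy to show what each step does to the pixel values. The helper names and the fixed threshold of 127 are assumptions for illustration; the article's code lets Otsu's method pick the threshold instead.

```python
import numpy as np

def invert_gray(gray):
    """Invert an 8-bit grayscale image: 0 becomes 255 and vice versa."""
    return 255 - gray  # elementwise; the result stays uint8

def binarize(gray, thresh):
    """Pixels strictly above `thresh` become 255, all others become 0."""
    return np.where(gray > thresh, 255, 0).astype(np.uint8)

# Tiny hypothetical grayscale patch: dark background, light text
gray = np.array([[10, 200], [90, 250]], dtype=np.uint8)
inverted = invert_gray(gray)              # [[245, 55], [165, 5]]
binary = binarize(inverted, thresh=127)   # [[255, 0], [255, 0]]
print(inverted.tolist(), binary.tolist())
```

On a real image the array form avoids the Python-level double loop, which matters once the crop is more than a few hundred pixels wide.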
I ran recognition again and finally got a result, but the last character, 7, was recognized as 2. Such a simple character should not be misread. Then I remembered that the test picture that had been recognized correctly had wide margins with the text in the middle, so I padded the cropped image with a 150-pixel white border on every side.
Three. Pad the border
# cv2.BORDER_CONSTANT pads with a fixed value (white here), 150 pixels on each of the four sides
imgsrc = cv2.copyMakeBorder(img, 150, 150, 150, 150, cv2.BORDER_CONSTANT, value=[255, 255, 255])
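To see what the padding does to the image size, here is a small sketch that uses `np.pad` in place of `cv2.copyMakeBorder`. This is a swapped-in technique, not the article's call. The helper name `pad_white` and the blank demo plate are hypothetical, and the sketch assumes a single-channel (grayscale or binarized) image.

```python
import numpy as np

def pad_white(gray, margin=150):
    """Surround a single-channel image with a white border of `margin` pixels."""
    return np.pad(gray, margin, mode="constant", constant_values=255)

# Hypothetical 40x100 binarized plate, all black for the demo
plate = np.zeros((40, 100), dtype=np.uint8)
padded = pad_white(plate)
print(padded.shape)  # (340, 400)
```

The plate now sits in the middle of a much larger white canvas, which matches the layout of the test picture that was recognized correctly.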
Four. Recognition
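from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, use_gpu=False)  # load the models on the CPU; no GPU is used
text = ocr.ocr(imgsrc, cls=True)  # run OCR on the padded image from the previous step
# text[0] is the first detected line, laid out as [box, (string, confidence)]; newer PaddleOCR releases nest the result one level deeper
result = str(text[0][1][0]).replace(' license plate :', '').upper()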
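Indexing `text[0][1][0]` directly fails when nothing is detected. Below is a minimal sketch of a more defensive extraction, assuming each entry has the layout `[box, (string, confidence)]`. The helper name `extract_plate`, the choice of the highest-confidence line (the article's code simply takes the first line), and the sample data are all hypothetical.

```python
def extract_plate(ocr_lines, prefix=""):
    """Return the highest-confidence text without `prefix`, upper-cased, or None."""
    if not ocr_lines:
        return None
    # Each entry is assumed to be [box, (text, confidence)]
    _box, (text, _conf) = max(ocr_lines, key=lambda line: line[1][1])
    if prefix:
        text = text.replace(prefix, "")
    return text.strip().upper()

# Made-up result in the assumed layout, not real PaddleOCR output
sample = [
    [[[0, 0], [90, 0], [90, 30], [0, 30]], ("Plate: ab123c", 0.97)],
    [[[0, 40], [60, 40], [60, 60], [0, 60]], ("noise", 0.41)],
]
print(extract_plate(sample, prefix="Plate:"))  # AB123C
```

Returning `None` for an empty result lets the caller decide whether to retry with different preprocessing.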
Copyright notice
This article was written by [Lin Jinpeng]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230609472920.html