Reproducing CRAFT with the Paddle framework

Last update: Mar 07, 2022

CRAFT-Paddle

CRAFT

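This project reproduces CRAFT on the PaddlePaddle framework as an entry in Baidu's third paper reproduction competition. An AIStudio link will be provided after the competition ends on May 15, 2021. Stay tuned.

Reference project: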

CRAFT: Character-Region Awareness For Text detection

Project setup

pip install -r requirements.txt

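You should have the following directories:

/home/aistudio/CRAFT (project directory)
/home/aistudio/Data (dataset files)

The dataset files are already mounted; you only need to extract them.

Training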

The code for training is not included in this repository, and we cannot release the full training code for IP reason.

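The original authors did not release the training code.

Weight conversion

The conversion uses the X2Paddle tool. The conversion code is shown below; see the X2Paddle documentation for usage details.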

from collections import OrderedDict

import numpy as np
import torch

from craft import CRAFT

def copyStateDict(state_dict):
    # If the first key starts with "module", drop that leading
    # component from every key so the names match a bare CRAFT model.
    if list(state_dict.keys())[0].startswith("module"):
        start_idx = 1
    else:
        start_idx = 0
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = ".".join(k.split(".")[start_idx:])
        new_state_dict[name] = v
    return new_state_dict

# Build an example input for tracing
input_data = np.random.rand(1, 3, 736, 1280).astype("float32")
net = CRAFT()
net.load_state_dict(copyStateDict(torch.load('craft_mlt_25k.pth')))
net = net.cuda()
net.eval()

# Run the conversion
from x2paddle.convert import pytorch2paddle
pytorch2paddle(net,
               save_dir="paddlemodel",
               jit_type="trace",
               input_examples=[torch.tensor(input_data).cuda()])
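
The copyStateDict helper is needed because checkpoints saved from a torch.nn.DataParallel wrapper prefix every parameter name with "module.". The sketch below reproduces the same key renaming on a plain dictionary so the behavior can be checked without PyTorch. The function name strip_module_prefix and the sample keys are hypothetical, and the guard for an empty dictionary is an addition that the original helper does not have.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict):
    """Drop the leading "module" component from every key, if present.

    As in copyStateDict, the decision is made from the first key only
    and then applied to all keys. An empty dictionary is returned as-is.
    """
    keys = list(state_dict.keys())
    start_idx = 1 if keys and keys[0].startswith("module") else 0
    renamed = OrderedDict()
    for key, value in state_dict.items():
        renamed[".".join(key.split(".")[start_idx:])] = value
    return renamed


# Hypothetical keys standing in for real checkpoint entries.
wrapped = OrderedDict([("module.conv.weight", 1), ("module.conv.bias", 2)])
plain = OrderedDict([("conv.weight", 1), ("conv.bias", 2)])

print(list(strip_module_prefix(wrapped).keys()))
print(list(strip_module_prefix(plain).keys()))
```

Both calls should print the same unprefixed key names, which is what load_state_dict expects for a model that is not wrapped in DataParallel.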

After the conversion finishes, you should see the following file tree:

/home/aistudio/CRAFT/paddlemodel
├── inference_model
│   ├── model.pdiparams
│   ├── model.pdiparams.info
│   └── model.pdmodel
├── model.pdparams
└── x2paddle_code.py
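
A quick way to confirm the conversion produced everything is to check for the files in the listing above. This is a minimal sketch, assuming the three model.* files sit inside inference_model as the listing suggests and that the script runs from /home/aistudio/CRAFT; the names EXPECTED_FILES and missing_outputs are hypothetical.

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
EXPECTED_FILES = [
    "inference_model/model.pdiparams",
    "inference_model/model.pdiparams.info",
    "inference_model/model.pdmodel",
    "model.pdparams",
    "x2paddle_code.py",
]


def missing_outputs(save_dir):
    """Return the expected files that are not present under save_dir."""
    root = Path(save_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs("paddlemodel")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected conversion outputs are present.")
```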

Convert refinenet in the same way.

Testing

Model download

Extraction code: 4yy1

AIStudio link

cd /home/aistudio/CRAFT
python test.py

Model name	Used datasets	Languages	Purpose	Model Link
General	SynthText, IC13, IC17	Eng + MLT	For general purpose	craft_mlt_25k
IC15	SynthText, IC15	Eng	For IC15 only	craft_ic15_20k
LinkRefiner	CTW1500	-	Used with the General Model	craft_refiner_CTW1500

The figure below shows actual test results.

Evaluation

Use the following commands to run the evaluation:

cd /home/aistudio/CRAFT
python eval.py
cd /home/aistudio/CRAFT/outputs/submit_ic15/
zip ../submit_ic15.zip *
cd /home/aistudio/CRAFT/eval
./eval_ic15.sh  # or: bash eval_ic15.sh
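
The zip step packs the files in outputs/submit_ic15/ into a flat archive. For environments where the zip command is unavailable, the following is a minimal Python sketch of the same packing step, assuming the result files sit directly in that folder. The function name zip_flat is hypothetical.

```python
import zipfile
from pathlib import Path


def zip_flat(src_dir, zip_path):
    """Pack every regular, non-hidden file directly inside src_dir into
    zip_path, stored by file name only (no directory prefix).

    Returns the list of file names that were written.
    """
    src = Path(src_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.iterdir()):
            # Skip subdirectories and dotfiles, which the shell glob
            # `*` would not match either.
            if path.is_file() and not path.name.startswith("."):
                archive.write(path, arcname=path.name)
                names.append(path.name)
    return names
```

For the layout used here, the call would be zip_flat("/home/aistudio/CRAFT/outputs/submit_ic15", "/home/aistudio/CRAFT/outputs/submit_ic15.zip"). Unlike the zip command, which updates an existing archive, this sketch overwrites it.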

Method	Dataset	Backbone	refiner	Precision (%)	Recall (%)	F-measure (%)	Model
basenet	ICDAR2015	VGG16_BN	N	82.2	77.9	80.0	craft_ic15_20k
basenet	ICDAR2015	VGG16_BN	N	85.1	79.4	82.2	craft_mlt_25k
basenet	ICDAR2015	VGG16_BN	Y	61.9	45.1	52.2	craft_ic15_20k
basenet	ICDAR2015	VGG16_BN	Y	63.1	43.3	51.4	craft_mlt_25k
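
The F-measure column is the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall values in the table above as a consistency check; the function name f_measure is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
rows = [(82.2, 77.9), (85.1, 79.4), (61.9, 45.1), (63.1, 43.3)]
for p, r in rows:
    print(f"P={p} R={r} F={f_measure(p, r):.1f}")
```

Rounded to one decimal place, the results match the reported F-measure values of 80.0, 82.2, 52.2, and 51.4.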

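To evaluate on the total_text dataset, see the evaluation code in the eval folder of my PSNET project.

About the author

Name	QuanHao Guo (郭权浩)
School	University of Electronic Science and Technology of China (UESTC), graduate student, 2020 intake
Research area	Computer vision
Homepage	Deep Hao's homepage
If you find any errors, please leave a comment so they can be corrected. Thank you very much!
More paper reproductions will follow. Questions and comments are welcome, so that we can learn and improve together!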

Owner

QuanHao Guo
