基于Paddle框架的PSENet复现

Overview

PSENet-Paddle

基于Paddle框架的PSENet复现

本项目基于paddlepaddle框架复现PSENet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待

AIStudio链接

参考项目:

whai362-PSENet

环境配置

本项目利用AIstudio平台,采用paddlepaddle: 2.0.2-gpu Version,除此之外你需要通过pip install mmcv editdistance Polygon3 pyclipper或者pip install -r requirement.txt来安装依赖包

数据集

本项目已搭载PSENet比赛指定数据集,你可以在此找到搭载的数据集,包含ICDAR2015 Task4以及Total-Text

工程目录

注意到你需要将submitPSENet重命名为PSENet

/home/aistudio/PSENet
|───data(解压的data.zip)
└───config
└───models
└───dataset
└───eval
└───utils
└───compile.sh
└───__init__.py
└───test.py
└───train.py
└───requirement.txt
└───logo.gif

项目配置**

注意:由于aistudio的docker环境并不适配本项目的编译,所以你需要在本地计算机编译完成后上传编译文件,在本地计算机我才用如下配置,你可以使用gcc --versiong++ --version查看配置

AIStudio Local PC
gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

可以发现AIStudio的g++版本不适配,注意:你需要相同的架构,系统以及python版本,(Ubuntu)linux-x86_64&python3.7

`./compile.sh` or `bash compile.sh` if come out bash: ./compile.sh: Permission denied

或者直接进入指定目录,手动编译

cd /home/aistudio/PSENet/models/post_processing/pse
python setup.py build_ext --inplace

编译完成后你会在/home/aistudio/PSENet/models/post_processing/pse得到build/temp.linux-x86_64-3.7/pse.o文件和pse.cpython-37m-x86_64-linux-gnu.so文件

注意:本项目已经全部配置完成,这一步无需操作

训练

需要注意的是,在paddlepaddle-2.0.2中并不支持字典数据读取,因此我在/home/aistudio/PSENet/utils/data_loader.py利用迭代器重写了DataLoader这拉慢了数据读取的速度,会导致训练速度略慢,例如在使用psenet_r50_ic15_1024_finetune.py训练一个epoch需要512.4秒,另外paddlepaddle2.0.2暂不支持Identity方法,因此我在/home/aistudio/PSENet/models/utils/fuse_conv_bn.py通过继承Paddle.nn.Layer写了Identity

cd /home/aistudio/PSENet/
python train.py ${CONFIG_FILE}

例如:

cd /home/aistudio/PSENet/
python train.py config/psenet/psenet_r50_ic15_736.py

训练开启时,会生成一个类似/home/aistudio/PSENet/checkpoints/psenet_r50_ic15_1024_finetune的文件夹,里面将保存权重和优化器参数

测试

cd /home/aistudio/PSENet/
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}

例如:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams

评估

你需要注意的是:测试和评估是递进的,通过测试生成文件后,进行评估

ICDAR 2015

cd /home/aistudio/PSENet/eval
`./eval_ic15.sh` or `bash ./eval_ic15.sh`

你会得到如下类似信息:

Calculated!{"precision": 0.8620689655172413, "recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}

以下是paddlepaddle预训练模型测试指标

Method Backbone Fine-tuning Scale Config Precision (%) Recall (%) F-measure (%) Model
PSENet ResNet50 N Shorter Side: 736 psenet_r50_ic15_736.py 83.6 74.0 78.5 checkpoint_ic15_736
PSENet ResNet50 N Shorter Side: 1024 psenet_r50_ic15_1024.py 84.4 76.3 80.2 checkpoint_ic15_1024
PSENet ResNet50 Y Shorter Side: 736 psenet_r50_ic15_736_finetune.py 85.3 76.8 80.9 checkpoint_ic15_736_finetune
PSENet ResNet50 Y Shorter Side: 1024 psenet_r50_ic15_1024_finetune.py 86.2 79.4 82.7 checkpoint_ic15_1024_finetune

Total-Text

Text detection

cd /home/aistudio/PSENet/eval
./eval_tt.sh or `bash ./eval_tt.sh`

你会得到如下类似信息:

Precision:_0.8727937336814604_______/Recall:_0.7786751361161512/Hmean:_0.8230524859472805

pb

以下是paddlepaddle预训练模型测试指标

Method Backbone Fine-tuning Config Precision (%) Recall (%) F-measure (%) Model
PSENet ResNet50 N psenet_r50_tt.py 87.3 77.9 82.3 checkpoint_tt
PSENet ResNet50 Y psenet_r50_tt_finetune.py 89.3 79.6 84.2 checkpoint_tt_finetune

速度测试

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

例如:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams --report_speed

你会得到如下类似信息

Testing 283/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 284/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 285/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 286/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8

Citation

@inproceedings{wang2019shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}
Owner
QuanHao Guo
master at UESTC
QuanHao Guo
Zoom , GoogleMeets에서 Vtuber 데뷔하기

EasyVtuber Facial landmark와 GAN을 이용한 Character Face Generation Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요! 악세사리는 어느정도 추가해도 잘 작동해요! 안타깝게도 RTX 2070

Gunwoo Han 140 Dec 23, 2022
Augmenting Anchors by the Detector Itself

Augmenting Anchors by the Detector Itself Introduction It is difficult to determine the scale and aspect ratio of anchors for anchor-based object dete

4 Nov 06, 2022
pyntcloud is a Python library for working with 3D point clouds.

pyntcloud is a Python library for working with 3D point clouds.

David de la Iglesia Castro 1.2k Jan 07, 2023
The code for “Oriented RepPoints for Aerail Object Detection”

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints”, Under review. (arXiv preprint) Introduction Or

WentongLi 207 Dec 24, 2022
A webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV.

Qbr Qbr, pronounced as Cuber, is a webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV. 🌈 Accurate color detection 🔍 Accurate 3x3x

Kim 金可明 502 Dec 29, 2022
Automatically remove the mosaics in images and videos, or add mosaics to them.

Automatically remove the mosaics in images and videos, or add mosaics to them.

Hypo 1.4k Dec 30, 2022
Perspective recovery of text using transformed ellipses

unproject_text Perspective recovery of text using transformed ellipses. See full writeup at https://mzucker.github.io/2016/10/11/unprojecting-text-wit

Matt Zucker 111 Nov 13, 2022
question‘s area recognition using image processing and regular expression

======================================== Paper-Question-recognition ======================================== question‘s area recognition using image p

Yuta Mizuki 7 Dec 27, 2021
~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.

cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent

Chad Oliver 45 Dec 06, 2022
This repository contains codes on how to handle mouse event using OpenCV

Handling-Mouse-Click-Events-Using-OpenCV This repository contains codes on how t

Happy N. Monday 3 Feb 15, 2022
Learn computer graphics by writing GPU shaders!

This repo contains a selection of projects designed to help you learn the basics of computer graphics. We'll be writing shaders to render interactive two-dimensional and three-dimensional scenes.

Eric Zhang 1.9k Jan 02, 2023
Computer vision applications project (Flask and OpenCV)

Computer Vision Applications Project This project is at it's initial phase. This is all about the implementation of different computer vision techniqu

Suryam Thapa 1 Jan 26, 2022
Handwritten Character Recognition using CNN

Handwritten Character Recognition using CNN Problem Definition The main objective of this project is to solve the problem of handwritten character rec

Mohit Kaushik 4 Mar 02, 2022
A simple demo program for using OpenCV on Android

Kivy OpenCV Demo A simple demo program for using OpenCV on Android Build with: buildozer android debug deploy run Run (on desktop) with: python main.p

Andrea Ranieri 13 Dec 29, 2022
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition PDF Abstract Explainable artificial intelligence has been gaining attention

87 Dec 26, 2022
The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

Mask TextSpotter A Pytorch implementation of Mask TextSpotter along with its extension can be find here Introduction This is the official implementati

Pengyuan Lyu 261 Nov 21, 2022
MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition Python 2.7 Python 3.6 MORAN is a network with rectification mechanism for

Canjie Luo 595 Dec 27, 2022
OCR powered screen-capture tool to capture information instead of images

NormCap OCR powered screen-capture tool to capture information instead of images. Links: Repo | PyPi | Releases | Changelog | FAQs Content: Quickstart

575 Dec 31, 2022
Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)

Optical Character Recognition OCR (Optical Character Recognition) is a technology that enables the conversion of document types such as scanned paper

21 Dec 25, 2022
Fun program to overlay a mask to yourself using a webcam

Superhero Mask Overlay Description Simple project made for fun. It consists of placing a mask (a PNG image with transparent background) on your face.

KB Kwan 10 Dec 01, 2022