基于Paddle框架的PSENet复现

Overview

PSENet-Paddle

基于Paddle框架的PSENet复现

本项目基于paddlepaddle框架复现PSENet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待

AIStudio链接

参考项目:

whai362-PSENet

环境配置

本项目利用AIstudio平台,采用paddlepaddle: 2.0.2-gpu Version,除此之外你需要通过pip install mmcv editdistance Polygon3 pyclipper或者pip install -r requirement.txt来安装依赖包

数据集

本项目已搭载PSENet比赛指定数据集,你可以在此找到搭载的数据集,包含ICDAR2015 Task4以及Total-Text

工程目录

注意到你需要将submitPSENet重命名为PSENet

/home/aistudio/PSENet
|───data(解压的data.zip)
└───config
└───models
└───dataset
└───eval
└───utils
└───compile.sh
└───__init__.py
└───test.py
└───train.py
└───requirement.txt
└───logo.gif

项目配置**

注意:由于aistudio的docker环境并不适配本项目的编译,所以你需要在本地计算机编译完成后上传编译文件,在本地计算机我才用如下配置,你可以使用gcc --versiong++ --version查看配置

AIStudio Local PC
gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

可以发现AIStudio的g++版本不适配,注意:你需要相同的架构,系统以及python版本,(Ubuntu)linux-x86_64&python3.7

`./compile.sh` or `bash compile.sh` if come out bash: ./compile.sh: Permission denied

或者直接进入指定目录,手动编译

cd /home/aistudio/PSENet/models/post_processing/pse
python setup.py build_ext --inplace

编译完成后你会在/home/aistudio/PSENet/models/post_processing/pse得到build/temp.linux-x86_64-3.7/pse.o文件和pse.cpython-37m-x86_64-linux-gnu.so文件

注意:本项目已经全部配置完成,这一步无需操作

训练

需要注意的是,在paddlepaddle-2.0.2中并不支持字典数据读取,因此我在/home/aistudio/PSENet/utils/data_loader.py利用迭代器重写了DataLoader这拉慢了数据读取的速度,会导致训练速度略慢,例如在使用psenet_r50_ic15_1024_finetune.py训练一个epoch需要512.4秒,另外paddlepaddle2.0.2暂不支持Identity方法,因此我在/home/aistudio/PSENet/models/utils/fuse_conv_bn.py通过继承Paddle.nn.Layer写了Identity

cd /home/aistudio/PSENet/
python train.py ${CONFIG_FILE}

例如:

cd /home/aistudio/PSENet/
python train.py config/psenet/psenet_r50_ic15_736.py

训练开启时,会生成一个类似/home/aistudio/PSENet/checkpoints/psenet_r50_ic15_1024_finetune的文件夹,里面将保存权重和优化器参数

测试

cd /home/aistudio/PSENet/
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}

例如:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams

评估

你需要注意的是:测试和评估是递进的,通过测试生成文件后,进行评估

ICDAR 2015

cd /home/aistudio/PSENet/eval
`./eval_ic15.sh` or `bash ./eval_ic15.sh`

你会得到如下类似信息:

Calculated!{"precision": 0.8620689655172413, "recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}

以下是paddlepaddle预训练模型测试指标

Method Backbone Fine-tuning Scale Config Precision (%) Recall (%) F-measure (%) Model
PSENet ResNet50 N Shorter Side: 736 psenet_r50_ic15_736.py 83.6 74.0 78.5 checkpoint_ic15_736
PSENet ResNet50 N Shorter Side: 1024 psenet_r50_ic15_1024.py 84.4 76.3 80.2 checkpoint_ic15_1024
PSENet ResNet50 Y Shorter Side: 736 psenet_r50_ic15_736_finetune.py 85.3 76.8 80.9 checkpoint_ic15_736_finetune
PSENet ResNet50 Y Shorter Side: 1024 psenet_r50_ic15_1024_finetune.py 86.2 79.4 82.7 checkpoint_ic15_1024_finetune

Total-Text

Text detection

cd /home/aistudio/PSENet/eval
./eval_tt.sh or `bash ./eval_tt.sh`

你会得到如下类似信息:

Precision:_0.8727937336814604_______/Recall:_0.7786751361161512/Hmean:_0.8230524859472805

pb

以下是paddlepaddle预训练模型测试指标

Method Backbone Fine-tuning Config Precision (%) Recall (%) F-measure (%) Model
PSENet ResNet50 N psenet_r50_tt.py 87.3 77.9 82.3 checkpoint_tt
PSENet ResNet50 Y psenet_r50_tt_finetune.py 89.3 79.6 84.2 checkpoint_tt_finetune

速度测试

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

例如:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams --report_speed

你会得到如下类似信息

Testing 283/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 284/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 285/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 286/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8

Citation

@inproceedings{wang2019shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}
Owner
QuanHao Guo
master at UESTC
QuanHao Guo
computer vision, image processing and machine learning on the web browser or node.

Image processing and Machine learning labs   computer vision, image processing and machine learning on the web browser or node note Fast Fourier Trans

ryohei tanaka 487 Nov 11, 2022
FOTS Pytorch Implementation

News!!! Recognition branch now is added into model. The whole project has beed optimized and refactored. ICDAR Dataset SynthText 800K Dataset detectio

Ning Lu 599 Dec 19, 2022
A community-supported supercharged version of paperless: scan, index and archive all your physical documents

Paperless-ngx Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep,

5.2k Jan 04, 2023
([email protected]) Boosting Co-teaching with Compression Regularization for Label Noise

Nested-Co-teaching ([email protected]) Pytorch implementation of paper "Boosting Co-tea

YINGYI CHEN 41 Jan 03, 2023
Handwritten_Text_Recognition

Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi

24 Jul 15, 2022
Line based ATR Engine based on OCRopy

OCR Engine based on OCRopy and Kraken using python3. It is designed to both be easy to use from the command line but also be modular to be integrated

948 Dec 23, 2022
Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

SA-AutoAug Scale-aware Automatic Augmentation for Object Detection Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia [Paper] [Bi

Jia Research Lab 182 Dec 29, 2022
ocroseg - This is a deep learning model for page layout analysis / segmentation.

ocroseg This is a deep learning model for page layout analysis / segmentation. There are many different ways in which you can train and run it, but by

NVIDIA Research Projects 71 Dec 06, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
Perspective recovery of text using transformed ellipses

unproject_text Perspective recovery of text using transformed ellipses. See full writeup at https://mzucker.github.io/2016/10/11/unprojecting-text-wit

Matt Zucker 111 Nov 13, 2022
A post-processing tool for scanned sheets of paper.

unpaper Originally written by Jens Gulden — see AUTHORS for more information. Licensed under GNU GPL v2 — see COPYING for more information. Overview u

27 Dec 07, 2022
Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Sergio Díaz Fernández 1 Jan 13, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
Application that instantly translates sign-language to letters.

Sign Language Translator Project Description The main purpose of project is translating sign-language to letters. In accordance with this purpose we d

3 Sep 29, 2022
PyNeuro is designed to connect NeuroSky's MindWave EEG device to Python and provide Callback functionality to provide data to your application in real time.

PyNeuro PyNeuro is designed to connect NeuroSky's MindWave EEG device to Python and provide Callback functionality to provide data to your application

Zach Wang 45 Dec 30, 2022
A simple component to display annotated text in Streamlit apps.

Annotated Text Component for Streamlit A simple component to display annotated text in Streamlit apps. For example: Installation First install Streaml

Thiago Teixeira 312 Dec 30, 2022
Amazing 3D explosion animation using Pygame module.

3D Explosion Animation 💣 💥 🔥 Amazing explosion animation with Pygame. 💣 Explosion physics An Explosion instance is made of a set of Particle objec

Dylan Tintenfich 12 Mar 11, 2022
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022
Text to QR-CODE

QR CODE GENERATO USING PYTHON Author : RAFIK BOUDALIA. Installation Use the package manager pip to install foobar. pip install pyqrcode Usage from tki

Rafik Boudalia 2 Oct 13, 2021
Hiiii this is the Spanish for Linux and win 10 and in the near future the english version of PortScan my new tool on which you can see what ports are Open only with the IP adress.

PortScanner-by-IIT PortScanner es una herramienta programada en Python3. Como su nombre indica esta herramienta escanea los primeros 150 puertos de re

5 Sep 19, 2022