Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Last update: Jan 05, 2023

Related tags

Deep Learning Polyp-PVT

Overview

Polyp-PVT

by Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, & Ling Shao.

This repo is the official implementation of "Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers".

1. Introduction

Polyp-PVT is initially described in arxiv.

Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features; and 2) designing effective mechanism for fusing these features. Different from existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and elusive properties of polyps, we introduce three novel modules, including a cascaded fusion module (CFM), a camouflage identification module (CIM), a and similarity aggregation module (SAM). Among these, the CFM is used to collect the semantic and location information of polyps from high-level features, while the CIM is applied to capture polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT , effectively suppresses noises in the features and significantly improves their expressive capabilities.

Polyp-PVT achieves strong performance on image-level polyp segmentation (0.808 mean Dice and 0.727 mean IoU on ColonDB) and video polyp segmentation (0.880 mean dice and 0.802 mean IoU on CVC-300-TV), surpassing previous models by a large margin.

2. Framework Overview

3. Results

3.1 Image-level Polyp Segmentation

3.2 Image-level Polyp Segmentation Compared Results:

We also provide some result of baseline methods, You could download from Google Drive/Baidu Drive [code:nhhv], including our results and that of compared models.

3.3 Video Polyp Segmentation

3.4 Video Polyp Segmentation Compared Results:

We also provide some result of baseline methods, You could download from Google Drive/Baidu Drive [code:33ie], including our results and that of compared models.

4. Usage:

4.1 Recommended environment:

Python 3.8
Pytorch 1.7.1
torchvision 0.8.2

4.2 Data preparation:

Downloading training and testing datasets and move them into ./dataset/, which can be found in this Google Drive/Baidu Drive [code:dr1h].

4.3 Pretrained model:

You should download the pretrained model from Google Drive/Baidu Drive [code:w4vk], and then put it in the './pretrained_pth' folder for initialization.

4.4 Training:

Clone the repository:

git clone https://github.com/DengPingFan/Polyp-PVT.git
cd Polyp-PVT 
bash train.sh

4.5 Testing:

cd Polyp-PVT 
bash test.sh

4.6 Evaluating your trained model:

Matlab: Please refer to the work of MICCAI2020 (link).

Python: Please refer to the work of ACMMM2021 (link).

Please note that we use the Matlab version to evaluate in our paper.

4.7 Well trained model:

You could download the trained model from Google Drive/Baidu Drive [code:9rpy] and put the model in directory './model_pth'.

4.8 Pre-computed maps:

Google Drive/Baidu Drive [code:x3jc]

5. Citation:

@aticle{dong2021PolypPVT,
  title={Polyp-PVT: Polyp Segmentation with PyramidVision Transformers},
  author={Bo, Dong and Wenhai, Wang and Deng-Ping, Fan and Jinpeng, Li and Huazhu, Fu and Ling, Shao},
  journal={arXiv preprint arXiv:2108.06932},
  year={2021}
}

6. Acknowledgement

We are very grateful for these excellent works PraNet, EAGRNet and MSEG, which have provided the basis for our framework.

7. FAQ:

If you want to improve the usability or any piece of advice, please feel free to contact me directly ([email protected]).

8. License

The source code is free for research and education use only. Any comercial use should get formal permission first.

Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Related tags

Overview

Polyp-PVT

1. Introduction

2. Framework Overview

3. Results

3.1 Image-level Polyp Segmentation

3.2 Image-level Polyp Segmentation Compared Results:

3.3 Video Polyp Segmentation

3.4 Video Polyp Segmentation Compared Results:

4. Usage:

4.1 Recommended environment:

4.2 Data preparation:

4.3 Pretrained model:

4.4 Training:

4.5 Testing:

4.6 Evaluating your trained model:

4.7 Well trained model:

4.8 Pre-computed maps:

5. Citation:

6. Acknowledgement

7. FAQ:

8. License

Owner

Deng-Ping Fan

Faster Convex Lipschitz Regression

The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.

ICON: Implicit Clothed humans Obtained from Normals (CVPR 2022)

Official PyTorch implementation of the NeurIPS 2021 paper StyleGAN3

A simple algorithm for extracting tree height in sparse scene from point cloud data.

ArtEmis: Affective Language for Art

Automatically creates genre collections for your Plex media

Code for GNMR in ICDE 2021

Navigating StyleGAN2 w latent space using CLIP

JORLDY an open-source Reinforcement Learning (RL) framework provided by KakaoEnterprise

2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

PyTorch implementation of PNASNet-5 on ImageNet

Contenido del curso Bases de datos del DCC PUC versión 2021-2

An unofficial styleguide and best practices summary for PyTorch

Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models

Robust Self-augmentation for NER with Meta-reweighting

Implementations for the ICLR-2021 paper: SEED: Self-supervised Distillation For Visual Representation.

The Official PyTorch Implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 spotlight paper)