Specificity-preserving RGB-D Saliency Detection

Last update: Jan 08, 2023

Authors: Tao Zhou, Huazhu Fu, Geng Chen, Yi Zhou, Deng-Ping Fan, and Ling Shao.

1. Preface

This repository provides the code for "Specificity-preserving RGB-D Saliency Detection" (ICCV 2021).

2. Overview

2.1. Introduction

RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, with few methods explicitly considering how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which benefits saliency detection performance by exploring both the shared information and modality-specific properties (e.g., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. Besides, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, a skip connection is used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
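To make the idea of cross-modal fusion more concrete, here is a toy sketch in plain Python. It is not the authors' CIM: the gating formula, the product-plus-maximum fusion, and the function names `cross_enhance` and `fuse` are all assumptions chosen for illustration, and a real implementation would operate on PyTorch feature maps with learned convolutions rather than flat lists.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a soft gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cross_enhance(target, other):
    """Gate `target` by the other modality and add the result back residually.

    Hypothetical stand-in for a learned cross-modal attention step.
    """
    return [t + t * sigmoid(o) for t, o in zip(target, other)]


def fuse(rgb_feat, depth_feat):
    """Toy fusion of two flat feature vectors (illustrative assumption only)."""
    rgb_enh = cross_enhance(rgb_feat, depth_feat)
    depth_enh = cross_enhance(depth_feat, rgb_feat)
    # Element-wise product keeps what both modalities agree on;
    # element-wise maximum keeps the stronger response of either one.
    product = [r * d for r, d in zip(rgb_enh, depth_enh)]
    maximum = [max(r, d) for r, d in zip(rgb_enh, depth_enh)]
    return product + maximum  # list concatenation


fused = fuse([1.0, 0.0], [0.0, 2.0])
print(fused)
```

The output has twice the input length because the two fused views are concatenated; each modality's enhanced features remain available separately, which loosely mirrors the goal of keeping modality-specific information alongside the shared representation.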

2.2. Framework Overview

Figure 1: The overall architecture of the proposed SP-Net.

2.3. Quantitative Results

2.4. Qualitative Results

Figure 2: Visual comparisons of our method and eight state-of-the-art methods.

3. Proposed Baseline

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch on a single NVIDIA Tesla V100 GPU with 32 GB of memory.

Configuring your environment (prerequisites):
- Install the necessary packages: pip install -r requirements.txt.
Downloading the necessary data:
- Download the training dataset (download link (Google Drive)) and move it into ./Data/.
- Download the testing dataset (download link (Google Drive)) and move it into ./Data/.
- Download the pretrained weights (download link (Google Drive)) and move them into ./Checkpoint/SPNet/.
Training configuration:
- After downloading the training dataset, run train.py to train the model.
Testing configuration:
- After downloading the pretrained model and the testing dataset, run test_produce_maps.py to generate the final prediction maps, then run test_evaluation_maps.py to obtain the final quantitative results.
- Alternatively, download the predicted saliency maps (download link (Google Drive)), move them into ./Predict_maps/, and then run test_evaluation_maps.py.
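Before running the scripts, it can help to confirm that the directories named in the steps above exist. The following is a minimal sketch, not part of the repository: the helper `missing_dirs` is hypothetical, and it only checks that the directories are present, since the expected contents of each one are not specified here.

```python
from pathlib import Path

# Directory names are taken from the download steps; their required
# contents are not specified in this README, so only existence is checked.
EXPECTED_DIRS = ["Data", "Checkpoint/SPNet", "Predict_maps"]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Note that ./Predict_maps/ is only needed if you evaluate downloaded or previously generated prediction maps.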

3.2. Evaluating your trained model

Our evaluation is implemented in Python; please refer to test_evaluation_maps.py.
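As an illustration of what a saliency evaluation metric looks like, here is a minimal sketch of mean absolute error (MAE), a metric commonly reported in saliency detection. Whether test_evaluation_maps.py computes MAE, and whether it does so in this way, is an assumption; the function name and the list-based input format are hypothetical.

```python
def mean_absolute_error(pred, gt):
    """MAE between a predicted saliency map and a ground-truth mask.

    Both inputs are 2-D lists of equal shape with values in [0, 1].
    """
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("saliency maps must not be empty")
    return total / count


# Tiny 2x2 example: per-pixel errors are 0.5, 0.0, 0.0 and 1.0.
pred = [[0.5, 1.0], [0.0, 0.0]]
gt = [[1.0, 1.0], [0.0, 1.0]]
mae = mean_absolute_error(pred, gt)
print(mae)
```

Lower values indicate predictions closer to the ground truth; a perfect prediction gives 0.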

4. Citation

Please cite our paper if you find the work useful, thanks!

@inproceedings{zhouiccv2021,
	title={Specificity-preserving RGB-D Saliency Detection},
	author={Zhou, Tao and Fu, Huazhu and Chen, Geng and Zhou, Yi and Fan, Deng-Ping and Shao, Ling},
	booktitle={International Conference on Computer Vision (ICCV)},
	year={2021},
}

@article{zhoucvmj2022,
	title={Specificity-preserving RGB-D Saliency Detection},
	author={Zhou, Tao and Fan, Deng-Ping and Chen, Geng and Zhou, Yi and Fu, Huazhu},
	journal={Computational Visual Media},
	year={2022},
}
