Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Last update: Dec 21, 2022

Related tags

Overview

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

This repo is the official implementation of "DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion"

by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.

Prerequisites

Ubuntu 18
PyTorch 1.7.0
CUDA 10.1
Cudnn 7.5.1
Python 3.7
Numpy 1.17.3

Training

Please see launch_train.sh and launch_pretrain.sh for imagenet pretraining and sod training, respectively.

Testing

Please see launch_test.sh for testing on the sod benchmarks.

Main Results

Dataset	E_r	S_λ^mean	F_β^mean	M
DUT-RGBD	0.950	0.921	0.926	0.030
NJUD	0.923	0.903	0.901	0.039
NLPR	0.950	0.918	0.897	0.024
SSD	0.904	0.876	0.852	0.045
STEREO	0.933	0.904	0.898	0.036
LFSD	0.923	0.882	0.882	0.054
RGBD135	0.962	0.920	0.896	0.021

Saliency maps and Evaluation

All of the saliency maps mentioned in the paper are available on GoogleDrive or BaiduYun(code:juc2).

You can use the toolbox provided by jiwei0921 for evaluation.

Additionally, we also provide the saliency maps of the STERE-1000 and SIP dataset on BaiduYun(code:qxfw) for easy comparison.

Dataset	E_r	S_λ^mean	F_β^mean	M
STERE-1000	0.928	0.897	0.895	0.038
SIP	0.908	0.861	0.868	0.057

Citation

@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={P. Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  journal={IEEE Conf. Comput. Vis. Pattern Recog.},
  year={2021}
}

License

The code is released under MIT License (see LICENSE file for details).

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Related tags

Overview

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Prerequisites

Training

Testing

Main Results

Saliency maps and Evaluation

Citation

License

Owner

如今我已剑指天涯

Example how to deploy deep learning model with aiohttp.

Unit-Convertor - Unit Convertor Built With Python

MIMIC Code Repository: Code shared by the research community for the MIMIC-III database

The source code for CATSETMAT: Cross Attention for Set Matching in Bipartite Hypergraphs

Build a small, 3 domain internet using Github pages and Wikipedia and construct a crawler to crawl, render, and index.

An OpenAI-Gym Package for Training and Testing Reinforcement Learning algorithms with OpenSim Models

Code for SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes (NeurIPS 2021)

Transfer Learning library for Deep Neural Networks.

Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.

Implementation of the method proposed in the paper "Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation"

Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch

Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433

A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.

Trained on Simulated Data, Tested in the Real World

《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

Experiments with Fourier layers on simulation data.

A demonstration of using a live Tensorflow session to create an interactive face-GAN explorer.

This is an official implementation for "AS-MLP: An Axial Shifted MLP Architecture for Vision".

FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective

Reinforcement learning library in JAX.