The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

Related tags

Deep LearningGCoNet
Overview

GCoNet

The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

Trained model

Download final_gconet.pth (Google Drive). And it is the training log.

Put final_gconet.pth at GCoNet/tmp/GCoNet_run1.

Run test.sh for evaluation.

Data Format

Put the DUTS_class (training dataset from GICD), CoCA, CoSOD3k and Cosal2015 datasets to GCoNet/data as the following structure:

GCoNet
   ├── other codes
   ├── ...
   │ 
   └── data
         ├──── images
         |       ├── DUTS_class (DUTS_class's image files)
         |       ├── CoCA (CoCA's image files)
         |       ├── CoSOD3k (CoSOD3k's image files)
         │       └── Cosal2015 (Cosal2015's image files)
         │ 
         └────── gts
                  ├── DUTS_class (DUTS_class's Groundtruth files)
                  ├── CoCA (CoCA's Groundtruth files)
                  ├── CoSOD3k (CoSOD3k's Groundtruth files)
                  └── Cosal2015 (Cosal2015's Groundtruth files)

Usage

Run sh all.sh for training (train_GPU0.sh) and testing (test.sh).

Prediction results

The co-saliency maps of GCoNet can be found at Google Drive.

Note and Discussion

In your training, you can usually obtain slightly worse performance on CoCA dataset and slightly better perofmance on Cosal2015 and CoSOD3k datasets. The performance fluctuation is around 1.0 point for Cosal2015 and CoSOD3k datasets and around 2.0 points for CoCA dataset.

We observed that the results on CoCA dataset are unstable when train the model multiple times, and the performance fluctuation can reach around 1.5 ponits (But our performance are still much better than other methods in the worst case).
Therefore, we provide our used training pairs and sequences with deterministic data augmentation to help you to reproduce our results on CoCA. (In different machines, these inputs and data augmentation are different but deterministic.) However, there is still randomness in the training stage, and you can obtain different performance on CoCA.

There are three possible reasons:

  1. It may be caused by the challenging images of CoCA dataset where the target objects are relative small and there are many non-target objects in a complex environment.
  2. The imperfect training dataset. We use the training dataset in GICD, whose labels are produced by the classification model. There are some noisy labels in the training dataset.
  3. The randomness of training groups. In our training, two groups are randomly picked for training. Different collaborative training groups have different training difficulty.

Possible research directions for performance stability:

  1. Reduce label noise. If you want to use the training dataset in GICD to train your model. It is better to use multiple powerful classification models (ensemble) to obtain better class labels.
  2. Deterministic training groups. For two collaborative image groups, you can explore different ways to pick the suitable groups, e.g., pick two most similar groups for hard example mining.

It is a potential research direction to obtain stable results on such challenging real-world images. We follow other CoSOD methods to report the best performance of our model. You need to train the model multiple times to obtain the best result on CoCA dataset. If you want more discussion about it, you can contact me ([email protected]).

Citation

@inproceedings{fan2021gconet,
title={Group Collaborative Learning for Co-Salient Object Detection},
author={Fan, Qi and Fan, Deng-Ping and Fu, Huazhu and Tang, Chi-Keung and Shao, Ling and Tai, Yu-Wing},
booktitle={CVPR},
year={2021}
}

Acknowledgements

Zhao Zhang gives us lots of helps! Our framework is built on his GICD.

Owner
Qi Fan
Qi Fan
Hso-groupie - A pwnable challenge in Real World CTF 4th

Hso-groupie - A pwnable challenge in Real World CTF 4th

Riatre Foo 42 Dec 05, 2022
LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021

LoFTR-with-train-script LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021 (with train script --- unofficial ---). About Megadepth

Nan Xiaohu 15 Nov 04, 2022
Prototype-based Incremental Few-Shot Semantic Segmentation

Prototype-based Incremental Few-Shot Semantic Segmentation Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo -- BMVC 20

Fabio Cermelli 21 Dec 29, 2022
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"

SMCG Code for the paper "Controllable Video Captioning with an Exemplar Sentence" Introduction We investigate a novel and challenging task, namely con

10 Dec 04, 2022
A best practice for tensorflow project template architecture.

A best practice for tensorflow project template architecture.

Mahmoud Gamal Salem 3.6k Dec 22, 2022
RealFormer-Pytorch Implementation of RealFormer using pytorch

RealFormer-Pytorch Implementation of RealFormer using pytorch. Includes comparison with classical Transformer on image classification task (ViT) wrt C

Simo Ryu 90 Dec 08, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
"Segmenter: Transformer for Semantic Segmentation" reproduced via mmsegmentation

Segmenter-based-on-OpenMMLab "Segmenter: Transformer for Semantic Segmentation, arxiv 2105.05633." reproduced via mmsegmentation. We reproduce Segment

EricKani 22 Feb 24, 2022
Official Python implementation of the 'Sparse deconvolution'-v0.3.0

Sparse deconvolution Python v0.3.0 Official Python implementation of the 'Sparse deconvolution', and the CPU (NumPy) and GPU (CuPy) calculation backen

Weisong Zhao 23 Dec 28, 2022
Python with OpenCV - MediaPip Framework Hand Detection

Python HandDetection Python with OpenCV - MediaPip Framework Hand Detection Explore the docs » Contact Me About The Project It is a Computer vision pa

2 Jan 07, 2022
House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects

House-GAN++ Code and instructions for our paper: House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent

122 Dec 28, 2022
Vehicle Detection Using Deep Learning and YOLO Algorithm

VehicleDetection Vehicle Detection Using Deep Learning and YOLO Algorithm Dataset take or find vehicle images for create a special dataset for fine-tu

Maryam Boneh 96 Jan 05, 2023
This dlib-based facial login system

Facial-Login-System This dlib-based facial login system is a technology capable of matching a human face from a digital webcam frame capture against a

Mushahid Ali 3 Apr 23, 2022
This repository contains the code for the paper ``Identifiable VAEs via Sparse Decoding''.

Sparse VAE This repository contains the code for the paper ``Identifiable VAEs via Sparse Decoding''. Data Sources The datasets used in this paper wer

Gemma Moran 17 Dec 12, 2022
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022
This repo contains the official implementations of EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis This repo contains the official implementations of EigenDamage: Structured Prunin

Chaoqi Wang 107 Apr 20, 2022
A fast, dataset-agnostic, deep visual search engine for digital art history

imgs.ai imgs.ai is a fast, dataset-agnostic, deep visual search engine for digital art history based on neural network embeddings. It utilizes modern

Fabian Offert 5 Dec 14, 2022
Official PyTorch implementation of "RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on" (IJCAI-ECAI 2022)

RMGN-VITON RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on In IJCAI-ECAI 2022(short oral). [Paper] [Supplementary Material] Abstra

27 Dec 01, 2022
CondenseNet: Light weighted CNN for mobile devices

CondenseNets This repository contains the code (in PyTorch) for "CondenseNet: An Efficient DenseNet using Learned Group Convolutions" paper by Gao Hua

Shichen Liu 690 Nov 30, 2022
Real-Time Multi-Contact Model Predictive Control via ADMM

Here, you can find the code for the paper 'Real-Time Multi-Contact Model Predictive Control via ADMM'. Code is currently being cleared up and optimize

17 Dec 28, 2022