Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach

Overview

Introduction

Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach


Datasets: WebFG-496 & WebiNat-5089

WebFG-496

WebFG-496 contains 200 subcategories of the "Bird" (Web-bird), 100 subcategories of the Aircraft" (Web-aircraft), and 196 subcategories of the "Car" (Web-car). It has a total number of 53339 web training images.

Download the dataset:

wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-aircraft.tar.gz
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-bird.tar.gz
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-car.tar.gz

WebiNat-5089

WebiNat-5089 is a large-scale webly supervised fine-grained dataset, which consists of 5089 subcategories and 1184520 web training images.

Download the dataset:

wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-00
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-01
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-02
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-03
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-04
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-05
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-06
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-07
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-08
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-09
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-10
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-11
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-12
wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-iNat.tar.gz.part-13

Dataset Briefing

  1. The statistics of popular fine-grained datasets and our datasets. “Supervision" means the training data is manually labeled (“Manual”) or collected from the web (“Web”).

dataset-stats

  1. Detailed construction process of training data in WebFG-496 and WebiNat-5089. “Testing Source” indicates where testing images come from. “Imbalance” is the number of images in the largest class divided by the number of images in the smallest.

dataset-construction_detail

  1. Rough label accuracy of training data estimated by random sampling for WebFG-496 and WebiNat-5089.

dataset-estimated_label_accuracy


Peer-learning model

Network Architecture

The architecture of our proposed peer-learning model is as follows network

Installation

After creating a virtual environment of python 3.5, run pip install -r requirements.txt to install all dependencies

How to use

The code is currently tested only on GPU

  • Data Preparation

    • WebFG-496

      Download data into PLM root directory and decompress them using

      tar -xvf web-aircraft.tar.gz
      tar -xvf web-bird.tar.gz
      tar -xvf web-car.tar.gz
      
    • WebiNat-5089

      Download data into PLM root directory and decompress them using

      cat web-iNat.tar.gz.part-* | tar -zxv
      
  • Source Code

    • If you want to train the whole network from beginning using source code on the WebFG-496 dataset, please follow subsequent steps

      • In Web496_train.sh
        • Modify CUDA_VISIBLE_DEVICES to proper cuda device id.
        • Modify DATA to web-aircraft/web-bird/web-car as needed and then modify N_CLASSES accordingly.
      • Activate virtual environment(e.g. conda) and then run the script
        bash Web496_train.sh
        
    • If you want to train the whole network from beginning using source code on the WebiNat-5089 dataset, please follow subsequent steps

      • Modify CUDA_VISIBLE_DEVICES to proper cuda device id in Web5089_train.sh.
      • Activate virtual environment(e.g. conda) and then run the script
        bash Web5089_train.sh
        
  • Demo

    • If you just want to do a quick test on the model and check the final fine-grained recognition performance on the WebFG-496 dataset, please follow subsequent steps

      • Download one of the following trained models into model/ using
        wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/Models/plm_web-aircraft_bcnn_best-epoch_74.38.pth
        wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/Models/plm_web-bird_bcnn_best-epoch_76.48.pth
        wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/Models/plm_web-car_bcnn_best-epoch_78.52.pth
        
      • Activate virtual environment (e.g. conda)
      • In Web496_demo.sh
        • Modify CUDA_VISIBLE_DEVICES to proper cuda device id.
        • Modify the model name according to the model downloaded.
        • Modify DATA to web-aircraft/web-bird/web-car according to the model downloaded and then modify N_CLASSES accordingly.
      • Run demo using bash Web496_demo.sh
    • If you just want to do a quick test on the model and check the final fine-grained recognition performance on the WebiNat-5089 dataset, please follow subsequent steps

      • Download one of the following trained models into model/ using
        wget https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/Models/plm_web-inat_resnet50_best-epoch_54.56.pth
        
      • Activate virtual environment (e.g. conda)
      • In Web5089_demo.sh
        • Modify CUDA_VISIBLE_DEVICES to proper cuda device id.
        • Modify the model name according to the model downloaded.
      • Run demo using bash Web5089_demo.sh

Results

  1. The comparison of classification accuracy (%) for benchmark methods and webly supervised baselines (Decoupling, Co-teaching, and our Peer-learning) on the WebFG-496 dataset.

network

  1. The comparison of classification accuracy (%) of benchmarks and our proposed webly supervised baseline Peer-learning on the WebiNat-5089 dataset.

network

  1. The comparisons among our Peer-learning model (PLM), VGG-19, B-CNN, Decoupling (DP), and Co-teaching (CT) on sub-datasets Web-aircraft, Web-bird, and Web-car in WebFG-496 dataset. The value on each sub-dataset is plotted in the dotted line and the average value is plotted in solid line. It should be noted that the classification accuracy is the result of the second stage in the two-step training strategy. Since we have trained 60 epochs in the second stage on the basic network VGG-19, we only compare the first 60 epochs in the second stage of our approach with VGG-19

network


Citation

If you find this useful in your research, please consider citing:

@inproceedings{
title={Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach},
author={Zeren Sun, Yazhou Yao, Xiu-Shen Wei, Yongshun Zhang, Fumin Shen, Jianxin Wu, Jian Zhang, Heng Tao Shen},
booktitle={IEEE International Conference on Computer Vision (ICCV)},
year={2021}
}
Phylogeny Partners

Phylogeny-Partners Two states models Instalation You may need to install the cython, networkx, numpy, scipy package: pip install cython, networkx, num

1 Sep 19, 2022
Course about deep learning for computer vision and graphics co-developed by YSDA and Skoltech.

Deep Vision and Graphics This repo supplements course "Deep Vision and Graphics" taught at YSDA @fall'21. The course is the successor of "Deep Learnin

Yandex School of Data Analysis 160 Jan 02, 2023
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
A simple, fast, and efficient object detector without FPN

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides an implementation for

789 Jan 09, 2023
Safe Local Motion Planning with Self-Supervised Freespace Forecasting, CVPR 2021

Safe Local Motion Planning with Self-Supervised Freespace Forecasting By Peiyun Hu, Aaron Huang, John Dolan, David Held, and Deva Ramanan Citing us Yo

Peiyun Hu 90 Dec 01, 2022
Code for "Optimizing risk-based breast cancer screening policies with reinforcement learning"

Tempo: Optimizing risk-based breast cancer screening policies with reinforcement learning Introduction This repository was used to develop Tempo, as d

Adam Yala 12 Oct 11, 2022
Official code for the ICLR 2021 paper Neural ODE Processes

Neural ODE Processes Official code for the paper Neural ODE Processes (ICLR 2021). Abstract Neural Ordinary Differential Equations (NODEs) use a neura

Cristian Bodnar 50 Oct 28, 2022
Have you ever wondered how cool it would be to have your own A.I

Have you ever wondered how cool it would be to have your own A.I. assistant Imagine how easier it would be to send emails without typing a single word, doing Wikipedia searches without opening web br

Harsh Gupta 1 Nov 09, 2021
FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation

This repository contains the code accompanying the paper " FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation" Paper link: R

20 Jun 29, 2022
An open source library for face detection in images. The face detection speed can reach 1000FPS.

libfacedetection This is an open source library for CNN-based face detection in images. The CNN model has been converted to static variables in C sour

Shiqi Yu 11.4k Dec 27, 2022
The Unsupervised Reinforcement Learning Benchmark (URLB)

The Unsupervised Reinforcement Learning Benchmark (URLB) URLB provides a set of leading algorithms for unsupervised reinforcement learning where agent

259 Dec 26, 2022
Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Zhensu Sun 1 Oct 26, 2021
This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans. TABS relies on a Res-Unet backbone, with a Vision

6 Nov 07, 2022
Learning from graph data using Keras

Steps to run = Download the cora dataset from this link : https://linqs.soe.ucsc.edu/data unzip the files in the folder input/cora cd code python eda

Mansar Youness 64 Nov 16, 2022
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

Phil Wang 272 Dec 23, 2022
Lucid library adapted for PyTorch

Lucent PyTorch + Lucid = Lucent The wonderful Lucid library adapted for the wonderful PyTorch! Lucent is not affiliated with Lucid or OpenAI's Clarity

Lim Swee Kiat 520 Dec 26, 2022
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples

Welcome to the cuQuantum repository! This public repository contains two sets of files related to the NVIDIA cuQuantum SDK: samples: All C/C++ sample

NVIDIA Corporation 147 Dec 27, 2022
DCA - Official Python implementation of Delaunay Component Analysis algorithm

Delaunay Component Analysis (DCA) Official Python implementation of the Delaunay

Petra Poklukar 9 Sep 06, 2022
Code of paper "Compositionally Generalizable 3D Structure Prediction"

Compositionally Generalizable 3D Structure Prediction In this work, We bring in the concept of compositional generalizability and factorizes the 3D sh

Songfang Han 30 Dec 17, 2022
Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

German Bauer 11 Feb 08, 2022