[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

Overview

Amplitude-Phase Recombination (ICCV'21)

Official PyTorch implementation of "Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain", Guangyao Chen, Peixi Peng, Li Ma, Jia Li, Lin Du, and Yonghong Tian.

Paper: https://arxiv.org/abs/2108.08487

Abstract: Recently, the generalization behavior of Convolutional Neural Networks (CNN) is gradually transparent through explanation techniques with the frequency components decomposition. However, the importance of the phase spectrum of the image for a robust vision system is still ignored. In this paper, we notice that the CNN tends to converge at the local optimum which is closely related to the high-frequency components of the training images, while the amplitude spectrum is easily disturbed such as noises or common corruptions. In contrast, more empirical studies found that humans rely on more phase components to achieve robust recognition. This observation leads to more explanations of the CNN's generalization behaviors in both adversarial attack and out-of-distribution detection, and motivates a new perspective on data augmentation designed by re-combing the phase spectrum of the current image and the amplitude spectrum of the distracter image. That is, the generated samples force the CNN to pay more attention on the structured information from phase components and keep robust to the variation of the amplitude. Experiments on several image datasets indicate that the proposed method achieves state-of-the-art performances on multiple generalizations and calibration tasks, including adaptability for common corruptions and surface variations, out-of-distribution detection and adversarial attack.

Highlights

Fig. 1: More empirical studies found that humans rely on more phase components to achieve robust recognition. However, CNN without effective training restrictions tends to converge at the local optimum related to the amplitude spectrum of the image, leading to generalization behaviors counter-intuitive to humans (the sensitive to various corruptions and the overconfidence of OOD). main hypothesis of the paper

Examples of the importance of phase spectrum to explain the counter-intuitive behavior of CNN

Fig. 2: Four pairs of testing sampless selected from in-distribution CIFAR-10 and OOD SVHN that help explain that CNN capture more amplitude specturm than phase specturm for classification: First, in (a) and (b), the model correctly predicts the original image (1st column in each panel), but the predicts are also exchanged after switching amplitude specturm (3rd column in each panel) while the human eye can still give the correct category through the contour information. Secondly, the model is overconfidence for the OOD samples in (c) and (d). Similarly, after the exchange of amplitude specturm, the label with high confidence is also exchanged.

Fig. 3: Two ways of the proposed Amplitude-Phase Recombination: APR-P and APR-S. Motivated by the powerful generalizability of the human, we argue that reducing the dependence on the amplitude spectrum and enhancing the ability to capture phase spectrum can improve the robustness of CNN.

Citation

If you find our work, this repository and pretrained adversarial generators useful. Please consider giving a star and citation.

@inproceedings{chen2021amplitude,
    title={Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain},
    author={Chen, Guangyao and Peng, Peixi and Ma, Li and Li, Jia and Du, Lin and Tian, Yonghong},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision},
    year={2021}
}

1. Requirements

Environments

Currently, requires following packages

  • python 3.6+
  • torch 1.7.1+
  • torchvision 0.5+
  • CUDA 10.1+
  • scikit-learn 0.22+

Datasets

For even quicker experimentation, there is CIFAR-10-C and CIFAR-100-C. please download these datasets to ./data/CIFAR-10-C and ./data/CIFAR-100-C.

2. Training & Evaluation

To train the models in paper, run this command:

python main.py --aug <augmentations>

Option --aug can be one of None/APR-S. The default training method is APR-P. To evaluate the model, add --eval after this command.

APRecombination for APR-S and mix_data for APR-P can plug and play in other training codes.

3. Results

Fourier Analysis

The standard trained model is highly sensitive to additive noise in all but the lowest frequencies. APR-SP could substantially improve robustness to most frequency perturbations. The code of Heat maps is developed upon the following project FourierHeatmap.

ImageNet-C

  • Results of ResNet-50 models on ImageNet-C:
+(APR-P) +(APR-S) +(APR-SP) +DeepAugMent+(ARP-SP)
mCE 70.5 69.3 65.0 57.5
Owner
Guangyao Chen
Ph.D student @ PKU
Guangyao Chen
Final project code: Implementing MAE with downscaled encoders and datasets, for ESE546 FA21 at University of Pennsylvania

546 Final Project: Masked Autoencoder Haoran Tang, Qirui Wu 1. Training To train the network, please run mae_pretraining.py. Please modify folder path

Haoran Tang 0 Apr 22, 2022
Contains supplementary materials for reproduce results in HMC divergence time estimation manuscript

Scalable Bayesian divergence time estimation with ratio transformations This repository contains the instructions and files to reproduce the analyses

Suchard Research Group 1 Sep 21, 2022
Human annotated noisy labels for CIFAR-10 and CIFAR-100.

Dataloader for CIFAR-N CIFAR-10N noise_label = torch.load('./data/CIFAR-10_human.pt') clean_label = noise_label['clean_label'] worst_label = noise_lab

<a href=[email protected]"> 117 Nov 30, 2022
Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators

Pandas TA - A Technical Analysis Library in Python 3 Pandas Technical Analysis (Pandas TA) is an easy to use library that leverages the Pandas package

Kevin Johnson 3.2k Jan 09, 2023
This is 2nd term discrete maths project done by UCU students that uses backtracking to solve various problems.

Backtracking Project Sponsors This is a project made by UCU students: Olha Liuba - crossword solver implementation Hanna Yershova - sudoku solver impl

Dasha 4 Oct 17, 2021
a reimplementation of Holistically-Nested Edge Detection in PyTorch

pytorch-hed This is a personal reimplementation of Holistically-Nested Edge Detection [1] using PyTorch. Should you be making use of this work, please

Simon Niklaus 375 Dec 06, 2022
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ

Yin Cui 187 Oct 01, 2022
PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

IIM - Crowd Localization This repo is the official implementation of paper: Learning Independent Instance Maps for Crowd Localization. The code is dev

tao han 91 Nov 10, 2022
Monk is a low code Deep Learning tool and a unified wrapper for Computer Vision.

Monk - A computer vision toolkit for everyone Why use Monk Issue: Want to begin learning computer vision Solution: Start with Monk's hands-on study ro

Tessellate Imaging 507 Dec 04, 2022
[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our

695 Jan 05, 2023
A more easy-to-use implementation of KPConv based on PyTorch.

A more easy-to-use implementation of KPConv This repo contains a more easy-to-use implementation of KPConv based on PyTorch. Introduction KPConv is a

Zheng Qin 36 Dec 29, 2022
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022
Multi-view 3D reconstruction using neural rendering. Unofficial implementation of UNISURF, VolSDF, NeuS and more.

Volume rendering + 3D implicit surface Showcase What? previous: surface rendering; now: volume rendering previous: NeRF's volume density; now: implici

Jianfei Guo 682 Jan 04, 2023
NEO: Non Equilibrium Sampling on the orbit of a deterministic transform

NEO: Non Equilibrium Sampling on the orbit of a deterministic transform Description of the code This repo describes the NEO estimator described in the

0 Dec 01, 2021
ScriptProfilerPy - Module to visualize where your python script is slow

ScriptProfiler helps you track where your code is slow It provides: Code lines t

Lucas BLP 3 Jun 02, 2022
Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs, ICCV 2021

Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs, ICCV 2021 Global Pooling, More than Meets the Eye: Posi

Md Amirul Islam 32 Apr 24, 2022
[CVPR 2022] Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement Announcement 🔥 We have not tested the code yet. We will fini

Xiuwei Xu 7 Oct 30, 2022
On-device speech-to-index engine powered by deep learning.

On-device speech-to-index engine powered by deep learning.

Picovoice 30 Nov 24, 2022