Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

Last update: Nov 27, 2022

Overview

Adversarial Machine Learning Benchmarks

This code belongs to the papers:

For this framework, please cite:

@inproceedings{
lorenz2022is,
title={Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?},
author={Peter Lorenz and Dominik Strassel and Margret Keuper and Janis Keuper},
booktitle={The AAAI-22 Workshop on Adversarial Machine Learning and Beyond},
year={2022},
url={https://openreview.net/forum?id=aLB3FaqoMBs}
}

This repository extends https://github.com/paulaharder/SpectralAdversarialDefense with several new features:

Several runs can be saved for calculating the variance of the results.
New attack method: AutoAttack.
Datasets: imagenet32, imagenet64, imagenet128, imagenet, celebahq32, celebahq64, and celebahq128.
New model: besides VGG-16, we trained a WideResNet28-10, except for imagenet, where we used the standard PyTorch model.
Bash scripts: automatically start various combinations of input parameters.
Automatic .csv creation from all results.
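As an illustration of the first point, the spread across saved runs could be summarized as below. This is a minimal sketch and not the repository's code: the function name `summarize_runs` and the accuracy values are hypothetical.

```python
from statistics import mean, variance

def summarize_runs(scores):
    """Return (mean, sample variance) for per-run scores such as detection accuracy."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate a variance")
    return mean(scores), variance(scores)

# Hypothetical detection accuracies from three runs (illustrative values only).
run_scores = [0.90, 0.92, 0.94]
avg, var = summarize_runs(run_scores)
print(f"mean={avg:.3f} variance={var:.5f}")
```

`statistics.variance` computes the sample variance (dividing by n - 1), which is the usual choice when only a handful of runs is available.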

Overview

The pipeline runs from training a model, through generating adversarial examples, to detecting them:

Training: Models are trained. Pre-trained models are provided (WideResNet28-10: cif10, cif100, imagenet32, imagenet64, imagenet128, celebaHQ32, celebaHQ64, celebaHQ128; WideResNet51-2: ImageNet; VGG16: cif10 and cif100)
Generate Clean Data: Only correctly classified samples are stored via torch.save.
Attacks: On this clean data, several attacks can be executed: FGSM, BIM, AutoAttack (Std), PGD, DF and CW.
Detect Feature: Detectors try to distinguish between attacked and not-attacked images.
Evaluation Detect: The management script that handles several runs and extracts the results to one .csv file.
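The stages above can be chained from a small driver. The following is a sketch only: it assembles the commands with the script names and flags that appear in the later sections of this README, and the helper name `pipeline_commands` is hypothetical. Training is left out because pre-trained models are provided.

```python
import shlex

def pipeline_commands(net="cif10", attack="fgsm", detector="InputMFS"):
    """Build the stage commands in pipeline order, using only flags shown in this README."""
    return [
        ["python", "generate_clean_data.py", "--net", net],
        ["python", "attack.py", "--attack", attack],
        ["python", "extract_characteristics.py", "--attack", attack, "--detector", detector],
        ["python", "detect_adversarials.py", "--attack", attack, "--detector", detector],
    ]

# Print the commands instead of executing them.
for cmd in pipeline_commands():
    print(shlex.join(cmd))
```

To actually run a stage, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root.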

Requirements

GPUs: A100 (40GB), Titan V (12GB) or GTX 1080 (12GB)
CUDA 11.1
Python 3.9.5
PyTorch 1.9.0
cuDNN 8.0.5_0

Clone the repository

$ git clone --recurse-submodules https://github.com/adverML/SpectralDef_Framework
$ cd SpectralDef_Framework

and install the requirements

$ conda env create --name cuda--11-1-1--pytorch--1-9-0 -f requirements.yml
$ conda activate cuda--11-1-1--pytorch--1-9-0

There are two possibilities: either use our data set with existing adversarial examples (not provided yet), in which case follow the instructions under 'Download', or generate the examples yourself by going through 'Data generation'. In both cases, conclude with 'Build detector'.

Download

Download the adversarial examples (not provided yet) and their non-adversarial counterparts as well as the trained VGG-16 networks from: https://www.kaggle.com/j53t3r/weights. Extract the folders for the adversarial examples into /data and the models in the main directory. Afterwards continue with 'Build detector'.

Datasets download

These datasets are supported:

Download the datasets and copy them into data/datasets/. If you run into problems, adapt the paths in conf/global_settings.py.

Model download

To get the weights for all networks for CIFAR-10, CIFAR-100, ImageNet and CelebaHQ, download:

Kaggle Download Weights
Copy the weights into data/weights/.

If you run into problems, adapt the paths in conf/global_settings.py. You are welcome to create an issue on GitHub.

Data generation

Train the VGG16 on CIFAR-10:

$ python train_cif10.py

or on CIFAR-100

$ python train_cif100.py

The following script downloads the CIFAR-10/100 dataset and extracts the CIFAR-10/100 (imagenet32, imagenet64, imagenet128, celebAHQ32, ...) images that are correctly classified by the network. Use --net cif10 for CIFAR-10 and --net cif100 for CIFAR-100:

$ # python generate_clean_data.py -h  // for help
$ python generate_clean_data.py --net cif10
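The filtering idea behind this step can be sketched in a few lines. This is an illustration with plain Python lists so that it runs without PyTorch; the repository itself works on tensors and stores the result via torch.save. The names `keep_correct` and `toy_predict` are hypothetical.

```python
def keep_correct(samples, labels, predict):
    """Keep only (sample, label) pairs whose predicted class matches the label."""
    return [(x, y) for x, y in zip(samples, labels) if predict(x) == y]

# Hypothetical stand-in classifier: predicts class 1 when the feature sum is positive.
def toy_predict(x):
    return 1 if sum(x) > 0 else 0

samples = [[0.5, 0.2], [-1.0, 0.3], [0.1, -0.4]]
labels = [1, 1, 0]
clean = keep_correct(samples, labels, toy_predict)
print(len(clean), "of", len(samples), "samples kept")
```

Attacking only correctly classified samples ensures that a later misclassification is caused by the perturbation rather than by an error the model already made.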

Then generate the adversarial examples. The --attack argument can be fgsm (Fast Gradient Sign Method), bim (Basic Iterative Method), pgd (Projected Gradient Descent), [new] std (AutoAttack Standard), df (DeepFool) or cw (Carlini and Wagner):

$ # python attack.py -h  // for help
$ python attack.py --attack fgsm
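To make the simplest of these attacks concrete, here is a toy FGSM step against a logistic-regression model. It is a sketch under stated assumptions, not what attack.py does: the repository generates its adversarial examples with foolbox on deep networks, while the model, weights and `eps` below are hypothetical and chosen so the gradient can be written by hand.

```python
import math

def fgsm_logistic(x, y, w, b, eps):
    """One FGSM step against a logistic-regression model with inputs in [0, 1]."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))            # predicted probability of class 1
    grad = [(p - y) * wi for wi in w]         # gradient of the cross-entropy loss w.r.t. x
    sign = [(g > 0) - (g < 0) for g in grad]  # elementwise sign in {-1, 0, 1}
    # Move each input by eps in the direction that increases the loss, then clip to [0, 1].
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]

x = [0.2, 0.8]
x_adv = fgsm_logistic(x, y=1, w=[1.0, -2.0], b=0.0, eps=0.1)
print(x_adv)
```

BIM and PGD repeat this kind of step several times with a smaller step size while keeping the total perturbation within the `eps` bound.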

Build detector

First extract the characteristics needed to train a detector. Choose a detector from InputMFS (BlackBox - BB), InputPFS, LayerMFS (WhiteBox - WB), LayerPFS, LID and Mahalanobis, and an attack argument as before:

$ # python extract_characteristics.py -h  // for help
$ python extract_characteristics.py --attack fgsm --detector InputMFS
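As a rough illustration of what such characteristics can look like, the sketch below computes Fourier magnitude and phase features of a tiny image. It assumes that MFS and PFS denote the magnitude and phase of the 2D Fourier spectrum, as in the SpectralDefense work this repository extends; the functions `dft2` and `fourier_features` are hypothetical and use a naive transform so the example needs no third-party packages.

```python
import cmath

def dft2(image):
    """Naive 2D discrete Fourier transform of a small 2D list (O(N^4), for illustration)."""
    h, w = len(image), len(image[0])
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            acc = 0j
            for r in range(h):
                for c in range(w):
                    angle = -2 * cmath.pi * (u * r / h + v * c / w)
                    acc += image[r][c] * cmath.exp(1j * angle)
            row.append(acc)
        out.append(row)
    return out

def fourier_features(image):
    """Return (magnitude spectrum, phase spectrum) as 2D lists."""
    spectrum = dft2(image)
    mfs = [[abs(z) for z in row] for row in spectrum]
    pfs = [[cmath.phase(z) for z in row] for row in spectrum]
    return mfs, pfs

image = [[1.0, 1.0], [1.0, 1.0]]  # constant 2x2 "image"
mfs, pfs = fourier_features(image)
print(mfs)
```

For real images, `numpy.fft.fft2` combined with `numpy.abs` and `numpy.angle` yields the same quantities far more efficiently. The Input variants would apply this to the input image and the Layer variants to feature maps of the network, under the same assumption.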

Then, train a classifier on the characteristics for a specific attack and detector:

$ python detect_adversarials.py --attack fgsm --detector InputMFS

[new] Create csv file

At the end of the file evaluation_detection.py different possibilities are shown:

$ python evaluation_detection.py

Note: set layers=False to evaluate the detectors after the right layers have been selected.
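Collecting results from several runs into one .csv file could look roughly like this. It is a sketch, not the logic of evaluation_detection.py: the column names, the function `results_to_csv` and the accuracy values are hypothetical placeholders.

```python
import csv
import io

def results_to_csv(results, stream):
    """Write one row per (attack, detector, run) result to a CSV stream."""
    writer = csv.DictWriter(stream, fieldnames=["attack", "detector", "run", "accuracy"])
    writer.writeheader()
    writer.writerows(results)

# Hypothetical results; the accuracy values are placeholders, not measurements.
results = [
    {"attack": "fgsm", "detector": "InputMFS", "run": 1, "accuracy": 0.5},
    {"attack": "fgsm", "detector": "InputMFS", "run": 2, "accuracy": 0.5},
]
buffer = io.StringIO()
results_to_csv(results, buffer)
print(buffer.getvalue())
```

Writing to a stream keeps the function easy to test; passing an `open("results.csv", "w", newline="")` handle instead would write the same rows to disk.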

Other repositories used

For training the VGG-16 on CIFAR-10 we used: https://github.com/kuangliu/pytorch-cifar.
For training on CIFAR-100: https://github.com/weiaicunzai/pytorch-cifar100.
For training on imagenet32 (64 or 128) and celebaHQ32 (64 or 128) we used: https://github.com/bearpaw/pytorch-classification.
For generating the adversarial examples we used the toolbox foolbox: https://github.com/bethgelab/foolbox.
For the LID detector we used: https://github.com/xingjunm/lid_adversarial_subspace_detection.
For the Mahalanobis detector we used: https://github.com/pokaxpoka/deep_Mahalanobis_detector.
For AutoAttack we used: https://github.com/adverML/auto-attack/tree/forspectraldefense. It is already included as a submodule, added with: git submodule add -b forspectraldefense git@github.com:adverML/auto-attack.git submodules/autoattack
Other detectors: https://github.com/jayaram-r/adversarial-detection.

Releases

v1.0.7 (May 1, 2022)

Owner

Adversarial Machine Learning
