[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Last update: Sep 20, 2022

Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training" by Lue Tao, Lei Feng, Jinfeng Yi, Sheng-Jun Huang, and Songcan Chen.
This repository contains an implementation of the attacks (P1 through P5) and the defense (adversarial training) described in the paper.

Requirements

Our code relies on PyTorch, which will be automatically installed when you follow the instructions below.

conda create -n delusion python=3.8
conda activate delusion
pip install -r requirements.txt

Running Experiments

Pre-train a standard model on CIFAR-10 (the dataset will be downloaded automatically).

python main.py --train_loss ST

Generate perturbed training data.

python poison.py --poison_type P1
python poison.py --poison_type P2
python poison.py --poison_type P3
python poison.py --poison_type P4
python poison.py --poison_type P5

Visualize the perturbed training data (optional).

tensorboard --logdir ./results

Standard training on the perturbed data.

python main.py --train_loss ST --poison_type P1
python main.py --train_loss ST --poison_type P2
python main.py --train_loss ST --poison_type P3
python main.py --train_loss ST --poison_type P4
python main.py --train_loss ST --poison_type P5

Adversarial training on the perturbed data.

python main.py --train_loss AT --poison_type P1
python main.py --train_loss AT --poison_type P2
python main.py --train_loss AT --poison_type P3
python main.py --train_loss AT --poison_type P4
python main.py --train_loss AT --poison_type P5
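
The steps above amount to sixteen invocations: one pre-training run, five poisoning runs, and five training runs for each of the two losses. As a convenience, they could be driven from a small script such as the sketch below. It only reuses the script names and flags shown in this README; the helper names `pipeline_commands` and `run_pipeline` and the `dry_run` switch are hypothetical and are not part of the repository.

```python
import subprocess

# Poison types listed in this README.
POISON_TYPES = ["P1", "P2", "P3", "P4", "P5"]


def pipeline_commands():
    """Return the README commands in the order they are meant to run."""
    # Step 1: pre-train a standard model on clean CIFAR-10.
    commands = [["python", "main.py", "--train_loss", "ST"]]
    # Step 2: generate the perturbed training data for each attack.
    commands += [["python", "poison.py", "--poison_type", p] for p in POISON_TYPES]
    # Steps 3 and 4: standard (ST), then adversarial (AT) training on each set.
    for loss in ("ST", "AT"):
        commands += [
            ["python", "main.py", "--train_loss", loss, "--poison_type", p]
            for p in POISON_TYPES
        ]
    return commands


def run_pipeline(dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in pipeline_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to confirm the order before committing to the full set of training runs.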

Results

Figure 1: An illustration of delusive attacks and adversarial training. Left: Random samples from the CIFAR-10 training set, showing the original training set D and the perturbed training set D_P5 generated using the P5 attack. Right: Natural accuracy evaluated on the CIFAR-10 test set for models trained with: i) standard training on D; ii) adversarial training on D; iii) standard training on D_P5; iv) adversarial training on D_P5. While standard training on D_P5 generalizes poorly to D, adversarial training on D_P5 yields much higher natural accuracy.

Table 1: Mean and standard deviation of the test accuracy (%) on the CIFAR-10 dataset. The standard deviations of the defense (i.e., adversarial training) are very small (< 0.50%) and hardly affect the results. In contrast, the results of standard training are relatively unstable.

Training method \ Training data	P1	P2	P3	P4	P5
Standard training	37.87±0.94	74.24±1.32	15.14±2.10	23.69±2.98	11.76±0.72
Adversarial training	86.59±0.30	89.50±0.21	88.12±0.39	88.15±0.15	88.12±0.43
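
To make the gap in Table 1 concrete, the sketch below transcribes the reported means and standard deviations and computes, for each attack, how many accuracy points adversarial training gains over standard training. The numbers are copied from the table; the names `TABLE_1` and `accuracy_gain` are illustrative and do not come from the repository.

```python
# (mean, std) of test accuracy in percent, transcribed from Table 1.
TABLE_1 = {
    "Standard training": {
        "P1": (37.87, 0.94), "P2": (74.24, 1.32), "P3": (15.14, 2.10),
        "P4": (23.69, 2.98), "P5": (11.76, 0.72),
    },
    "Adversarial training": {
        "P1": (86.59, 0.30), "P2": (89.50, 0.21), "P3": (88.12, 0.39),
        "P4": (88.15, 0.15), "P5": (88.12, 0.43),
    },
}


def accuracy_gain(table):
    """Difference of mean accuracies (adversarial minus standard) per attack."""
    st = table["Standard training"]
    at = table["Adversarial training"]
    return {p: round(at[p][0] - st[p][0], 2) for p in st}


if __name__ == "__main__":
    for poison, gain in accuracy_gain(TABLE_1).items():
        print(f"{poison}: +{gain:.2f} points")
    # Largest standard deviation among the adversarial training runs.
    print(max(std for _, std in TABLE_1["Adversarial training"].values()))
```

The smallest gain is on P2, where standard training is already the least affected, and the largest is on P5.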

Key takeaways: Our theoretical justifications in the paper, along with the empirical results, suggest that adversarial training is a principled and promising defense against delusive attacks.

Citing this work

@inproceedings{tao2021better,
    title={Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training},
    author={Tao, Lue and Feng, Lei and Yi, Jinfeng and Huang, Sheng-Jun and Chen, Songcan},
    booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
    year={2021}
}

Owner

Lue Tao
