Attack classification models with transferability, black-box attack; unrestricted adversarial attacks on imagenet

Overview

Attack_classification_models_with_transferability

Attack classification models with transferability, black-box attack; unrestricted adversarial attacks on imagenet, CVPR2021 安全AI挑战者计划第六期:ImageNet无限制对抗攻击, 决赛第四名(team name: Advers)

详细方案介绍

1. Prerequisites

1. python >= 3.6
2. pytorch >= 1.2.0
3. torchvision >= 0.4.0 
4. numpy >= 1.19.1
5. pillow >= 7.2.0
6. scipy >= 1.5.2

2. Code Overview

  • ./codes/

    • main.py: 攻击原始图像,生成并保存攻击后的图像
    • data.py: 加载原始图像,保存图像,图像标准化处理
    • model.py: 模型集成,利用集成模型计算logits
    • utils.py: Input Diversity, 高斯平滑处理等
  • ./input_dir/

    • ./images/: 原始图像所在路径
    • ./dev.csv:图像的标记文件(images name, true label)
  • demo

python main.py --source_model 'resnet50'

3. 思路

  • 本文分享我们团队(Advers)的解决方案,欢迎大家交流讨论,一起进步。
  • 本方案最终得分:9081.6, 线上后台模型攻击成功率:95.48%,决赛排名:TOP 4
  • 本方案初赛排名:TOP 4,复赛排名:TOP 10

3.1 赛题分析

  1. 无限制对抗攻击可以用不同的方法来实现,包括范数扰动攻击、GAN、粘贴Patch等。但是由于 fid、lpips 两个指标的限制,必须保证生成的图像质量好(不改变语义、噪声尽量小),否则得分会很低。经过尝试,我们最终确定利用范数扰动来进行迁移攻击,这样可以较好地平衡攻击成功率和图像质量。

  2. 由于无法获取后台模型的任何参数和输出,甚至不知道后台分类模型输入图像大小,这增加了攻击难度。原图大小是 500 * 500,而 ImageNet 分类模型输入是 224 * 224 或 299 * 299,对生成的对抗样本图像 resize会导致对抗样本的攻击性降低。

  3. 由于比赛最终排名为人工打分,所以没有用损失函数去拟合 fid、lpips 两个指标。

  4. 对抗样本的攻击性和图像质量可以说是两个相互矛盾的指标,一个指标的提升往往会导致另一个指标的下降,如何在对抗性和图像质量两个方面找到一个平衡点是十分重要的。在机器打分阶段,采用较小的噪声,把噪声加在图像敏感区域,在尽量不降低攻击性的前提下提升对抗样本的图像质量是得分的关键

3.2 解题思路

3.2.1 输入模型的图像大小

本次比赛的图像被 resize 到了 500 * 500 大小,而标准的 ImageNet 预训练模型输入大小一般是 224 * 224 或 299 * 299。我们尝试将不同大小的图片(500,299,224)输入到模型中进行攻击,发现 224 大小的效果最好,计算复杂度也最低。

3.2.2 L2 or Linf

采用 L2 范数攻击生成的对抗样本的攻击性要强一些,但可能会出现比较大的噪声斑块,导致人眼看起来比较奇怪,采用 Linf 范数生成的对抗样本,人眼视觉上要稍好一些。在机器打分阶段,采用 L2 范数扰动攻击,在人工评判阶段,采用 Linf 范数扰动来生成对抗样本。

3.2.3 提升对抗样本迁移性方法

1. MI-FGSM1:在机器打分阶段采用 MI-FGSM 算法生成噪声,但是 MI-FGSM 算法生成的噪声人眼看起来会明显,由于决赛阶段是人工打分,最终舍弃了该方法。

2. Translation-Invariant(TI)2:用核函数对计算得到的噪声梯度进行平滑处理,提升了噪声的泛化性。

3. Input Diversity(DI)3 :通过增加输入图像的多样性来提高对抗样本的迁移性,其提分效果明显。Input Diversity 本质是通过变换输入图像的多样性让噪声不完全依赖相应的像素点,减少了噪声过拟合效应,提高了泛化性和迁移性。

3.2.4 改进后的DI攻击

Input Diversity 会对图像进行随机变换,导致生成的噪声梯度带有一定的随机性。虽然这种随机性可以使对抗样本的泛化性更强,但是也会引入一定比例的噪声,这种噪声也会抑制对抗样本的泛化性,因此如何消除 DI 随机性带来的噪声影响,同时保证攻击具有较强的泛化性是提升迁移性的有效手段。

image

3.2.5 Tricks

  • 在初赛和复赛阶段,采用 L2 和 Linf 范数扰动攻击,其中 L2 范数扰动攻击得分更高一些。由于复赛阶段线上模型比较鲁棒,所以适当增加扰动范围是提升攻击成功率的关键。
  • 考虑到决赛阶段是人工打分,需要考虑攻击性和图像质量,我们最终采用 Linf 范数扰动进行攻击,扰动大小设为 32/255,迭代次数设为 40,迭代步长设为 1/255。
  • 攻击之前,对图像进行高斯平滑处理,可以提升攻击效果,但是也会让图像变模糊。
  • Ensemble models: resnet50、densenet161、inceptionv4等。

4. 攻击结果

image

多次实验表明,采用改进的 DI+TI 攻击方法得到的噪声相对于 MI-FGSM 方法更小,泛化性和迁移性更强,同时人眼视觉效果也比较好。

5. 参考文献

  1. Dong Y, Liao F, Pang T, et al. Boosting adversarial attacks with momentum. CVPR 2018.
  2. Dong Y, Pang T, Su H, et al. Evading defenses to transferable adversarial examples by translation-invariant attacks. CVPR 2019.
  3. Xie C, Zhang Z, Zhou Y, et al. Improving transferability of adversarial examples with input diversity. CVPR 2019.
  4. Wierstra D, Schaul T, Glasmachers T, et al. Natural evolution strategies. The Journal of Machine Learning Research, 2014.

6.致谢

  • 感谢团队每一位小伙伴的辛勤付出,感谢指导老师的大力支持。
  • 感谢阿里安全主办了这次比赛,给了大家交流学习的机会,使得我们结识了很多优秀的小伙伴!

如有问题,欢迎交流:[email protected]

VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022
Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language

Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language This repository contains the code, model, and deployment config

16 Oct 23, 2022
The self-supervised goal reaching benchmark introduced in Discovering and Achieving Goals via World Models

Lexa-Benchmark Codebase for the self-supervised goal reaching benchmark introduced in 'Discovering and Achieving Goals via World Models'. Setup Create

1 Oct 14, 2021
Implementation of FSGNN

FSGNN Implementation of FSGNN. For more details, please refer to our paper Experiments were conducted with following setup: Pytorch: 1.6.0 Python: 3.8

19 Dec 05, 2022
AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation A pytorch-version implementation codes of paper:

11 Dec 13, 2022
Omniverse sample scripts - A guide for developing with Python scripts on NVIDIA Ominverse

Omniverse sample scripts ここでは、NVIDIA Omniverse ( https://www.nvidia.com/ja-jp/om

ft-lab (Yutaka Yoshisaka) 37 Nov 17, 2022
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie do

ShopRunner 96 Dec 29, 2022
9th place solution

AllDataAreExt-Galixir-Kaggle-HPA-2021-Solution Team Members Qishen Ha is Master of Engineering from the University of Tokyo. Machine Learning Engineer

daishu 5 Nov 18, 2021
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
Computer Vision application in the web

Computer Vision application in the web Preview Usage Clone this repo git clone https://github.com/amineHY/WebApp-Computer-Vision-streamlit.git cd Web

Amine Hadj-Youcef. PhD 35 Dec 06, 2022
Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

Transfer Style API It's an API to use with Tranfer Style App, where you can use

Brian Alejandro 1 Feb 13, 2022
For AILAB: Cross Lingual Retrieval on Yelp Search Engine

Cross-lingual Information Retrieval Model for Document Search Train Phase CUDA_VISIBLE_DEVICES="0,1,2,3" \ python -m torch.distributed.launch --nproc_

Chilia Waterhouse 104 Nov 12, 2022
Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

Deep Optics for Single-shot High-dynamic-range Imaging Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging" CVPR, 2

Stanford Computational Imaging Lab 40 Dec 12, 2022
Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit

streamlit-manim Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit Installation I had to install pango with sudo apt-get

Adrien Treuille 6 Aug 03, 2022
Keras Implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation by (Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio)

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation: Work In Progress, Results can't be replicated yet with the m

Yad Konrad 196 Aug 30, 2022
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022
Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks.

FDRL-PC-Dyspan Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks. This repository contains the entire code

Peyman Tehrani 17 Nov 18, 2022
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

DiffWave DiffWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech via itera

LMNT 498 Jan 03, 2023
Official Repository for Machine Learning class - Physics Without Frontiers 2021

PWF 2021 Física Sin Fronteras es un proyecto del Centro Internacional de Física Teórica (ICTP) en Trieste Italia. El ICTP es un centro dedicado a fome

36 Aug 06, 2022
STRIVE: Scene Text Replacement In Videos

STRIVE: Scene Text Replacement In Videos Dataset Types: RoboText SynthText RealWorld videos RoboText : Videos of texts collected using navigation robo

15 Jul 11, 2022