Dropping Pixels for Adversarial Robustness

2022-04-23 07:46:00 【Apple Laboratory of Central South University】

Author: lz (2019 cohort)

Paper: "Dropping Pixels for Adversarial Robustness"

Problem:

Deep neural networks are vulnerable to adversarial examples. An adversarial example is an input deliberately designed to make a model err.
The model misclassifies these images, yet humans can still recognize them.
Such adversarial images are usually generated by adding to a legitimate input a small perturbation whose L0, L2, or L∞ norm is bounded.
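
To make the three norms concrete, the following minimal sketch measures a perturbation between a clean input and its perturbed version. The function name `perturbation_norms` and the four-pixel example values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math

def perturbation_norms(original, adversarial):
    """Return the (L0, L2, Linf) norms of the perturbation between two flat pixel lists."""
    delta = [a - o for o, a in zip(original, adversarial)]
    l0 = sum(1 for d in delta if d != 0)        # number of pixels that changed
    l2 = math.sqrt(sum(d * d for d in delta))   # Euclidean size of the change
    linf = max(abs(d) for d in delta)           # largest change to any single pixel
    return l0, l2, linf

# Hypothetical 4-pixel image with values in [0, 1]; two pixels are perturbed.
x = [0.0, 0.5, 0.5, 1.0]
x_adv = [0.0, 0.75, 0.5, 0.5]
l0, l2, linf = perturbation_norms(x, x_adv)
print(l0, l2, linf)
```

An attack bounded in L0 may change only a few pixels, one bounded in L∞ may change every pixel but each only slightly, and L2 sits between the two.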

Contributions:

We show that image classifiers can be trained on randomly subsampled pixels, that is, on inputs with reduced redundancy, without a significant loss of accuracy. We show that the best results are obtained when the model is trained on images subsampled with a drop rate chosen at random from [0, 1].

We apply interpretability methods to models trained on subsampled images and argue that these methods cannot explain how a model recognizes an image from only a few pixels. We also visualize the convolution filters in the first layer of the network and show that, in this respect, the models behave similarly to networks trained with adversarial training.

We show how to use this insight to train a robust classifier without adversarial training.

Method and results:

Because adjacent pixels are strongly correlated, image data is highly redundant: an image can be restored even after most of its pixels are deleted. Conditioned on a selected pixel, its neighbors therefore add little information about the output, since their content largely overlaps with that of the selected pixel, and removing them does not significantly reduce accuracy. A direct way to construct robust features is thus to subsample the image pixels. Pixels that are farther apart are less correlated with each other, so each contributes substantially to the model's prediction and can be regarded as a robust feature.

A higher pixel drop rate lowers accuracy. Even at very high drop rates, however, accuracy remains high.
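
As a rough illustration of pixel dropping, the sketch below removes a fixed fraction of randomly chosen pixels from a small image. It assumes that a dropped pixel is set to zero and that exactly `round(drop_rate * total)` pixels are dropped; the summary does not specify either detail, and the name `drop_pixels` is hypothetical.

```python
import random

def drop_pixels(image, drop_rate, rng):
    """Return a copy of a 2D image with a random fraction of pixels set to 0.0.

    Assumption: dropped pixels are zeroed, and the number of dropped pixels
    is exactly round(drop_rate * number_of_pixels).
    """
    height, width = len(image), len(image[0])
    total = height * width
    num_drop = round(drop_rate * total)
    dropped = set(rng.sample(range(total), num_drop))  # flat indices to drop
    return [
        [0.0 if (r * width + c) in dropped else image[r][c] for c in range(width)]
        for r in range(height)
    ]

rng = random.Random(0)
image = [[1.0] * 10 for _ in range(10)]   # hypothetical 10x10 all-ones image
subsampled = drop_pixels(image, 0.9, rng)
kept = sum(v != 0.0 for row in subsampled for v in row)
print(kept)  # 10 of the 100 pixels survive a 90% drop rate
```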

The results on the CIFAR10 dataset come from three experiments. In Experiment 1, the model is trained and tested on the original images. In Experiment 2, the model is trained and tested on images subsampled at a 90% drop rate. In Experiment 3, the model is trained on images subsampled with a drop rate chosen uniformly from [0, 1] and tested on images subsampled at a 90% drop rate.

Findings:
Deeper networks perform better.
The model achieves its best results when the drop rate is chosen at random between 0% and 100% in each epoch.
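
The random drop rate schedule can be sketched as follows, assuming the rate is redrawn once per epoch and each pixel is then dropped independently with that probability. The helper names `epoch_drop_rates` and `mask_batch` are hypothetical, the batch is a toy stand-in for real data, and the actual training step is omitted.

```python
import random

def epoch_drop_rates(num_epochs, rng):
    """Draw one drop rate per epoch, uniformly from [0, 1)."""
    return [rng.random() for _ in range(num_epochs)]

def mask_batch(batch, drop_rate, rng):
    """Drop each pixel independently with probability drop_rate (assumed: set to 0.0)."""
    return [[0.0 if rng.random() < drop_rate else v for v in image] for image in batch]

rng = random.Random(0)
batch = [[1.0] * 8 for _ in range(4)]   # 4 hypothetical flattened 8-pixel images
rates = epoch_drop_rates(3, rng)
for epoch, rate in enumerate(rates):
    masked = mask_batch(batch, rate, rng)
    # A real training loop would run a forward and backward pass on `masked` here.
    kept = sum(v != 0.0 for image in masked for v in image)
    print(f"epoch {epoch}: drop rate {rate:.2f}, kept {kept}/32 pixels")
```

Because the rate varies across the whole range, the model sees everything from nearly intact images to nearly empty ones during training.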

One possible explanation is that the model has simply learned to generate similar representations for original and subsampled images. To rule this out, we train the model to classify subsampled images into their true labels while mapping original images to a uniform distribution. A model trained this way reaches 78.9% accuracy on subsampled images (drop rate 90%), only about 2% below the model trained on subsampled images alone. This shows that the network can classify subsampled images without actually learning the features of natural images.
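
One way to express this training objective is a cross-entropy loss whose target depends on the input type: a one-hot label for subsampled images and a uniform distribution for original images. This is a sketch under that assumption; the summary does not state which loss the authors use, and the names `training_loss` and `is_subsampled` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, target):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def training_loss(logits, label, is_subsampled):
    """Assumed objective: true label for subsampled inputs, uniform target otherwise."""
    num_classes = len(logits)
    if is_subsampled:
        target = [1.0 if k == label else 0.0 for k in range(num_classes)]
    else:
        target = [1.0 / num_classes] * num_classes
    return cross_entropy(softmax(logits), target)

confident = [2.0, 0.0, 0.0, 0.0]   # hypothetical logits favoring class 0
flat = [0.0, 0.0, 0.0, 0.0]        # hypothetical logits with no preference
print(training_loss(confident, 0, True))    # low: confident and correct
print(training_loss(confident, 0, False))   # penalized: original image should look uniform
print(training_loss(flat, 0, False))        # the minimum for a uniform target, log(4)
```

Under this objective the network is rewarded for confident predictions only when pixels are missing, so it cannot solve the task by treating subsampled images like ordinary natural images.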

In a second variant, subsampled images are classified into their true labels while subsampled noise images are mapped to a uniform distribution. The trained model reaches 80.9% accuracy, almost the same as the model trained on subsampled images alone. The corresponding figure in the paper shows the explanations for a few images. For this model, the explanations of the original images are unrelated to edge patterns. Compared with Figures 3a and 3b, the explanations of the subsampled images are also sparser. In addition, most of the larger gradient values lie at positions where the pixels were not dropped.

Visualizing the convolution filters

Three cases are compared: a normally trained model, a model trained on images subsampled at a 90% drop rate, and a model trained on images subsampled with a drop rate chosen at random from [0, 1].

In the models trained on subsampled images, the filters have large values only at the center. This means the network has recognized that adjacent pixels are no longer spatially correlated, so it only needs to pass a few scaled versions of the image to the next layer.
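
The claim that a center-only filter merely rescales the image can be checked with a small sketch. The function `conv2d_same` is a hypothetical plain-Python 3x3 cross-correlation with zero padding, and the kernel and image values are made up for illustration.

```python
def conv2d_same(image, kernel):
    """Apply a 3x3 kernel (cross-correlation, zero padding) and keep the input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:   # outside the image counts as zero
                        acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out

# Hypothetical filter whose only nonzero weight sits at the center.
center_only = [[0.0, 0.0, 0.0],
               [0.0, 2.0, 0.0],
               [0.0, 0.0, 0.0]]
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
scaled = conv2d_same(image, center_only)
print(scaled)  # [[2.0, 4.0, 6.0], [8.0, 10.0, 12.0], [14.0, 16.0, 18.0]]
```

Each output pixel depends only on the input pixel at the same position, so the filter ignores the neighborhood entirely and acts as a per-pixel scaling.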

Conclusion

In this paper, we show that image classifiers can be trained to recognize images subsampled at a high drop rate. We then propose training models on images subsampled with a drop rate chosen at random from [0, 1]. Our experimental results on the GTSRB and CIFAR10 datasets show that these models are more robust to adversarial examples under L0, L2, and L∞ perturbations, with only a very small drop in standard accuracy.

Copyright notice
This article was written by [Apple Laboratory of Central South University]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204230626397607.html