Epsilon values studied are too large to be meaningful #7
It is important to choose a suitable distortion value eps for attacks, but the current literature does not have a uniform standard value. We noticed that the epsilon of gradient-based attacks ranges from 0 to 32 in related papers. On the other hand, attacks with distortion eps greater than 0.1 are only used for comparison against eps = 0.1. As we stated in the paper, Table VII shows the results of a simple test of FGSM with different epsilon values, and the only observation obtained is that there is no clear relationship between the magnitude of the perturbation of adversarial examples and the detection AUC value.
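For reference, a minimal FGSM sketch showing exactly where eps enters. This is a generic illustration, not the paper's code; `model`, `x` (pixels scaled to [0, 1]), and `y` are hypothetical placeholders:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM: move each pixel by eps in the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Before clipping to the valid pixel range, the l_infinity norm
    # of (x_adv - x) is exactly eps.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

A sweep like the one behind Table VII would then call `fgsm` for each eps in, say, `[0.01, 0.05, 0.1, 0.2, 0.4, 0.6]` and measure detection AUC at each point.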
So we agree that your choice of 32 is at least twice as large as is reasonable and studied in other work; no other paper studies exclusively 32/255 on CIFAR-10. Putting this one issue aside for a second: do you have a justification for why you consider even larger distortion bounds of, say, 0.6? At a deeper level, my concern here is that when you are evaluating papers, you have to be extremely careful to be sure you're treating them properly. The Madry et al. model, for example, is trained to be robust at eps=8, so why should we compare it to other models at eps=32? It's fundamentally an unfair comparison. But I don't understand how you can say that most papers consider 16/255. Looking briefly at recently accepted papers that study the CIFAR-10 dataset: some papers use a range of parameters, and others are vague on details, but I can find no evidence for the argument that "most papers" use 16/255. Do you have examples of papers that study primarily 16/255?
The argument we made before in this issue was a little strong, and I will modify it to "It is noticed that the epsilon of gradient-based attacks ranges from 0 to 32 in related papers." In fact, "Towards Deep Learning Models Resistant to Adversarial Attacks" uses 8 for training, but evaluates on adversarial examples with epsilon ranging from 0 to 30. Details can be found in Figure 6 of that paper.
Also, in the original FGSM paper (Ian J. Goodfellow et al., ICLR 2015), the eps value for CIFAR-10 is initially set to 0.1. Therefore, we set epsilon = 0.1 for all gradient-based attacks in our experiments.
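Since the thread mixes the two eps conventions, a quick arithmetic note may help (plain Python; these numbers are just unit conversion, not results from the paper). Values quoted on the integer 0-255 pixel scale divide by 255 to give the [0, 1] scale that the FGSM paper uses:

```python
# eps on the integer pixel scale (0-255) vs. the normalized [0, 1] scale
print(8 / 255)    # ~0.031  (e.g. the Madry et al. training bound)
print(16 / 255)   # ~0.063
print(32 / 255)   # ~0.125
print(0.1 * 255)  # 25.5 -> eps = 0.1 corresponds to roughly 25.5/255
```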
Yeah, it's important to note that all of these other papers study a range of values. And it's good that they do: especially in defense work it is useful to know that accuracy degrades to 0% as epsilon grows to very large values. If you're only going to pick a single value to study the robustness of defenses, then you should pick a representative value. And other than a single paragraph of Goodfellow et al. (2015), which is not evaluating the robustness of defenses but examining how linear neural networks behave even at large step sizes, I am aware of no paper that studies exclusively 32 (or even 16) on CIFAR-10. But again: could you please comment on whether you have a justification for the even larger values, such as 0.6?
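A sketch of the kind of sweep being described, reusing the hypothetical `fgsm`, `model`, `x`, and `y` from the earlier snippet. The point is to report the whole curve, not one operating point, and to sanity-check that the attack drives accuracy toward 0% at large budgets:

```python
# Evaluate across a range of epsilons rather than a single point; a sound
# attack should push accuracy toward 0% as the budget grows.
eps_grid = [0.0, 2/255, 4/255, 8/255, 16/255, 32/255]
for eps in eps_grid:
    x_adv = fgsm(model, x, y, eps)  # fgsm defined in the sketch above
    acc = (model(x_adv).argmax(dim=1) == y).float().mean().item()
    print(f"eps={eps:.4f}  accuracy={acc:.3f}")
```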
On at least two counts, the paper chooses l_infinity distortion bounds that are not well motivated.
First, throughout the paper the report studies a CIFAR-10 distortion of eps=0.1 and eps=0.2. This is 3x (or 6x) larger than what is typically studied in the literature. When CIFAR-10 images are perturbed with noise of distortion 0.1, they are often difficult for humans to classify correctly; I'm aware of no other work that studies CIFAR-10 robustness at this extremely high distortion bound.
Second, the paper studies l_infinity distortion bounds as high as eps=0.6 in Table VII on both MNIST and CIFAR-10, a value so high that any image can be converted to solid grey (and beyond). The entire purpose of bounding the l_infinity norm of adversarial examples is to ensure that the actual true class has not changed. Choosing a distortion bound so large that all images can be converted to a solid grey image fundamentally misunderstands the purpose of the distortion bound.
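To make the grey-image point concrete, a minimal numerical check (assuming pixels scaled to [0, 1]; the random image here is just a stand-in for any input):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((32, 32, 3))      # stand-in for any CIFAR-10 image in [0, 1]
grey = np.full_like(x, 0.5)      # the solid mid-grey image

# Every pixel value in [0, 1] is within 0.5 of mid-grey, so the l_infinity
# distance from *any* image to solid grey is at most 0.5 < 0.6.
print(np.abs(x - grey).max())          # <= 0.5 for every possible x
print(np.abs(x - grey).max() <= 0.6)   # True: grey lies inside the eps=0.6 ball
```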