
The problem of adversarial examples is known to be critical for neural networks. For example, an image classifier can be manipulated by additively superimposing onto each of many training examples a different low-amplitude image that looks like noise but is designed to produce specific misclassifications.
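
For concreteness, here is a minimal sketch of how such a low-amplitude, noise-like perturbation can be crafted with the fast gradient sign method; `model`, `image` and `label` are placeholders for any differentiable PyTorch classifier and a single input batch, not part of any particular system.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, eps=0.03):
    """Craft a low-amplitude, noise-like perturbation that pushes the
    classifier towards a misclassification.

    model: any differentiable classifier (torch.nn.Module)
    image: input batch of shape (N, C, H, W), values in [0, 1]
    label: true class indices of shape (N,)
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss the most.
    perturbation = eps * image.grad.sign()
    return (image + perturbation).clamp(0.0, 1.0).detach()
```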


Since neural networks are applied to some safety-critical problems (e.g. self-driving cars), I have the following question:

What tools are used to ensure safety-critical applications are resistant to the injection of adversarial examples at training time?

Laboratory research aimed at developing defensive security for neural networks exists; a few examples can be found in the adversarial machine learning literature.

However, do industrial-strength, production-ready defensive strategies and approaches exist? Are there known examples of applied adversarial-resistant networks for one or more specific types of attack (e.g. for small perturbation limits)?

There are already (at least) two questions related to the problem of hacking and fooling neural networks. The primary interest of this question, however, is whether any tools exist that can defend against some adversarial example attacks.

Ilya Palachev
  • [Here](https://ai.stackexchange.com/q/15820/2444) and [here](https://ai.stackexchange.com/q/26448/2444) are two related questions. – nbro Jan 14 '22 at 12:04

2 Answers


However, do industrial-strength, production-ready defensive strategies and approaches exist? Are there known examples of applied adversarial-resistant networks for one or more specific types of attack (e.g. for small perturbation limits)?

I think it's difficult to tell whether or not there are any industrial-strength defenses out there (which I assume would mean that they'd be reliable against all, or most, known attack methods). Adversarial machine learning is indeed a highly active, and growing, area of research. Not only are new approaches for defending being published quite regularly, but there is also active research into different approaches for "attacking". With new attack methods being discovered frequently, it's unlikely that anyone can already claim to have defenses that would work reliably against them all.

The primary interest of this question, however, is whether any tools exist that can defend against some adversarial example attacks.

The closest thing to a ready-to-use "tool" that I've been able to find is IBM's Adversarial Robustness Toolbox, which appears to have various attack and defense methods implemented. It appears to be in active development, which is natural considering the area of research itself is also highly active. I've never tried using it, so I can't vouch personally for the extent to which it's easily usable as a tool for industry, or if it's maybe really only still suitable for research.
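
To give a rough idea of what using it can look like, here is a hedged sketch of adversarial training with ART; the toy model and random data are placeholders, and module paths and argument names have shifted between ART releases, so treat the exact names as approximate.

```python
import numpy as np
import torch.nn as nn
import torch.optim as optim
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod
from art.defences.trainer import AdversarialTrainer

# Toy CNN standing in for a real classifier (CIFAR-10-shaped inputs assumed).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 32 * 32, 10),
)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# ART wraps framework-specific models behind a common estimator interface.
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=optimizer,
    input_shape=(3, 32, 32),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# Random stand-in data, only so the sketch runs end to end.
x_train = np.random.rand(64, 3, 32, 32).astype(np.float32)
y_train = np.eye(10)[np.random.randint(0, 10, 64)].astype(np.float32)

# Adversarial training: mix clean batches with FGSM-perturbed ones.
attack = FastGradientMethod(estimator=classifier, eps=0.05)
trainer = AdversarialTrainer(classifier, attacks=attack, ratio=0.5)
trainer.fit(x_train, y_train, nb_epochs=1, batch_size=32)
```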


Based on comments by Ilya, other frameworks that may be useful to consider are CleverHans and Foolbox.
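
For what it's worth, here is a minimal sketch of how Foolbox (its 3.x API, as far as I can tell) is typically used to measure robustness against one attack; `model`, `images` and `labels` are placeholders for a trained PyTorch classifier and a batch of test data.

```python
import foolbox as fb

# model: a trained torch.nn.Module; bounds describe the valid input range.
fmodel = fb.PyTorchModel(model.eval(), bounds=(0.0, 1.0))

# images, labels: torch tensors with a batch of test inputs and class indices.
attack = fb.attacks.LinfPGD()
raw, clipped, success = attack(fmodel, images, labels, epsilons=0.03)

# Fraction of inputs the attack failed to fool = empirical robust accuracy.
robust_accuracy = 1.0 - success.float().mean().item()
print(f"Robust accuracy at eps=0.03: {robust_accuracy:.3f}")
```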

Dennis Soemers
    I'd like to add [CleverHans](https://github.com/tensorflow/cleverhans) and [Foolbox](https://github.com/bethgelab/foolbox), which are the competitors of IBM Adversarial Robustness toolbox. Do you know how these three solutions can be compared? – Ilya Palachev Jul 25 '18 at 16:21
  • @IlyaPalachev From a quick glance through their descriptions, I got the impression that those repositories only contain attack methods, not defense methods. I'm not familiar enough with them to tell for sure if that's correct, that's just the impression I got. Obviously, such repositories would still be used for benchmarking defense methods, so searching for places that cite usage of these repositories may lead to interesting defense methods. – Dennis Soemers Jul 25 '18 at 17:02
    CleverHans guys are [implementing](https://github.com/tensorflow/cleverhans/pull/423) defense sub-framework now. In recent [survey](https://dl.acm.org/citation.cfm?id=3134599) about this area they also claim CleverHans to contain "reference implementations of several attack and defense procedures" (p. 85). – Ilya Palachev Jul 25 '18 at 17:43
    @IlyaPalachev Ah I see. For the sake of completeness, I edited them into my answer, where they might be easier to see for future visitors of the site than in the comments. – Dennis Soemers Jul 25 '18 at 17:56

Another point of view: in safety-critical real-world systems, this attack should be evaluated from other angles as well.
In many systems the threat is largely limited to physical attacks. For example, you can't add digital noise to a camera used for autonomous driving; you would need to print an adversarial stop sign, say, and place it somewhere it is still viewed and misinterpreted from several viewpoints, angles, lighting and weather conditions, and so on.
Given that, I think the overall current risk of adversarial examples for scalable attacks on real-world mission-critical systems isn't very high for now.
That's why such work exists in companies at the research level, but not yet in production.

rkellerm