
Why couldn't you take the image an AI is given, apply several different random noise filters to it, and use the most common response across those versions as the AI's output? As it stands, adversarial attacks require the noise they add to the image to be very specific. If you applied light noise that disrupted the attack, while using multiple noisy versions so you don't disrupt your own classifier, couldn't that counter the attack? And since the noise is random at runtime, the attacker would have no way of working around it.
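
To be concrete, here is a minimal sketch of what I have in mind, assuming some PyTorch classifier `model` that maps a batch of image tensors to class logits (the noise scale and copy count are arbitrary placeholders):

```python
import torch

def vote_predict(model, image, n_copies=10, sigma=0.1):
    """Classify n_copies independently noised versions of a single
    (C, H, W) image and return the most common predicted class."""
    with torch.no_grad():
        # Each copy gets its own fresh Gaussian noise at runtime.
        noisy = image.unsqueeze(0).repeat(n_copies, 1, 1, 1)
        noisy = noisy + sigma * torch.randn_like(noisy)
        preds = model(noisy).argmax(dim=1)      # one class per copy
        return torch.mode(preds).values.item()  # majority vote
```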

Ethan

1 Answer


Adding noise to the signal after training:

It might disrupt attacks that are unaware of the noise, but there are no guarantees. The hope is that the adversary doesn't wise up to the noise and train their attack with the noise in mind; otherwise you enter a cat-and-mouse game. Noise can, of course, also reduce the accuracy of some models.

Adding noise to the signal during training:

Accuracy in the face of noise is a harder problem than accuracy without noise, and harder problems generally require bigger models. Depending on the platform, there may not be budget for a bigger model (a car ECU, a phone, etc.).
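
A hedged sketch of what training with noise in mind might look like, assuming a PyTorch `model`, `loader`, and `optimizer` already exist (`sigma` is an illustrative scale, not a recommendation):

```python
import torch
import torch.nn.functional as F

def train_epoch(model, loader, optimizer, sigma=0.1):
    """One epoch of training on noise-perturbed inputs."""
    model.train()
    for images, labels in loader:
        # Fresh Gaussian noise per batch, so the model must stay
        # accurate in the presence of noise, not just on clean inputs.
        noisy = images + sigma * torch.randn_like(images)
        loss = F.cross_entropy(model(noisy), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```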

And of course it's hard to tell the difference between generalizing to all noise and generalizing against specific noise. If the model has not generalized to all noise, there may still exist noise that causes undesirable behavior. Welcome back to the previously discussed cat-and-mouse game.
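
One rough way to probe that gap: evaluate the same model under noise families it was not trained on. The sketch below assumes `model` and `loader` already exist; the three noise functions are illustrative choices, not an exhaustive set.

```python
import torch

# Illustrative noise families; a model trained against Gaussian noise
# may not hold up under the other two.
noise_families = {
    "gaussian": lambda x: x + 0.1 * torch.randn_like(x),
    "uniform": lambda x: x + 0.1 * (2 * torch.rand_like(x) - 1),
    "salt_pepper": lambda x: torch.where(
        torch.rand_like(x) < 0.05, torch.rand_like(x).round(), x
    ),
}

def accuracy_under(model, loader, noise_fn):
    """Top-1 accuracy of `model` when inputs pass through `noise_fn`."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(noise_fn(images)).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# If accuracy holds up only for the training-time noise family,
# the model has generalized against specific noise, not all noise.
for name, fn in noise_families.items():
    print(name, accuracy_under(model, loader, fn))
```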

foreverska
  • How do you train a neural network that generates adversarial images to remain adversarial in the presence of random noise? How does it train to work around random values when it doesn't know what they're going to be? – Ethan Aug 24 '23 at 19:31
  • Neural networks can be described as making simplifying assumptions driven by the constraints under which they were trained. A network has not seen every car, yet it can identify a previously unseen car because of its simplifying assumptions. A network hellbent on causing another network to misclassify something, given the constraint that there will be noise, will make the simplifying assumption that its perturbations must survive noise. It may not work on all noise, just as a classifier might miss a car, but it could still be effective enough (a rough sketch of this style of attack follows these comments). – foreverska Aug 24 '23 at 20:06
  • The adversary's life is made much easier if they happen to know what type of noise will be inserted (by reverse engineering, etc.), so that the noise can be included in their training. – foreverska Aug 24 '23 at 20:10
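
To make the comments concrete, here is a rough sketch of an expectation-over-transformation style attack: the attacker averages the loss over random noise draws so the perturbation tends to survive the defender's runtime randomization. A differentiable PyTorch `model` and knowledge of the noise distribution are assumed; every name and constant is illustrative.

```python
import torch
import torch.nn.functional as F

def eot_attack(model, image, label, steps=100, lr=0.01,
               n_draws=8, sigma=0.1, eps=0.03):
    """Find a small perturbation whose *average* effect over random
    noise draws is still adversarial (untargeted). `image` is a
    (1, C, H, W) tensor, `label` its true class index tensor."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        loss = torch.zeros(())
        for _ in range(n_draws):
            # Simulate the defender's runtime noise during the attack.
            noisy = image + delta + sigma * torch.randn_like(image)
            # Maximize classification loss -> minimize its negative.
            loss = loss - F.cross_entropy(model(noisy), label)
        opt.zero_grad()
        (loss / n_draws).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation subtle
    return (image + delta).detach()
```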