
A common illustration of how a CNN works is the following: https://www.researchgate.net/figure/Learned-features-from-a-Convolutional-Neural-Network_fig1_319253577. It seems to suggest that a CNN classifies images in a similar manner to how a human does (i.e., based on visual features).

So why do adversarial attacks, such as FGSM, still work on CNNs? If the perturbation is strong enough for the CNN to pick up, shouldn't a human also be able to tell the difference?
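For context, FGSM nudges each pixel by a small step in the direction of the sign of the loss gradient, so the change is bounded by a tiny epsilon per pixel and is typically invisible to a human while still flipping the CNN's prediction. Below is a minimal PyTorch-style sketch of the idea; `model`, `x`, `y`, and `epsilon` are hypothetical placeholders, not taken from the question.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.01):
    # x: input image tensor in [0, 1] with a batch dimension; y: true label indices.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Move each pixel by +/- epsilon in the direction that increases the loss.
    x_adv = x + epsilon * x.grad.sign()
    # Keep the result a valid image; the perturbation stays within epsilon per pixel.
    return x_adv.clamp(0, 1).detach()
```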

  • I don't understand the question "If the perturbation is strong enough for CNN to pick up, shouldn't human also be able to tell the difference?" Are you interested in knowing whether humans are also fooled in the same way CNNs are fooled? – nbro Jan 14 '23 at 22:06
  • A common assumption is that a successful adversarial example should only fool the model, because otherwise it defeats its purpose – Sam Jan 15 '23 at 09:52
  • I am not sure this assumption is required, but maybe that's what people assume. It seems that you have 2 distinct questions: 1. why can we fool neural networks given that they seem to classify images in the same way as we do and we are not fooled, 2. if humans classify images as neural networks do, then why aren't we also fooled? – nbro Jan 15 '23 at 09:57
  • Yes, but I'm discussing CNN-type models in particular. As you can see from the reference I provided, a CNN extracts hierarchical visual features at different layers, and it extracts patterns similarly to how human perception works, especially at the deeper layers (at least from what I understood) – Sam Jan 15 '23 at 10:14

0 Answers