
I have a transformer-encoder-only architecture with the following structure:

      Input
        |
        v
     ResNet(-50)
        |
        v
fully-connected (on embedding dimension)
        |
        v
positional-encoding
        |
        v
transformer encoder
        |
        v
Linear layer to alphabet.
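For concreteness, here is a minimal PyTorch sketch of this pipeline. All names and dimensions are placeholders I chose for illustration, and a tiny CNN stands in for ResNet-50:

```python
import torch
import torch.nn as nn

class EncoderOnlyModel(nn.Module):
    """Hypothetical sketch: CNN backbone -> FC -> pos. enc. -> encoder -> alphabet."""
    def __init__(self, d_model=64, alphabet_size=37):
        super().__init__()
        # Stand-in for ResNet-50: a small CNN producing a spatial feature map.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
        )
        self.proj = nn.Linear(128, d_model)                # FC on embedding dim
        self.pos = nn.Parameter(torch.zeros(1, 256, d_model))  # learned pos. encoding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, alphabet_size)      # linear layer to alphabet

    def forward(self, x):
        f = self.backbone(x)                   # (B, C, H', W')
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, H'*W', C): one token per cell
        tokens = self.proj(tokens) + self.pos[:, : h * w]
        return self.head(self.encoder(tokens)) # (B, H'*W', alphabet)

model = EncoderOnlyModel()
logits = model(torch.randn(2, 3, 64, 64))
print(logits.shape)  # 64x64 input -> 2x2 feature grid -> 4 tokens
```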

I am trying to visualize the self-attention of the encoder layers to check how each input token attends to the others (e.g. as in https://github.com/jessevig/bertviz).

The difficulty is in visualizing these attention weights in terms of the original input to the ResNet rather than its output, so that the model becomes visually interpretable.
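One idea I have been toying with (numbers below are illustrative): since ResNet-50 downsamples by a factor of 32, each encoder token corresponds to roughly a 32x32 patch of the input, so an attention row could be reshaped back to the feature-map grid and upsampled to input resolution as a heatmap:

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: map one token's attention row back to pixel space.
Hf, Wf, stride = 4, 8, 32  # feature-map grid; input would be 128x256
attn_row = torch.softmax(torch.randn(Hf * Wf), dim=0)  # attention from one query token

heat = attn_row.reshape(1, 1, Hf, Wf)  # restore spatial layout of the tokens
heat = F.interpolate(heat, scale_factor=stride, mode="bilinear",
                     align_corners=False)
print(heat.shape)  # (1, 1, 128, 256): overlayable on the input image
```

This assumes a purely spatial token-to-pixel correspondence, which the ResNet's large receptive field only approximates, so I am not sure it is the right way to go.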

Do you have any ideas or suggestions?

asked by John Sig, edited by nbro

0 Answers