Assume I have an input of size $32 \times 32 \times 3$ and pass it to a convolutional layer. If my kernel size is $5 \times 5 \times 3$ and the depth of the layer is 1, only one feature map is produced for the image. Here, each neuron has $5 \times 5 \times 3 = 75$ weights (+1 bias).

If I wanted to calculate multiple feature maps in this layer, say 3, is each local section of the image (in this example, $5 \times 5 \times 3$) looked at by three different neurons, with each neuron's weights trained individually? And what would the output volume of this layer be?

nbro

1 Answer

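Yes. Each filter (or kernel) is independent of the others and produces its own feature map, so every $5 \times 5 \times 3$ local section is seen by three different filters whose weights are trained separately. With $3$ of these filters, your output shape would be $(28, 28, 3)$, assuming stride $1$ and no padding (since $32 - 5 + 1 = 28$), with a total of $75 \times 3 = 225$ trainable weights, plus $3$ biases (one per filter) if biases are used.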
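To check the arithmetic, here is a minimal sketch in plain Python. The helper names `conv_output_size` and `conv_layer_summary` are made up for illustration and do not come from any library; the sketch assumes a square kernel, the same stride and padding in both spatial dimensions, and one bias per filter.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_layer_summary(height, width, in_channels, kernel_size, num_filters,
                       stride=1, padding=0):
    """Output shape and parameter counts for one convolutional layer.

    Hypothetical helper: assumes a square kernel and one bias per filter.
    """
    out_h = conv_output_size(height, kernel_size, stride, padding)
    out_w = conv_output_size(width, kernel_size, stride, padding)

    # Each filter spans the full input depth: kernel_size x kernel_size x in_channels.
    weights_per_filter = kernel_size * kernel_size * in_channels
    weights = weights_per_filter * num_filters
    biases = num_filters

    return {
        "output_shape": (out_h, out_w, num_filters),
        "weights": weights,
        "biases": biases,
        "parameters": weights + biases,
    }


# The case from the question: 32 x 32 x 3 input, three 5 x 5 x 3 filters.
summary = conv_layer_summary(32, 32, 3, kernel_size=5, num_filters=3)
print(summary)
```

With stride $1$ and no padding this gives an output shape of $(28, 28, 3)$ and $225$ weights. Passing `padding=2` instead would keep the spatial size at $32 \times 32$ without changing the number of weights.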

nbro
mshlis