
I have an input tensor of shape $\mathbf{(3, 32, 32)}$ consisting of 3 channels, 32 rows, and 32 columns. I want to convolve the input tensor using a $\mathbf{(3 \times 3)}$ kernel/filter. How can I calculate the required FLOPs?

nbro
Mhasan502

1 Answer


Each output value (one pixel in one output channel) is computed by applying a 3x3x3 filter, so 27 inputs are multiplied by 27 weights and the products are added together. This is 27 FMA (fused multiply-add) operations, or 27 multiply operations and 26 additions. I believe all modern devices implement FMA.

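The number of output values is 30x30x3 = 2700 (with no padding and a stride of 1, a 3x3 kernel shaves off one pixel on each edge; this also assumes 3 output channels) and each one takes 27 FMA operations to calculate. So that's 72900 FMA operations in total. If multiplications and additions are counted separately, that is 72900 multiplications plus 2700x26 = 70200 additions, or 143100 FLOPs.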
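The arithmetic above can be checked with a short script. This is only a sketch: the function name `conv2d_op_counts` and its parameters are made up for illustration, and it assumes stride 1, no padding, no bias term, and 3 output channels (the question does not state the number of output channels).

```python
def conv2d_op_counts(c_in, h, w, k, c_out, stride=1, padding=0):
    """Count operations for a plain 2D convolution (no bias, square kernel)."""
    # Standard output-size formula for a convolution.
    h_out = (h + 2 * padding - k) // stride + 1
    w_out = (w + 2 * padding - k) // stride + 1

    # One output value per (output channel, row, column).
    outputs = c_out * h_out * w_out

    # Each output value needs one multiply per filter weight.
    macs_per_output = c_in * k * k

    macs = outputs * macs_per_output             # FMA / multiply count
    additions = outputs * (macs_per_output - 1)  # sums when not using FMA

    return {
        "output_shape": (c_out, h_out, w_out),
        "macs": macs,
        "additions": additions,
        "flops": macs + additions,  # multiplies and adds counted separately
    }


# Assumed setup: 3x32x32 input, 3x3 kernel, 3 output channels.
counts = conv2d_op_counts(c_in=3, h=32, w=32, k=3, c_out=3)
print(counts)
```

With a different number of output channels, only `c_out` changes and the totals scale linearly with it.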

user253751