
After reading about YOLOv3 and Faster R-CNN, I don't understand why the weights of the regression head aren't shared across all boxes of the same size. Since the backbone of these systems is fully convolutional, each output feature should depend only on the local region of the image that maps down to that position in the feature map (its receptive field). Given that we want the object detector to behave the same way regardless of where the object is in the image, shouldn't the weights be the same across anchors of the same size?
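
To make the premise concrete, here is a minimal pure-Python sketch of what a convolutional box-regression head looks like under my assumptions: a single 1x1 convolution that outputs 4 offsets per anchor shape at every feature-map cell. The function name `conv1x1_head` and the sizes `C`, `NUM_ANCHORS`, `H`, `W` are hypothetical and chosen only for illustration; this is not the actual code of either detector. In this sketch the weights differ per anchor shape (one group of 4 output channels each) but the same weights are applied at every spatial location.

```python
import random


def conv1x1_head(feature_map, weights, biases):
    """Apply a 1x1 convolution: the same weights at every (row, col) cell.

    feature_map: H x W x C nested lists
    weights:     (num_anchors * 4) x C nested lists
    biases:      list of length num_anchors * 4
    returns:     H x W x (num_anchors * 4) nested lists
    """
    return [
        [
            [sum(w * x for w, x in zip(w_row, cell)) + b
             for w_row, b in zip(weights, biases)]
            for cell in row
        ]
        for row in feature_map
    ]


random.seed(0)
C, NUM_ANCHORS = 8, 3  # hypothetical channel count and anchors per cell
weights = [[random.uniform(-1, 1) for _ in range(C)]
           for _ in range(NUM_ANCHORS * 4)]
biases = [random.uniform(-1, 1) for _ in range(NUM_ANCHORS * 4)]

# The parameter count depends on C and NUM_ANCHORS only, not on H or W.
num_params = len(weights) * C + len(biases)

H, W = 4, 5  # hypothetical feature-map size
fmap = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)]
        for _ in range(H)]
out = conv1x1_head(fmap, weights, biases)

# Shift the feature map one cell to the right (cyclically, so there are
# no border effects) and run the same head again.
shifted = [row[-1:] + row[:-1] for row in fmap]
out_shifted = conv1x1_head(shifted, weights, biases)

# Shifting the head's original output by the same amount gives the
# reference the shifted result is compared against.
expected = [row[-1:] + row[:-1] for row in out]

# Offsets predicted for anchor index 1 at cell (0, 0): channels 4..7.
anchor1_offsets = out[0][0][4:8]
```

In this sketch, `out_shifted` can be compared with `expected` to check that the predictions move with the features, which is the location-independent behaviour described above. Whether a given implementation shares weights in exactly this way is something I am assuming here rather than quoting from either paper.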

nbro
FourierFlux

0 Answers