As far as I know,

  1. FaceNet requires a square image as an input.

  2. MTCNN can detect the face and crop the original image, but resizing the crop to a square distorts it.

Is it okay to feed the converted (now square) distorted image into FaceNet? Does it affect the accuracy of the similarity calculation (embedding)?

For similarity (classification of known faces), I am going to add some custom layers on top of FaceNet.

(If it is okay, is that because every image would be distorted anyway? Then the comparison would be distorted image vs. distorted image rather than normal image vs. distorted image, which would be fair.)

Original issue: https://github.com/timesler/facenet-pytorch/issues/181.

  • You don't have to crop out the detection box as-is, you could crop out a square centred on the box. Have you tried that for comparison? – Neil Slater Oct 23 '21 at 09:03
  • @NeilSlater Yeah right, but the problem is, if I do so, unnecessary portions like hair will be included. For example, in general, a human's face is vertically long. If I crop the image as a square without distortion, then it's very likely the hair would be included. But the same person (especially a woman) can have long or short hair depending on the image, and I wonder whether the embedding computed by FaceNet would be affected by such a thing. Actually, FaceNet expects "only the face" in a square format as input, and I've seen examples that use distortion to make the face square. Would it be okay? – jjangga Oct 24 '21 at 04:13
  • 1
    I don't know if it is OK or not, it depends on what inputs the model was trained on. You either need advice from someone who has used the specific model that you are using, or you need to try the different approaches that you are concerned about and measure their accuracy. There is no general rule covering all image classifiers. – Neil Slater Oct 24 '21 at 07:43

1 Answer


As Neil Slater said, it depends on how the model was trained. If you look at the TensorFlow implementation of FaceNet on GitHub, you can see that its face alignment resizes without keeping the aspect ratio (the old scipy.misc.imresize does not preserve aspect ratio), so if you are using that implementation, the answer to your question is: it does not affect the accuracy.

But to elaborate, the golden rule is: do not change the face's aspect ratio, because that distorts the input data used for the embedding computation.

In most face recognition pipelines, once you have a detected bounding box, you resize the crop while keeping the aspect ratio and pad it to a square. You can accomplish this with several common image libraries, and I would recommend a resize that preserves the aspect ratio.
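A minimal sketch of such an aspect-ratio-preserving resize ("letterboxing") using Pillow: the face crop is scaled so its longer side fits the target size, then padded to a square. The 160×160 target below is an assumption based on the input size commonly used by FaceNet implementations; check what your specific model expects.

```python
from PIL import Image

def resize_keep_aspect(img: Image.Image, size: int = 160,
                       fill=(0, 0, 0)) -> Image.Image:
    """Resize `img` to a size x size square without changing its aspect
    ratio, padding the shorter dimension with `fill`."""
    w, h = img.size
    scale = size / max(w, h)                 # fit the longer side to `size`
    new_w, new_h = round(w * scale), round(h * scale)
    resized = img.resize((new_w, new_h), Image.BILINEAR)
    # paste the resized crop centred on a square canvas
    canvas = Image.new("RGB", (size, size), fill)
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))
    return canvas

# e.g. a 120x200 portrait crop is scaled to 96x160, then padded to 160x160
square = resize_keep_aspect(Image.new("RGB", (120, 200)), 160)
print(square.size)  # (160, 160)
```

The padding colour is a design choice; black is common, but some pipelines pad with the image's mean colour or replicate edge pixels instead.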
