
FaceNet uses a triplet loss to train a model that outputs embeddings (128-D in the paper) such that any two faces of the same identity are a small Euclidean distance apart, while the distance between faces of different identities exceeds the same-identity distance by at least a specified margin. However, it needs a separate mechanism (e.g. HOG or MTCNN) to detect and extract faces from images in the first place.
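To make the objective concrete, here is a minimal sketch of the triplet loss for a single (anchor, positive, negative) triplet, using squared Euclidean distance on plain Python lists. The function names and the default `margin=0.2` are illustrative assumptions, not code from the paper or from any library; a real training setup would compute this over batches in a deep learning framework.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (embeddings are typically normalised)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed, illustrative value).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Easy triplet: positive coincides with the anchor, negative is far away.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])

# Hard triplet: the "positive" is far away and the "negative" coincides.
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

The easy triplet already satisfies the margin, so it contributes no loss, while the hard one is penalised. Nothing in this formulation is specific to faces: it only needs a way to say which crops share an identity.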

Can this idea be extended to object recognition? That is, can an object detection framework (e.g. Mask R-CNN) be used to extract the bounding box of an object, crop the object, feed the crop to a network trained with triplet loss, and then compare the embeddings of two objects to see whether they are the same object?
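The pipeline I have in mind could be sketched roughly as follows. This is only an illustration under stated assumptions: the bounding boxes are taken as given (in practice they would come from a detector), `toy_embed` is a hypothetical stand-in for a network trained with triplet loss, and the threshold value is arbitrary.

```python
import math


def crop(image, box):
    """Cut a patch out of an image stored as a list of pixel rows.

    box is (x1, y1, x2, y2) with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_embed(patch):
    """Hypothetical stand-in for a trained embedding network.

    Returns [mean intensity, fraction of pixels above 0.5].
    """
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    bright = sum(1 for p in pixels if p > 0.5) / len(pixels)
    return [mean, bright]


def same_object(image_a, box_a, image_b, box_b, embed_fn, threshold=0.5):
    """Crop both detections, embed them, and compare by distance."""
    emb_a = embed_fn(crop(image_a, box_a))
    emb_b = embed_fn(crop(image_b, box_b))
    return euclidean(emb_a, emb_b) < threshold


# A 4x4 toy image: dark left half, bright right half.
img = [[0, 0, 1, 1] for _ in range(4)]
left_box = (0, 0, 2, 4)   # assumed detector output
right_box = (2, 0, 4, 4)  # assumed detector output

match = same_object(img, right_box, img, right_box, toy_embed)
mismatch = same_object(img, left_box, img, right_box, toy_embed)
```

Here the detector and the embedding network are fully decoupled, which is the same split FaceNet relies on with HOG or MTCNN.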

Is there any research that has been done or any published public datasets for this?

cngzz1
  • Check this page out; it describes how to apply triplet loss to a network: https://towardsdatascience.com/image-similarity-using-triplet-loss-3744c0f67973 – user784446 Jul 19 '20 at 17:57

0 Answers