
I have developed face recognition algorithms using pre-built libraries in Python and OpenCV. However, suppose I want to build my own neural network algorithm for face recognition: what steps do I need to follow?

I have just watched Andrew Ng's course videos (specifically, 70 of them so far).

  • Please elaborate on what you mean by a neural network algorithm. Do you mean the architecture, the learning algorithm, or the activity rule? It seems that you want to build an architecture, I guess. – naive Aug 18 '19 at 06:21

3 Answers


My approach to improving detection on video is to use object tracking algorithms.

More specifically, I first detect the object using a trained classifier. Then I track the object with the KCF algorithm. If the tracker loses the object, I call the classifier again.

– dasmehdix

You could try building a Siamese network and training it on a large set of faces.

Two identical networks are used: one takes the known signature for the person, and the other takes a candidate signature. The outputs of both networks are combined and scored to indicate whether the candidate signature is genuine or a forgery. The deep CNNs are first trained to discriminate between examples of each class. The models are then re-purposed for verification, to predict whether new examples match a template for each class. Specifically, each network produces a feature vector for an input image, and the two vectors are compared using the L1 distance and a sigmoid activation. The same approach carries over to faces.
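The twin-network-plus-L1-distance idea can be sketched in PyTorch. This is a toy architecture of my own choosing (layer sizes are arbitrary), not a reference implementation; the key points are the shared encoder and the sigmoid-scored component-wise L1 distance:

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    def __init__(self):
        super().__init__()
        # one encoder, applied to both inputs, so the twins share weights
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.LazyLinear(128),  # embedding vector for each image
        )
        self.head = nn.Linear(128, 1)  # scores the distance vector

    def forward(self, a, b):
        fa, fb = self.encoder(a), self.encoder(b)
        # component-wise L1 distance between the two embeddings,
        # squashed to a match probability by a sigmoid
        return torch.sigmoid(self.head(torch.abs(fa - fb)))

net = SiameseNet()
known = torch.randn(4, 1, 64, 64)      # e.g. enrolled face images
candidate = torch.randn(4, 1, 64, 64)  # images to verify
score = net(known, candidate)          # shape (4, 1), values in (0, 1)
```

Training would use pairs labelled same/different person with a binary cross-entropy loss; at verification time, a new face is compared against the stored template through the same forward pass.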

    Hi. It's nice that you're trying to help, but we expect users to provide answers with more useful information. So, please, explain why Siamese networks are potentially useful to solve the task. – nbro Sep 07 '20 at 17:48
    @PratheeshKumar Welcome to AI Stack Exchange! - use [edit] to add those details to your existing answer. The goal is to create good questions and good answers that match, and you are encouraged to edit your answer to make it as good as it can be. The comments here are for us to discuss what would make the answer better. – Neil Slater Sep 07 '20 at 18:19
    I've moved your comment into the answer itself, to make the answer as a whole a bit more complete (as suggested by Neil) – Dennis Soemers Sep 13 '20 at 10:23

For the Construction of Deep Learning Models

Backbone deep learning models, which can be applied to a variety of deep learning tasks (including facial recognition), have been implemented in a range of libraries available in Python. I'm assuming that by constructing your own algorithm you mean a novel implementation of the model structure. Taking the PyTorch framework as an example, some common pretrained models are available here:

https://github.com/pytorch/vision/tree/master/torchvision/models

To train a novel face recognition model you could follow the tutorial for object detection available here: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html and make changes to the model.

In the tutorial they use model features from the library in the following section of code:

# load a pre-trained model for classification and return
# only the features
backbone = torchvision.models.mobilenet_v2(pretrained=True).features

For the simplest example, torchvision.models.AlexNet's features look like this:

self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )

Adding or subtracting layers from this backbone feature extractor would result in a new "algorithm" for object detection. If you want to know exactly what mathematical operation each of these layers performs, you can look at the PyTorch documentation. For example, for the nn.ReLU layer: https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html

Applies the rectified linear unit function element-wise:

$$ \operatorname{ReLU}(x) = (x)^{+} = \max(0, x) $$
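For instance, applied element-wise to a small tensor:

```python
import torch
import torch.nn as nn

relu = nn.ReLU()
x = torch.tensor([-2.0, -0.5, 0.0, 3.0])
print(relu(x))  # tensor([0., 0., 0., 3.])
```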