Transfer Learning in Deep Neural Networks - The easiest way to describe transfer learning is as the use of previously acquired knowledge and skills in new learning or problem-solving situations.
This is the same way in which we humans learn. Some simple examples would be -
- Know how to ride a motorbike ⮫ Learn how to ride a car
- Know how to play classic piano ⮫ Learn how to play jazz piano
In transfer learning, we reuse the weights of one or more layers from a pre-trained neural network model in our new model and either keep the weights fixed, fine-tune them, or adapt them entirely while training the model.
I've used the concept of transfer learning in deep neural networks to train on face images and perform face recognition. The pre-trained model that I've used is VGG16.
I've divided this project into 2 parts-
1) Creating a face dataset using OpenCV.
2) Training the VGG16 model by transfer learning and testing it.
Here's the link to my GitHub repository-
https://github.com/aayushi1908/Face-Recognition-by-TRANSFER-LEARNING.git
1) For dataset creation - I've used the OpenCV library and Haar Cascade classifiers for detecting the face. The dataset has been divided into two sets - a train set and a test set. For the train set, the code captures 1000 images of the person's face; for the test set, it captures 500 images.
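The capture step above can be sketched as follows. This is a minimal illustration assuming a webcam at index 0 and the frontal-face Haar Cascade bundled with OpenCV; the `dataset/train` path and file-naming scheme are my own placeholders, not necessarily those used in the repository.

```python
import os

def sample_path(root, name, idx):
    """Path where the idx-th face crop of a person is saved, e.g. dataset/train/alice/0.jpg."""
    return os.path.join(root, name, f"{idx}.jpg")

def capture_faces(name, count, root="dataset/train"):
    """Grab `count` face crops from the webcam using a Haar Cascade detector."""
    import cv2  # imported here so the path helper above works without OpenCV installed
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(0)
    os.makedirs(os.path.join(root, name), exist_ok=True)
    saved = 0
    while saved < count:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
            face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))  # VGG16 input size
            cv2.imwrite(sample_path(root, name, saved), face)
            saved += 1
            break  # keep only one face per frame
    cap.release()
```

Run `capture_faces("alice", 1000)` for the train set and `capture_faces("alice", 500, root="dataset/test")` for the test set, once per person.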
About Haar Cascade - Haar Cascade is a machine learning object detection algorithm used to identify objects in an image or video. The classifier is trained on a set of positive images (which contain the object) and negative images (which do not). The training is generally done on a server and in multiple stages.
Here are the screenshots of the datasets that I've created for my friend's face and mine-
2) For training of VGG16 and testing the model -
Firstly, I've imported the VGG16 model along with its ImageNet weights. In my new model, I have frozen all the intermediate layers of VGG16 except the input and output layers, and I've trained the output layer according to my requirement.
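A minimal Keras sketch of this setup follows; the 256-unit dense layer and the choice of optimizer are my own illustrative assumptions, not necessarily what the repository uses.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

def build_model(num_classes=2, weights="imagenet"):
    """VGG16 with frozen convolutional layers and a new trainable classifier head."""
    base = VGG16(weights=weights, include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False  # keep the pre-trained ImageNet weights fixed
    x = Flatten()(base.output)
    x = Dense(256, activation="relu")(x)               # illustrative head size
    out = Dense(num_classes, activation="softmax")(x)  # one unit per person
    model = Model(inputs=base.input, outputs=out)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Because only the new head is trainable, the model can be fit on a small face dataset in reasonable time on a CPU.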
About VGG16 - VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman from the University of Oxford. The model achieves 92.7% top-5 test accuracy on ImageNet, which is a dataset of over 14 million images belonging to 1000 classes. It was one of the famous models submitted to ILSVRC-2014. It improves over AlexNet by replacing large kernel-sized filters (11×11 and 5×5 in the first and second convolutional layers, respectively) with multiple 3×3 filters stacked one after another.
With the help of transfer learning, I'm able to achieve an accuracy of 94.70% without the use of a GPU or large amounts of RAM.
For testing, the model predicts whether any image provided to it is of either of the two persons' faces. The testing can also be done through the webcam of a laptop/PC.
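The prediction step can be sketched like this. The label list is a placeholder for the two people in the dataset, and the mean-subtraction preprocessing (the same per-channel BGR normalization VGG16 was trained with) is written out explicitly for clarity.

```python
import numpy as np

LABELS = ["person_1", "person_2"]  # assumed class order from training

def preprocess(image_bgr):
    """Subtract the ImageNet per-channel BGR mean, as VGG16's preprocessing does."""
    mean = np.array([103.939, 116.779, 123.68], dtype="float32")
    return image_bgr.astype("float32") - mean

def predict_face(model, image_bgr):
    """image_bgr: a 224x224x3 face crop (loaded from disk or cut from a webcam frame)."""
    x = preprocess(image_bgr)[np.newaxis, ...]  # add the batch dimension
    probs = model.predict(x)[0]
    i = int(np.argmax(probs))
    return LABELS[i], float(probs[i])
```

For webcam testing, the same Haar Cascade detection used during dataset creation can crop a face from each frame before passing it to `predict_face`.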