Face recognition – can we identify “Boy” from “Alien”?

The question is: can we identify “Boy” from “Alien”?

Face recognition addresses the “who is this person?” question. It is a 1:K matching problem: given a database of K faces, we have to identify which of them matches the given input image.

FaceNet is a TensorFlow implementation of the face recognizer described in the paper “FaceNet: A Unified Embedding for Face Recognition and Clustering”.

FaceNet learns a neural network that encodes a face image into a vector of 128 numbers. By comparing two such vectors, we can determine whether two pictures are of the same identity. FaceNet is trained by minimizing the triplet loss. For more information on the triplet loss, refer to https://machinelearning.wtf/terms/triplet-loss/
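As a quick illustration, here is a minimal sketch of the triplet loss in TensorFlow; the margin alpha=0.2 and the way the three encodings are unpacked from y_pred are my assumptions, not taken from the course code:

```python
import tensorflow as tf

def triplet_loss(y_true, y_pred, alpha=0.2):
    # y_pred is assumed to hold the anchor, positive and negative 128-d encodings
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    # squared L2 distance between anchor-positive and anchor-negative pairs
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # the anchor should be at least `alpha` closer to the positive than to the negative
    basic_loss = pos_dist - neg_dist + alpha
    return tf.reduce_sum(tf.maximum(basic_loss, 0.0))
```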

Since training requires a lot of data and a lot of computation, I haven’t trained it from scratch here.

I have used a previously trained model instead: I took the Inception network implementation and weights from the fourth deeplearning.ai course on Coursera, “Convolutional Neural Networks”.

The network architecture follows the Inception model from [Szegedy *et al.*](https://arxiv.org/abs/1409.4842).

More details about Inception v1 are in this blog: https://www.analyticsvidhya.com/blog/2018/10/understanding-inception-network-from-scratch/

This network uses 96×96 dimensional RGB images as its input. It encodes each input face image into a 128-dimensional vector.

First, for each image of “Alien” and “Boy” (52 images of each), I computed its encoding and stored it in a database.

Here is the code that does that

cat1
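The cat1 screenshot above builds this database. Roughly, it does something like the following sketch, where FRmodel is the pre-trained FaceNet model loaded from the course weights, and the folder layout and the helper name img_to_encoding are my assumptions:

```python
import glob
import os
import numpy as np
from tensorflow.keras.preprocessing import image as keras_image

def img_to_encoding(image_path, model):
    # load the picture at the 96x96 size the network expects and scale pixels to [0, 1]
    # (the exact preprocessing should match how FRmodel was trained)
    img = keras_image.load_img(image_path, target_size=(96, 96))
    x = np.expand_dims(keras_image.img_to_array(img) / 255.0, axis=0)
    return model.predict(x)[0]   # 128-d encoding

# FRmodel is assumed to be the pre-trained FaceNet model from the course
database = {}
for label in ("Alien", "Boy"):
    for path in glob.glob(os.path.join("images", label, "*.jpg")):
        # key each encoding by label/filename so the closest match can be reported later
        database[label + "/" + os.path.basename(path)] = img_to_encoding(path, FRmodel)
```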

What happens when Alien and Boy pass through our face recognition system?

For each test image of “Alien” and “Boy”, we first compute the target encoding of the image from its path, then find the encoding in the database that has the smallest distance to the target encoding.

If the minimum distance (the L2 distance between the target encoding and the closest encoding in the database) is greater than 0.7, we assume the face is not in the database.

cat2
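The cat2 screenshot implements this lookup. A sketch of the matching logic, reusing the assumed img_to_encoding() helper and the 0.7 cutoff:

```python
import numpy as np

def who_is_it(image_path, database, model, threshold=0.7):
    # encode the test image, then find the closest encoding in the database
    target = img_to_encoding(image_path, model)
    identity, min_dist = None, float("inf")
    for name, encoding in database.items():
        dist = np.linalg.norm(target - encoding)   # L2 distance
        if dist < min_dist:
            min_dist, identity = dist, name
    if min_dist > threshold:
        return None, min_dist                      # face not in the database
    return identity, min_dist
```

who_is_it returns the label of the closest database entry together with its distance, or None when the minimum distance exceeds the threshold.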

 

When Alien tries to pass through our face recognition system

 

| Input test image of Alien | Result | Closest image |
|---|---|---|
| Alien1a | Alien | Alien1b |
| Alien-2a | Alien | Alien-2b |
| Alien-3a | Boy (wrong) | Alien-3b |
| Alien-4a | Alien | Alien-4b |
| Alien-5a | Alien | Alien-5b |
| Alien-6a | Alien | Alien-6b |

Note that there is no image in the database similar to the green-eyed image; its distance is 0.5105655. So maybe we should use a cutoff of 0.5 instead of 0.7.

 

When Boy tries to pass through our face recognition system

 

| Input test image of Boy | Result | Closest image |
|---|---|---|
| Boy1a | Alien (wrong) | Boy1b |
| Boy2a | Boy | Boy2b |
| Boy3a | Boy | Boy3b |
| Boy4a | Alien (wrong) | Boy4b |
| Boy5a | Alien (wrong) | Boy5b |
| Boy6a | Boy | Boy6b |

Overall the results look reasonable, although Boy is misidentified more often than Alien.

Summary

  • We should re-train FaceNet with Alien and Boy pictures to get better results.
  • Input images were only 96×96, so a lot of information may have been thrown away.
  • The model was trained on human faces, whose embeddings differ from those of cat faces.
  • I split the database images and the final test images by the dates on which the pictures were taken, assuming pictures from the same date would be similar. On inspection, I found that the incorrect results occur when a test image is very different from anything added to the database, i.e. it was effectively never seen before. This can be fixed by adding more varied images to the database.

 

Code Generation using an LSTM (Long Short-Term Memory) RNN

A recurrent neural network (RNN) is a class of neural network that performs well when the input and/or output is a sequence. RNNs use their internal state (memory) to process sequences of inputs.

Neural network models come in various input/output configurations:

  • One to one: image classification, where we give an input image and the model returns the class to which the image belongs.
  • One to many: image captioning, where the input is a picture and the output is a sentence describing it.
  • Many to one: sentiment analysis, where the input is a tweet and the output is a class such as positive or negative.
  • Many to many: sequence-to-sequence models with an encoder-decoder architecture, e.g. language translation, where the input is a sentence in one language (say English) and the output is a sentence in another language (say French).

There are two popular variants of RNNs: LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit).

We should try both to see which one performs better for the problem we are trying to solve.

In this blog I have tried to generate new source code using an LSTM. Here are the steps.

Import required packages

cg1

Then set EPOCH and the batch size. These should be tuned properly.

cg2

In the preprocessing stage, I downloaded the OpenSSL source code from GitHub and concatenated all .c files into a file called “train.txt”. I was running out of memory, so I used only about a third of the OpenSSL files; the code could be improved to load the source in batches. The preprocessing stage also builds a vocabulary list, saves it to a file and reads it back.

cg3
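The cg3 screenshot covers this step. A rough sketch of the character-level preprocessing; the file name train.txt comes from the text above, while the sequence length of 100 and the variable names are my assumptions:

```python
import numpy as np

# read the concatenated OpenSSL sources
with open("train.txt", "r", encoding="utf-8", errors="ignore") as f:
    text = f.read()

# build the character vocabulary and lookup tables
chars = sorted(set(text))
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for i, c in enumerate(chars)}

# cut the text into fixed-length input sequences, each predicting the next character
SEQ_LENGTH = 100
X_data, y_data = [], []
for i in range(len(text) - SEQ_LENGTH):
    X_data.append([char_to_int[c] for c in text[i:i + SEQ_LENGTH]])
    y_data.append(char_to_int[text[i + SEQ_LENGTH]])

# reshape to (samples, time steps, features), normalize, and one-hot encode the targets
X = np.reshape(X_data, (len(X_data), SEQ_LENGTH, 1)) / float(len(chars))
y = np.eye(len(chars))[y_data]
```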

I have used a character-based model. We could also build a word-based model, and we could add a word-embedding layer, which becomes necessary for harder problems.

I have used two LSTM layers, each followed by a Dropout of 0.2, and a Dense layer with softmax at the end. We can try different architectures and compare them.

cg4
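The cg4 screenshot defines the network. A comparable Keras model, assuming 256 LSTM units per layer (the unit count in the original screenshot may differ) and the X and y arrays built in the preprocessing sketch:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential([
    # first LSTM layer returns sequences so the second LSTM can consume them
    LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True),
    Dropout(0.2),
    LSTM(256),
    Dropout(0.2),
    # one output unit per character in the vocabulary
    Dense(y.shape[1], activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
```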

Visualize the model as shown below

cg5
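The cg5 diagram is presumably produced with Keras’s plot_model utility, roughly like this:

```python
from tensorflow.keras.utils import plot_model

# writes a diagram of the layer stack to model.png (requires pydot and graphviz)
plot_model(model, to_file="model.png", show_shapes=True)
```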

I trained for 10 epochs. As you can see, the loss comes down gradually in every epoch, from 2.97 to 1.55.
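Once trained, new text is sampled character by character. A sketch of the generation loop, reusing X_data, chars and int_to_char from the preprocessing sketch; the greedy argmax sampling and the 1,000-character output length are my choices:

```python
import random
import numpy as np

# pick a random seed sequence from the training data as a starting point
start = random.randint(0, len(X_data) - 1)
pattern = list(X_data[start])

generated = []
for _ in range(1000):
    x = np.reshape(pattern, (1, len(pattern), 1)) / float(len(chars))
    # pick the most probable next character (greedy sampling)
    index = int(np.argmax(model.predict(x, verbose=0)))
    generated.append(int_to_char[index])
    # slide the window forward by one character
    pattern.append(index)
    pattern = pattern[1:]

print("".join(generated))
```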

Here is the output it generated; we gave it a random starting point.

cg6

As you can see, it has done a very good job: it returns values from a function based on an if condition and then starts another function.

Here is the code on GitHub. Please try it out and see.


Plant Seedlings Classification using Keras

This blog is dedicated to my friends who want to learn AI/ML/deep learning.

Explore the Plant Seedlings Classification dataset on Kaggle at https://www.kaggle.com/c/plant-seedlings-classification. The training set contains images of seedlings of 12 plant species, organized into one folder per species; each image’s filename is its unique id. The goal of the competition is to create a classifier capable of determining a plant’s species from a photo. For the test set we need to predict the species of each image.

You can download this code from here.

Start a new Kernel. First import all the required python modules

plant1

We can look at the contents of the ../input/train directory to see what it contains. Then we create two functions that convert the string class names of the plant seedlings to integers and back. This is for readability only (a sketch of these helpers follows the two screenshots below).

plant2

plant3
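A minimal sketch of those two helpers, assuming one sub-folder per species under ../input/train; the function names are mine:

```python
import os

# one folder per species under ../input/train
CATEGORIES = sorted(os.listdir("../input/train"))

def species_to_int(species):
    # e.g. a species name maps to its index in the sorted folder list
    return CATEGORIES.index(species)

def int_to_species(index):
    return CATEGORIES[index]
```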

Then we set the parameters of the model, such as the number of epochs, the learning rate and the batch size. The better we tune these, the better the results will be.

In training a neural network, one epoch means one pass over the full training set, and the batch size is the number of training examples used in one iteration. Here is a blog that explains the learning rate.

plant4

 

Then we read the training images and resize them all to 128×128.

plant5
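The plant5 screenshot reads and resizes the images. A sketch of that step using OpenCV, reusing the CATEGORIES list and species_to_int() helper assumed above:

```python
import os
import cv2
import numpy as np
from tensorflow.keras.utils import to_categorical

IMAGE_SIZE = 128
images, labels = [], []
for species in CATEGORIES:
    folder = os.path.join("../input/train", species)
    for filename in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, filename))
        # resize every image to 128x128 so the network gets a fixed input shape
        images.append(cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE)))
        labels.append(species_to_int(species))

X = np.array(images, dtype="float32") / 255.0            # scale pixels to [0, 1]
y = to_categorical(labels, num_classes=len(CATEGORIES))  # one-hot labels
```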

Then we create the model: we use three layers with the ReLU activation function, and in the last layer we add softmax.

In the context of artificial neural networks, the rectifier is an activation function. It enables better training of deeper networks compared to the activation functions widely used before 2011, i.e. the logistic sigmoid and its more practical counterpart, the hyperbolic tangent. As of 2018, the rectifier is the most popular activation function for deep neural networks. A unit employing the rectifier is also called a rectified linear unit (ReLU).

The softmax function is often used in the final layer of a neural network-based classifier. Such networks are commonly trained under a log loss (or cross-entropy) regime, giving a non-linear variant of multinomial logistic regression.

The loss function we use is categorical cross-entropy, with the Adam optimizer.

plant6
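The plant6 screenshot defines the network. A sketch of a comparable Keras model; the filter counts, kernel sizes and dense-layer width are my assumptions and may differ from the screenshot:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

model = Sequential([
    # three ReLU blocks, as described above
    Conv2D(32, (3, 3), activation="relu", input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(256, activation="relu"),
    Dropout(0.5),
    # one output per species, with softmax as the final layer
    Dense(len(CATEGORIES), activation="softmax"),
])
model.compile(loss="categorical_crossentropy",
              optimizer=Adam(learning_rate=1e-3),
              metrics=["accuracy"])
```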

Then we split the training data 75:25, compile the model and save it. We also use image augmentation: an ImageDataGenerator produces more images by slightly shifting the current ones.

plant7
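The plant7 screenshot performs the split, augmentation and training. A sketch of those steps, with BATCH_SIZE and EPOCHS taken to be the constants set earlier (plant4) and the specific augmentation parameters chosen by me:

```python
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# 75:25 train/validation split
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.25, random_state=42)

# augment the training images with small shifts, rotations and flips
datagen = ImageDataGenerator(rotation_range=20,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

model.fit(datagen.flow(X_train, y_train, batch_size=BATCH_SIZE),
          validation_data=(X_valid, y_valid),
          epochs=EPOCHS)
model.save("plant_seedlings_model.h5")
```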

The next step is to generate matplotlib plots and read the test data.

plant8

The output of this is shown below:

PlantOut1

The next step is to create the CSV submission file for the test data and upload it to the competition.

plant9
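The plant9 screenshot writes the submission. A sketch, assuming the test images have been loaded into X_test (prepared like the training images) with their filenames in test_filenames, which is what the plant8 step would produce:

```python
import pandas as pd

# predict a species for every test image and write the Kaggle submission file
predictions = model.predict(X_test)
predicted_species = [int_to_species(int(i)) for i in predictions.argmax(axis=1)]

submission = pd.DataFrame({"file": test_filenames,
                           "species": predicted_species})
submission.to_csv("submission.csv", index=False)
```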


Simple Neural Network Model using Keras and Grid Search Hyperparameter Tuning

In this blog, I explore using Keras with grid search to automatically run different neural network models by tuning hyperparameters (such as the number of epochs and the batch size).

I have used Jupyter Notebook for development.

The dataset is the UCI Credit Card dataset, available in CSV format. Download it from Kaggle: https://www.kaggle.com/uciml/default-of-credit-card-clients-dataset. A description of the fields is at https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients

The first step is to import the relevant packages and load the CSV file contents into a dataframe, then check the shape of the dataframe and inspect the column names.

1-keras

The output is shown below.

1out--keras

The dataset has 30,000 records and 25 columns. The “ID” column is not relevant for modelling, so we remove it. We put the remaining feature columns into an array called “X”. The target is the column “default.payment.next.month”, which we save in a variable called “Y”.

2-keras
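The 1-keras and 2-keras screenshots load and prepare the data. A sketch, assuming the Kaggle CSV is named UCI_Credit_Card.csv:

```python
import pandas as pd

df = pd.read_csv("UCI_Credit_Card.csv")   # assumed Kaggle file name for this dataset
print(df.shape)                           # expect (30000, 25)
print(df.columns)

# drop the ID column and keep the target separately
X = df.drop(columns=["ID", "default.payment.next.month"]).values
Y = df["default.payment.next.month"].values
```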

We define a function that creates the model. For now I have used simple parameters, but we can fine-tune it by adding more layers, etc.

2.5-keras

Then we create the model and set some parameters, like the number of epochs and batch_size, to try in the grid search.

3-keras
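The 2.5-keras and 3-keras screenshots define the model-building function and run the grid search. A combined sketch; the layer sizes and the exact epoch/batch-size grid are my assumptions, and newer Keras versions would use the scikeras wrapper instead of keras.wrappers.scikit_learn:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def create_model():
    # small fully connected network for the binary default/no-default target
    model = Sequential()
    model.add(Dense(12, input_dim=X.shape[1], activation="relu"))
    model.add(Dense(8, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)
param_grid = {"epochs": [1, 2, 3], "batch_size": [1000, 5000]}
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3)
grid_result = grid.fit(X, Y)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
```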

As we can see from the output above, various combinations of epochs and batch sizes were run. For now I have kept the number of epochs very small because training was taking a long time; we should test higher values as well.

Then we find out which combination gives the best score.

4-keras

As you can see in the output above, the best score was obtained with 1 epoch and a batch size of 5000. We can also grid-search other hyperparameters, such as activation functions, momentum, learning rates, dropout rates, weight constraints, the number of neurons, initializers and optimizer functions.

Full code is available here

The simplest Keras code for the MNIST dataset is here.

Coding is much simpler and easier if you use the Keras package.
