Fun With Facial Recognition
A gentle introduction to what it is and how it works
If you’ve ever wanted to put together a neat project to show your friends or coworkers but find all the math behind facial recognition unapproachable, then hopefully this guide will serve as a primer on some basic ideas behind facial detection and recognition and give you a solid jumping-off point for your own projects.
Topics to Cover
- Facial Detection — what constitutes a face, and how do we identify that numerically?
- Facial Recognition — OK, we found some faces, but who do they belong to?
- Demo Setup — diving in with a simple example
Facial Detection
In this demo, we’ll use OpenCV, a free-to-use computer vision library primarily used from C++ or Python. One of the simpler facial detection methods it offers leverages the Haar cascade classifier.
In the Haar classifier, each of these squares represents a small matrix of pixel weights where white is, for instance, 0 and black is 5. The matrix is placed on top of an image from a camera feed and slid across it pixel by pixel, multiplying the weights with the corresponding chunk of the image to measure whether or not that chunk closely resembles any of these 4 features. The idea here is that these features approximate the light and dark regions of a face, like so.
Since most of the image is not a face, it helps to discard swaths of the image identified as “not-face” early on. The cascading part of the cascade classifier divides the image into 24x24-pixel windows and applies the feature matrices to each one. As soon as the first few features determine that a window contains no face, it discards that window and moves on.
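To make the window-scanning idea concrete, here’s a toy sketch in plain NumPy. The two-rectangle feature and the 4x4 “image” are made up for illustration (they aren’t OpenCV’s trained features), but the mechanics are real: a Haar-like response is a difference of rectangle sums, and an integral image makes each rectangle sum a constant-time lookup.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns; region sums become O(1) lookups."""
    return img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using the integral image (r1, c1 exclusive)."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def two_rect_feature(ii, r, c, h, w):
    """A made-up two-rectangle feature: dark top half vs. light bottom half,
    loosely like the 'eyes are darker than cheeks' pattern."""
    top = region_sum(ii, r, c, r + h // 2, c + w)
    bottom = region_sum(ii, r + h // 2, c, r + h, c + w)
    return bottom - top

# A toy 4x4 "image": dark band on top, bright band below.
img = np.array([
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
], dtype=np.int64)

ii = integral_image(img)
print(two_rect_feature(ii, 0, 0, 4, 4))  # → 1520 (bright bottom minus dark top)
```

A strong response means the window’s light/dark layout matches the feature; a near-zero response means it doesn’t, and the cascade can bail out early.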
When the OpenCV Haar classifier was trained, it was given thousands of images of faces and non-faces. After applying the Haar features in a sliding-window fashion, it used Adaptive Boosting, or AdaBoost, to determine which feature combinations best represent a face.
AdaBoost from 10,000 feet
In adaptive boosting, each face is represented as a row of data. The columns are, simply put, the outputs of the matrix multiplication between each chunk of the image and those 4 filters. Each feature is represented as a decision tree with a single node (a “stump,” if you will), where the decision to be made is whether the chunk’s closeness to the given Haar feature falls above or below a certain cutoff.
As face/not-face is predicted on all the training data, the stumps/features that are correct more often are weighted more heavily in the next round, until each feature carries a weight representing how good it is at detecting a face. For example, features 2 and 3 in the features image above may end up weighted more heavily than features 1 and 4.
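Here’s a rough sketch of one round of that weighting loop. The feature responses, labels, and threshold are invented toy values, and this is generic AdaBoost with a single stump, not OpenCV’s actual training code, but it shows the two key moves: a low-error stump earns a large vote, and misclassified samples get upweighted for the next round.

```python
import numpy as np

def stump_predict(responses, threshold):
    """One-node decision tree: face (+1) if the feature response exceeds
    the cutoff, not-face (-1) otherwise."""
    return np.where(responses > threshold, 1, -1)

def adaboost_round(responses, labels, sample_weights, threshold):
    """One AdaBoost round: score the stump, then reweight the samples."""
    preds = stump_predict(responses, threshold)
    err = sample_weights[preds != labels].sum() / sample_weights.sum()
    # A low-error stump earns a large vote (alpha) in the final classifier.
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
    # Misclassified samples get boosted so the next stump focuses on them.
    new_weights = sample_weights * np.exp(-alpha * labels * preds)
    return alpha, new_weights / new_weights.sum()

# Toy responses of one feature on four windows; +1 = face, -1 = not-face.
responses = np.array([0.9, 0.3, 0.2, 0.4])
labels = np.array([1, 1, -1, -1])
weights = np.full(4, 0.25)

alpha, weights = adaboost_round(responses, labels, weights, threshold=0.5)
print(round(alpha, 3))  # → 0.549: a decent but imperfect stump
```

Note that sample 2 (response 0.3, a real face) is misclassified, so its weight jumps from 0.25 to 0.5 while the correctly classified samples shrink; the final classifier is a weighted vote over many such stumps.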
When we use the Haar face classifier, we’re rapidly applying all of those features to our image feed; using all the weighted decision stumps from training, the classifier can then decide whether or not the feed contains a face.
Facial Recognition
Now that we’ve identified a face in the feed, we need a way to figure out who it belongs to. Normally, this is done by labeling the face, taking a bunch of pictures of it, and computing something called the Local Binary Pattern Histogram (LBPH for short).
A sliding window is placed over the area our Haar cascade classifier identified as a face, and we convert each pixel into a white or black pixel based on whether it is brighter or darker than the central pixel of a 3x3 filter.
Repeating this process across the face yields a binary code for each pixel; the codes are binned into a histogram for each chunk of the face, and concatenating all of those chunks produces a final “signature.”
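Here’s a minimal, illustrative take on that pipeline in plain NumPy. The 2x2 grid and the neighbor ordering are simplifications I chose for the sketch; OpenCV’s LBPH implementation differs in details like grid size and circular neighborhoods.

```python
import numpy as np

def lbp_code(patch):
    """LBP code for the center of a 3x3 patch: each neighbor at least as
    bright as the center contributes one bit."""
    center = patch[1, 1]
    # Neighbors read clockwise starting from the top-left corner.
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, n in enumerate(neighbors) if n >= center)

def lbp_image(img):
    """LBP code for every interior pixel of a grayscale image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r - 1, c - 1] = lbp_code(img[r - 1:r + 2, c - 1:c + 2])
    return out

def lbp_signature(img, grid=2):
    """Split the LBP image into grid x grid regions and concatenate their
    256-bin histograms into one 'signature' vector."""
    codes = lbp_image(img)
    h, w = codes.shape
    hists = []
    for r in range(grid):
        for c in range(grid):
            region = codes[r * h // grid:(r + 1) * h // grid,
                           c * w // grid:(c + 1) * w // grid]
            hists.append(np.bincount(region.ravel(), minlength=256))
    return np.concatenate(hists)

rng = np.random.default_rng(1)
face = rng.integers(0, 256, (10, 10), dtype=np.uint8)
sig = lbp_signature(face)
print(sig.shape)  # → (1024,): 4 regions x 256 bins
```

Because each code only compares neighbors to their center, the signature is fairly robust to overall lighting changes, which is part of why LBPH works reasonably well on webcam feeds.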
With a trained recognition algorithm, we’re able to compare incoming video frames, which get turned into their own histograms, against the histograms we know from training. The distance is measured using the Euclidean distance formula, and the closest match between the input image and the trained images is identified appropriately.
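A toy sketch of that matching step (the histograms and names here are invented, and real LBPH signatures are far longer than four bins):

```python
import numpy as np

def euclidean(a, b):
    """Euclidean distance between two histogram 'signatures'."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def closest_match(probe, trained):
    """Return the label whose trained histogram lies closest to the probe."""
    return min(trained, key=lambda name: euclidean(probe, trained[name]))

# Toy 4-bin histograms standing in for full LBPH signatures.
trained = {
    "alice": np.array([9.0, 1.0, 0.0, 0.0]),
    "bob":   np.array([0.0, 0.0, 2.0, 8.0]),
}
probe = np.array([8.0, 2.0, 0.0, 0.0])
print(closest_match(probe, trained))  # → alice
```

In practice you’d also reject matches whose distance exceeds a threshold, so a stranger’s face isn’t forced onto the nearest known person.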
Implementing For Yourself
Lastly, this step will walk through what you can do to get set up yourself. It assumes you have OpenCV installed on your machine and your environment configured to use Python 3. You’ll need to install Jupyter if you haven’t already. The Jupyter notebook is available here, and you can run it by navigating to the project directory and running `jupyter notebook`.
Once in your browser at localhost:8888, navigate to the Facial Recognition Demo notebook.
This notebook is broken down into 4 main phases: a setup phase, where we import the requisite packages and initialize the camera configuration; a face-gathering phase, where we take 30 snapshots of the face for training the LBPH recognizer; the training of the LBPH recognizer; and finally using it to recognize faces. The first part is self-explanatory, so I won’t spend any time on it.
The second part is where it gets interesting. You’ll see this line:

```python
face_cascade = cv2.CascadeClassifier('./classifiers/haar_frontal_face.xml')
```

which is where we set up our cascade classifier to use the Haar facial features. There are other feature files you can use to detect different types of objects with cascade classifiers, but for now let’s focus on faces. The cascade part is just where we disregard areas of the image that we know contain no faces early on, and focus the testing of more Haar features on areas that are more likely to contain a face. Once we initialize our cascade classifier, we ask for a name to set up the face-id-to-name mapping and make things more user friendly.
The image capture process is pretty common throughout the code here (I know it’s repetitious; I meant for the cells to be runnable by themselves so I could reuse the code in later projects): we open the camera, convert the feed to grayscale, apply the face detector, do stuff with the faces it detected, then close the camera.
The next block trains the classifier by opening each face image saved in your datasets folder and building the histograms. It then saves the histogram for each face as a row of training data so it can compare incoming feeds later.
Finally, the recognition block. You may want to restart your notebook, as mine kept holding onto the camera reference and forced me to restart. You can also run this block in a separate .py file in the same project directory. This block opens the camera, converts the feed to grayscale, detects the face, builds a histogram of that face, draws a box around the region it thinks is a face, and, if it’s your face, writes your name on it.
Bringing it Together
Hopefully this was a fun and approachable introduction to facial recognition, and it serves as a solid jumping-off point for building your own facial recognition apps or diving into deeper aspects of ML and CV like CNNs and adaptive boosting. Hit me up on Twitter or LinkedIn if you have questions or comments. I don’t claim to be any sort of expert in this, so if you have suggestions, I’m all ears. Thanks for reading!