A note up front: this article borrows code and ideas from earlier write-ups, and some of the code is reused from other people's work.
Let's start with the thought process:
1. First, use OpenCV to detect the face region.
2. Once a face region is detected, crop it and save it as an image, to be used later as training data.
3. After enough data has been collected, build a CNN network and train it.
4. After training is completed, save the model.
5. Use OpenCV to read the video stream in real time, turn each detected face region into an image, and feed it to the model for prediction.
That is the basic idea behind this project.
1. The code for detecting faces with OpenCV is shown below. It is also available in OpenCV's official documentation. The most important step is loading the XML file, because this file stores the pre-trained face detection model.
import cv2


def identify_face(window_name, camera_idx):
    cv2.namedWindow(window_name)

    # Video source, either from a saved video or directly from a USB camera
    cap = cv2.VideoCapture(camera_idx)

    # Tell OpenCV to use the face detection classifier
    classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")

    # Color of the border to be drawn after a face is detected (BGR format)
    color = (0, 255, 0)

    # Loop as long as the camera device is open
    while cap.isOpened():
        ok, frame = cap.read()  # Read one frame of data
        if not ok:
            break

        # Convert the current frame to grayscale
        grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Face detection: 1.2 is the scale factor, 4 the minimum number of neighboring detections
        faceRects = classfier.detectMultiScale(grey, scaleFactor=1.2, minNeighbors=4, minSize=(32, 32))
        if len(faceRects) > 0:  # Greater than 0 means a face was detected
            for faceRect in faceRects:  # Frame each face individually
                x, y, w, h = faceRect  # Coordinates of the upper-left corner of the face, plus its width and height
                cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 2)

        # Display the image
        cv2.imshow(window_name, frame)
        c = cv2.waitKey(10)
        if c & 0xFF == ord('q'):
            break

    # Release the camera and destroy all windows
    cap.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    identify_face("identify face", 0)
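If the bare XML file name above is not found at runtime, the cascade that ships with the pip-installed opencv-python package can be located through cv2.data.haarcascades (available in recent OpenCV releases). A small sketch, assuming that package is installed:

import cv2

# Path to the cascade bundled with the opencv-python package
cascade_file = cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml"
classifier = cv2.CascadeClassifier(cascade_file)
print(classifier.empty())  # False means the cascade loaded successfully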
2. After a face region is detected, crop the face region and save it as an image.
import cv2
from threading import Thread


def identify_face_and_store_face_image(window_name, camera_idx):
    cv2.namedWindow(window_name)

    # Video source, either from a saved video or directly from a USB camera
    cap = cv2.VideoCapture(camera_idx)

    # Tell OpenCV to use the face detection classifier
    classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")

    # Color of the border to be drawn after a face is detected (BGR format)
    color = (0, 255, 0)

    # Index used when naming the saved images
    num = 0

    # Loop as long as the camera device is open
    while cap.isOpened():
        ok, frame = cap.read()  # Read one frame of data
        if not ok:
            break

        # Convert the current frame to grayscale
        grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Face detection: 1.2 is the scale factor, 4 the minimum number of neighboring detections
        faceRects = classfier.detectMultiScale(grey, scaleFactor=1.2, minNeighbors=4, minSize=(32, 32))
        if len(faceRects) > 0:  # Greater than 0 means a face was detected
            for faceRect in faceRects:  # Frame each face individually
                x, y, w, h = faceRect  # Coordinates of the upper-left corner of the face, plus its width and height
                cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 2)

                # store_face_image(frame, h, num, w, x, y)
                # Start a thread to store the face image
                t = Thread(target=store_face_image, args=(frame, h, num, w, x, y))
                t.start()

                # Show how many face shots have been captured so far
                font = cv2.FONT_HERSHEY_SIMPLEX  # Font
                cv2.putText(frame, ('num %d' % num), (x + 30, y + 30), font, 1, (255, 0, 255), 2)

                num += 1
                if num >= 1000:  # Quit after saving 1,000 images
                    break

        if num >= 1000:
            break

        # Display the image
        cv2.imshow(window_name, frame)
        c = cv2.waitKey(10)
        if c & 0xFF == ord('q'):
            break

    # Release the camera and destroy all windows
    cap.release()
    cv2.destroyAllWindows()


def store_face_image(frame, h, num, w, x, y):
    # Save the current face region as an image
    img_name = '%s/%d.jpg' % (r'face_image', num)
    image = frame[y - 10: y + h + 10, x - 10: x + w + 10]
    cv2.imwrite(img_name, image)


if __name__ == '__main__':
    identify_face_and_store_face_image("identify face", 0)
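One practical note: cv2.imwrite() fails silently (it just returns False) when the target directory does not exist, so it is worth creating the face_image directory before running the capture script. A minimal sketch:

import os

# Create the directory used by store_face_image() above, if it is not already there
os.makedirs('face_image', exist_ok=True)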
3. Next, the saved data is processed: labelled, normalized, and so on. The following code is in the load_datasets.py file.
import os
import sys

import numpy as np
import cv2

IMAGE_SIZE = 64


# Resize to the specified image size
def resize_image(image, height=IMAGE_SIZE, width=IMAGE_SIZE):
    top, bottom, left, right = (0, 0, 0, 0)

    # Get the image size
    h, w, _ = image.shape

    # For images whose height and width are unequal, find the longest edge
    longest_edge = max(h, w)

    # Work out how many pixels to pad the short edge so it becomes as long as the long edge
    if h < longest_edge:
        dh = longest_edge - h
        top = dh // 2
        bottom = dh - top
    elif w < longest_edge:
        dw = longest_edge - w
        left = dw // 2
        right = dw - left
    else:
        pass

    # RGB color
    BLACK = [0, 0, 0]

    # Pad the image; cv2.BORDER_CONSTANT fills the border with the color given by value
    constant = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)

    # Resize the image and return it
    return cv2.resize(constant, (height, width))


# Read the training data
images = []
labels = []


def read_path(path_name):
    for dir_item in os.listdir(path_name):
        # Build the full path from the initial path so it can be used for file operations
        full_path = os.path.abspath(os.path.join(path_name, dir_item))

        if os.path.isdir(full_path):  # If it's a folder, recurse into it
            read_path(full_path)
        else:  # It's a file
            if dir_item.endswith('.jpg'):
                image = cv2.imread(full_path)
                image = resize_image(image, IMAGE_SIZE, IMAGE_SIZE)

                # Uncomment this line to see the actual effect of the resize_image() call
                # cv2.imwrite('', image)

                images.append(image)
                labels.append(path_name)

    return images, labels


# Read training data from the specified path
def load_dataset(path_name):
    images, labels = read_path(path_name)

    # Convert all input images into a four-dimensional array of size (number of images x IMAGE_SIZE x IMAGE_SIZE x 3)
    # With 587 images in total and IMAGE_SIZE of 64, the size here is 587 x 64 x 64 x 3
    # Each image is 64 x 64 pixels with 3 color values per pixel (RGB)
    images = np.array(images)
    print(images.shape)

    # Label the data: images in the 'my_face_image' folder are my face and are labelled 0;
    # face images of other people (classmates, for example) are labelled 1
    labels = np.array([0 if label.endswith('my_face_image') else 1 for label in labels])

    return images, labels


if __name__ == '__main__':
    if len(sys.argv) != 1:
        print("Usage:%s path_name\r\n" % (sys.argv[0]))
    else:
        images, labels = load_dataset("face_image")
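As a quick, self-contained check of resize_image() above (not part of the original files): a deliberately non-square dummy image is padded with black to a square and then scaled, so the output is always IMAGE_SIZE x IMAGE_SIZE.

import numpy as np

from load_datasets import resize_image

# A 120x80 dummy "image": resize_image() pads it to 120x120 with black pixels, then scales it to 64x64
dummy = np.zeros((120, 80, 3), dtype=np.uint8)
print(resize_image(dummy).shape)  # -> (64, 64, 3)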
4. Build the model and train it
import random

import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.models import load_model
from keras import backend as K

from load_datasets import load_dataset, resize_image, IMAGE_SIZE


class Dataset:
    def __init__(self, path_name):
        # Training set
        self.train_images = None
        self.train_labels = None

        # Validation set
        self.valid_images = None
        self.valid_labels = None

        # Test set
        self.test_images = None
        self.test_labels = None

        # Dataset load path
        self.path_name = path_name

        # Dimension ordering used by the current backend
        self.input_shape = None

    # Load the dataset, split it following the cross-validation principle and do the related preprocessing
    def load(self, img_rows=IMAGE_SIZE, img_cols=IMAGE_SIZE, img_channels=3, nb_classes=2):
        # Load the dataset into memory
        images, labels = load_dataset(self.path_name)

        train_images, valid_images, train_labels, valid_labels = train_test_split(
            images, labels, test_size=0.3, random_state=random.randint(0, 100))
        _, test_images, _, test_labels = train_test_split(
            images, labels, test_size=0.5, random_state=random.randint(0, 100))

        # If the current dimension ordering is 'th', image data is ordered as channels, rows, cols; otherwise rows, cols, channels
        # This part reorganizes the training data according to the dimension ordering required by the Keras backend
        if K.image_dim_ordering() == 'th':
            train_images = train_images.reshape(train_images.shape[0], img_channels, img_rows, img_cols)
            valid_images = valid_images.reshape(valid_images.shape[0], img_channels, img_rows, img_cols)
            test_images = test_images.reshape(test_images.shape[0], img_channels, img_rows, img_cols)
            self.input_shape = (img_channels, img_rows, img_cols)
        else:
            train_images = train_images.reshape(train_images.shape[0], img_rows, img_cols, img_channels)
            valid_images = valid_images.reshape(valid_images.shape[0], img_rows, img_cols, img_channels)
            test_images = test_images.reshape(test_images.shape[0], img_rows, img_cols, img_channels)
            self.input_shape = (img_rows, img_cols, img_channels)

        # Print the sizes of the training, validation and test sets
        print(train_images.shape[0], 'train samples')
        print(valid_images.shape[0], 'valid samples')
        print(test_images.shape[0], 'test samples')

        # The model uses categorical_crossentropy as its loss function, so the labels must be split
        # according to the number of classes nb_classes and one-hot encoded (vectorized);
        # with only two classes here, the label data becomes two-dimensional after the transformation
        train_labels = np_utils.to_categorical(train_labels, nb_classes)
        valid_labels = np_utils.to_categorical(valid_labels, nb_classes)
        test_labels = np_utils.to_categorical(test_labels, nb_classes)

        # Convert the pixel data to floats so it can be normalized
        train_images = train_images.astype('float32')
        valid_images = valid_images.astype('float32')
        test_images = test_images.astype('float32')

        # Normalize: each pixel value of the images is scaled to the interval 0~1
        train_images /= 255
        valid_images /= 255
        test_images /= 255

        self.train_images = train_images
        self.valid_images = valid_images
        self.test_images = test_images
        self.train_labels = train_labels
        self.valid_labels = valid_labels
        self.test_labels = test_labels


# CNN network model class
class Model:
    def __init__(self):
        self.model = None

    # Build the model
    def build_model(self, dataset, nb_classes=2):
        # Construct an empty network model; it is a linear stacked model in which the individual
        # network layers are added sequentially, known as a sequential (linear stacked) model
        self.model = Sequential()

        # The following code adds the layers needed by the CNN one by one; each add() is one network layer
        self.model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=dataset.input_shape))  # 1  2D convolutional layer
        self.model.add(Activation('relu'))                 # 2  Activation layer

        self.model.add(Convolution2D(32, 3, 3))            # 3  2D convolutional layer
        self.model.add(Activation('relu'))                 # 4  Activation layer

        self.model.add(MaxPooling2D(pool_size=(2, 2)))     # 5  Pooling layer
        self.model.add(Dropout(0.25))                      # 6  Dropout layer

        self.model.add(Convolution2D(64, 3, 3, border_mode='same'))  # 7  2D convolutional layer
        self.model.add(Activation('relu'))                 # 8  Activation layer

        self.model.add(Convolution2D(64, 3, 3))            # 9  2D convolutional layer
        self.model.add(Activation('relu'))                 # 10 Activation layer

        self.model.add(MaxPooling2D(pool_size=(2, 2)))     # 11 Pooling layer
        self.model.add(Dropout(0.25))                      # 12 Dropout layer

        self.model.add(Flatten())                          # 13 Flatten layer
        self.model.add(Dense(512))                         # 14 Dense layer, also known as a fully connected layer
        self.model.add(Activation('relu'))                 # 15 Activation layer
        self.model.add(Dropout(0.5))                       # 16 Dropout layer
        self.model.add(Dense(nb_classes))                  # 17 Dense layer
        self.model.add(Activation('softmax'))              # 18 Classification layer, outputs the final result

        # Print a summary of the model
        self.model.summary()

    # Train the model
    def train(self, dataset, batch_size=20, nb_epoch=10, data_augmentation=True):
        # Train with an SGD + momentum optimizer; first create the optimizer object
        sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

        # Complete the actual model configuration
        self.model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

        # Without data augmentation; so-called augmentation creates new training data from the data we
        # provide by rotating, flipping, adding noise and so on, consciously increasing the size of the
        # training set and the amount of model training
        if not data_augmentation:
            self.model.fit(dataset.train_images,
                           dataset.train_labels,
                           batch_size=batch_size,
                           nb_epoch=nb_epoch,
                           validation_data=(dataset.valid_images, dataset.valid_labels),
                           shuffle=True)
        # Use real-time data augmentation
        else:
            # Define a data generator for augmentation; it returns a generator object datagen which,
            # every time it is called, produces one batch of data (generated sequentially), saving
            # memory; it is in fact a Python data generator
            datagen = ImageDataGenerator(
                featurewise_center=False,             # Whether to center the input data (make its mean 0)
                samplewise_center=False,              # Whether to make the mean of each input sample 0
                featurewise_std_normalization=False,  # Whether to standardize the data (divide by the dataset's standard deviation)
                samplewise_std_normalization=False,   # Whether to divide each sample by its own standard deviation
                zca_whitening=False,                  # Whether to apply ZCA whitening to the input data
                rotation_range=20,                    # Angle of random rotation during augmentation (range 0 to 180)
                width_shift_range=0.2,                # Amplitude of the horizontal shift during augmentation (fraction of the image width, between 0 and 1)
                height_shift_range=0.2,               # Same as above, but vertical
                horizontal_flip=True,                 # Whether to perform random horizontal flips
                vertical_flip=False)                  # Whether to perform random vertical flips

            # Compute statistics over the whole training set, needed for featurewise normalization, ZCA whitening, etc.
            datagen.fit(dataset.train_images)

            # Start training the model with the generator
            self.model.fit_generator(datagen.flow(dataset.train_images, dataset.train_labels, batch_size=batch_size),
                                     samples_per_epoch=dataset.train_images.shape[0],
                                     nb_epoch=nb_epoch,
                                     validation_data=(dataset.valid_images, dataset.valid_labels))

    MODEL_PATH = '.h5'

    def save_model(self, file_path=MODEL_PATH):
        self.model.save(file_path)

    def load_model(self, file_path=MODEL_PATH):
        self.model = load_model(file_path)

    def evaluate(self, dataset):
        score = self.model.evaluate(dataset.test_images, dataset.test_labels, verbose=1)
        print("%s: %.2f%%" % (self.model.metrics_names[1], score[1] * 100))

    # Recognize a face
    def face_predict(self, image):
        # Determine the dimension ordering according to the backend
        if K.image_dim_ordering() == 'th' and image.shape != (1, 3, IMAGE_SIZE, IMAGE_SIZE):
            image = resize_image(image)  # The size must be consistent with the training set: IMAGE_SIZE x IMAGE_SIZE
            image = image.reshape((1, 3, IMAGE_SIZE, IMAGE_SIZE))  # Unlike training, prediction here is for a single image
        elif K.image_dim_ordering() == 'tf' and image.shape != (1, IMAGE_SIZE, IMAGE_SIZE, 3):
            image = resize_image(image)
            image = image.reshape((1, IMAGE_SIZE, IMAGE_SIZE, 3))

        # Convert to float and normalize
        image = image.astype('float32')
        image /= 255

        # Give the probability that the input belongs to each class; since this is a binary classification,
        # this gives the probability that the input image belongs to class 0 and to class 1
        result = self.model.predict_proba(image)
        print('result:', result)

        # Give the class prediction: 0 or 1
        result = self.model.predict_classes(image)

        # Return the predicted class
        return result[0]


# Note: the three __main__ blocks below correspond to successive steps (test training, train and save,
# load and evaluate); keep only the one you need when running this file.
if __name__ == '__main__':
    # Train the model
    dataset = Dataset('face_image')
    dataset.load()

    model = Model()
    model.build_model(dataset)

    # Code to test the training function
    model.train(dataset)


if __name__ == '__main__':
    # Train and save the model
    dataset = Dataset('face_image')
    dataset.load()

    model = Model()
    model.build_model(dataset)
    model.train(dataset)
    model.save_model(file_path='model/.h5')


if __name__ == '__main__':
    # Load the model and evaluate it
    dataset = Dataset('face_image')
    dataset.load()

    # Evaluate the model
    model = Model()
    model.load_model(file_path='model/.h5')
    model.evaluate(dataset)
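The code above uses the Keras 1.x API (Convolution2D, border_mode, nb_epoch, samples_per_epoch, K.image_dim_ordering). If you are on Keras 2.x those names have been renamed; the following is only a rough sketch, under the assumption of a Keras 2.x installation, of how the same network could be written (Conv2D, padding= and epochs= are the Keras 2 equivalents):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation


def build_model_keras2(input_shape, nb_classes=2):
    # Same layer stack as build_model() above, expressed with the Keras 2.x layer names
    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape))
    model.add(Activation('relu'))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))
    return model

Training would then call model.fit(..., epochs=nb_epoch, ...) instead of nb_epoch=, and the backend check would use K.image_data_format() == 'channels_first' instead of K.image_dim_ordering() == 'th'.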
5. After the model has been trained, use OpenCV to read the video stream in real time, detect the position of the face, and then feed the face to the model for prediction.
# -*- coding: utf-8 -*-

import cv2
import sys

from face_train import Model

if __name__ == '__main__':
    if len(sys.argv) != 1:
        print("Usage:%s camera_id\r\n" % (sys.argv[0]))
        sys.exit(0)

    # Load the model
    model = Model()
    model.load_model(file_path='model/.h5')

    # Color of the rectangular border that frames the face
    color = (0, 255, 0)

    # Capture a live video stream from the specified camera
    cap = cv2.VideoCapture(0)

    # Local storage path of the face detection classifier
    cascade_path = "haarcascade_frontalface_alt2.xml"

    # Loop: detect and recognize faces
    while True:
        ret, frame = cap.read()  # Read one frame of video

        if ret is True:
            # Convert the image to grayscale to reduce the computational cost
            frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        else:
            continue

        # Load the face detection classifier
        cascade = cv2.CascadeClassifier(cascade_path)

        # Use the classifier to identify which regions are faces
        faceRects = cascade.detectMultiScale(frame_gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))
        if len(faceRects) > 0:
            for faceRect in faceRects:
                x, y, w, h = faceRect

                # Crop the face image and hand it to the model to recognize who it is
                image = frame[y - 10: y + h + 10, x - 10: x + w + 10]
                faceID = model.face_predict(image)

                # If it's "me"
                if faceID == 0:
                    cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, thickness=2)

                    # Text prompt showing who it is
                    cv2.putText(frame, 'zhuhaipeng',
                                (x + 30, y + 30),          # Coordinates
                                cv2.FONT_HERSHEY_SIMPLEX,  # Font
                                1,                         # Font size
                                (255, 0, 255),             # Color
                                2)                         # Line width of the text
                else:  # If it's not me
                    # Text prompt: unknown
                    cv2.putText(frame, 'Unknown people',
                                (x + 30, y + 30),          # Coordinates
                                cv2.FONT_HERSHEY_SIMPLEX,  # Font
                                1,                         # Font size
                                (255, 0, 255),             # Color
                                2)                         # Line width of the text

        cv2.imshow("identify me", frame)

        # Wait 10 milliseconds to see if there is a key press
        k = cv2.waitKey(10)
        # Exit the loop if 'q' is pressed
        if k & 0xFF == ord('q'):
            break

    # Release the camera and destroy all windows
    cap.release()
    cv2.destroyAllWindows()
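The loop above trusts whichever class the model outputs. A possible refinement, not part of the original code, is to reject low-confidence predictions; the sketch below assumes `model` is the loaded Model instance, `image` is the cropped face from the loop, and the TensorFlow ('tf') dimension ordering:

import numpy as np

from load_datasets import resize_image, IMAGE_SIZE

face = resize_image(image).astype('float32') / 255             # same preprocessing as face_predict()
face = face.reshape((1, IMAGE_SIZE, IMAGE_SIZE, 3))            # single image, 'tf' dimension ordering assumed
probs = model.model.predict(face)[0]                           # probabilities for class 0 and class 1
faceID = int(np.argmax(probs)) if probs.max() >= 0.9 else -1   # -1: not confident enough, treat as unknown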
At this point a simple face recognition project is complete. This project only performs a simple binary classification; you can extend it to multi-class classification on the same basis. If the recognition accuracy is low, you can try changing the network architecture, improving the data preprocessing, and so on. If you are interested in improving it, give it a try; a small sketch of the multi-class labelling idea follows below.
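For the multi-class extension, one possible starting point is to give every subfolder of face_image its own integer label instead of the hard-coded 0/1 test in load_datasets.py. A rough sketch (the folder names here are hypothetical):

import os


def build_label_map(root='face_image'):
    # e.g. {'classmate_a': 0, 'classmate_b': 1, 'my_face_image': 2}: one label per person folder
    persons = sorted(d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    return {name: idx for idx, name in enumerate(persons)}


# In load_dataset(), labels would then become label_map[os.path.basename(path_name)] for each image,
# and nb_classes in face_train.py would become len(label_map).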
This is the whole content of this article.