
Keras + OpenCV for Simple Face Recognition

A note up front: this article borrows from code and ideas others have written before, and some of the code is reused from their work.

Let's start with the overall approach:

1. First, use OpenCV to detect the face region in each frame.

2. Once a face region is detected, crop it out and save it as an image, to be used later as training data.

3. After acquiring enough data, build a CNN and train it.

4. After training completes, save the model.

5. Use OpenCV to read the video stream in real time, crop each detected face region into an image, and feed it to the model for prediction.

That is the basic idea behind this project.

1. The code that uses OpenCV to detect faces is shown below. It is also available in OpenCV's official documentation; the most important step is loading the XML file, which stores a pre-trained face detection model.

import cv2
 
def identify_face(window_name, camera_idx):
  cv2.namedWindow(window_name)
 
  # Video source, either from a saved video or directly from a USB camera
  cap = cv2.VideoCapture(camera_idx)
 
  # Tell OpenCV to use the face detection classifier
  classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
 
  # Color of the border to be drawn after a face is detected, in BGR format
  color = (0, 255, 0)
 
  while cap.isOpened(): # Loop while the camera device is initialized and open
    ok, frame = cap.read() # Read one frame of data
    if not ok:
      break
 
    # Convert the current frame to grayscale
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
 
 
    # Face detection; 1.2 is the image scale factor and 4 the minimum number of neighbors required
    faceRects = classfier.detectMultiScale(grey, scaleFactor=1.2, minNeighbors=4, minSize=(32, 32))
    if len(faceRects) > 0: # Greater than 0 means a face was detected
      for faceRect in faceRects: # Frame each face individually
        x, y, w, h = faceRect  # Coordinates of the upper-left corner of the face, plus its width and height
        cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 2)
 
    # Display the image
    cv2.imshow(window_name, frame)
    c = cv2.waitKey(10)
    if c & 0xFF == ord('q'):
      break
 
  # Release the camera and destroy all windows
  cap.release()
  cv2.destroyAllWindows()
 
 
if __name__ == '__main__':
  identify_face("identify face", 0)

2. After a face region is detected, crop the face region and save it as an image.

import cv2
from threading import Thread
 
 
 
def identify_face_and_store_face_image(window_name, camera_idx):
  cv2.namedWindow(window_name)
 
  # Video source, either from a saved video or directly from a USB camera
  cap = cv2.VideoCapture(camera_idx)
 
  # Tell OpenCV to use the face detection classifier
  classfier = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
 
  # Color of the border to be drawn after a face is detected, in BGR format
  color = (0, 255, 0)
 
 
  # Index used to name the saved pictures
  num = 0
  while cap.isOpened(): # Loop while the camera device is initialized and open
    ok, frame = cap.read() # Read one frame of data
    if not ok:
      break
 
    # Convert the current frame to grayscale
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
 
 
    # Face detection; 1.2 is the image scale factor and 4 the minimum number of neighbors required
    faceRects = classfier.detectMultiScale(grey, scaleFactor=1.2, minNeighbors=4, minSize=(32, 32))
    if len(faceRects) > 0: # Greater than 0 means a face was detected
      for faceRect in faceRects: # Frame each face individually
        x, y, w, h = faceRect  # Coordinates of the upper-left corner of the face, plus its width and height
 
        cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, 2)
 
        # store_face_image(frame, h, num, w, x, y)
        # Start a thread to store the face image
        t = Thread(target=store_face_image, args=(frame, h, num, w, x, y, ))
        t.start()
 
 
        # Show how many face shots have been captured so far
        font = cv2.FONT_HERSHEY_SIMPLEX # Font
        cv2.putText(frame, 'num %d' % num, (x + 30, y + 30), font, 1, (255, 0, 255), 2)
 
        num += 1
        if num > 1000: # Stop after saving 1,000 images
          break
 
 
    if num > 1000:
      break
 
    # Display the image
    cv2.imshow(window_name, frame)
    c = cv2.waitKey(10)
    if c & 0xFF == ord('q'):
      break
 
  # Release the camera and destroy all windows
  cap.release()
  cv2.destroyAllWindows()
 
 
def store_face_image(frame, h, num, w, x, y):
  # Crop the face out of the current frame and save it as an image
  img_name = '%s/%d.jpg' % ('face_image', num)
  image = frame[y - 10: y + h + 10, x - 10: x + w + 10]
  cv2.imwrite(img_name, image)
 
if __name__ == '__main__':
  identify_face_and_store_face_image("identify face", 0)
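Note that cv2.imwrite fails silently when the target directory does not exist, so it is worth creating the face_image folder before starting the capture loop; a minimal sketch:

import os

# Create the output directory for the captured face images if it is missing
os.makedirs('face_image', exist_ok=True)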

3. Process the saved data: label it, normalize it, and so on. The following code is in the load_datasets.py file.

import os
import sys
import numpy as np
import cv2
 
IMAGE_SIZE = 64
 
 
# Resize to the specified image size
def resize_image(image, height=IMAGE_SIZE, width=IMAGE_SIZE):
  top, bottom, left, right = (0, 0, 0, 0)
 
  # Get the image size
  h, w, _ = image.shape
 
  # For pictures whose height and width differ, find the longest edge
  longest_edge = max(h, w)
 
  # Work out how many pixels to add to the short edge to make it as long as the long edge
  if h < longest_edge:
    dh = longest_edge - h
    top = dh // 2
    bottom = dh - top
  elif w < longest_edge:
    dw = longest_edge - w
    left = dw // 2
    right = dw - left
  else:
    pass
 
  # RGB color
  BLACK = [0, 0, 0]
 
  # Add borders to the image; top/bottom/left/right are the border widths, and
  # cv2.BORDER_CONSTANT fills the border with the color given by value
  constant = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)
 
  # Resize the image and return it
  return cv2.resize(constant, (height, width))
 
 
# Read the training data
images = []
labels = []
 
 
def read_path(path_name):
  for dir_item in os.listdir(path_name):
    # Build a usable full path from the initial path and the directory entry
    full_path = os.path.abspath(os.path.join(path_name, dir_item))
 
    if os.path.isdir(full_path): # If it is a folder, recurse into it
      read_path(full_path)
    else: # It is a file
      if dir_item.endswith('.jpg'):
        image = cv2.imread(full_path)
        image = resize_image(image, IMAGE_SIZE, IMAGE_SIZE)
 
        # Uncomment this line to see the actual effect of the resize_image() call
        # cv2.imshow('resized image', image)
 
        images.append(image)
        labels.append(path_name)
 
  return images, labels
 
 
# Read the training data from the specified path
def load_dataset(path_name):
  images, labels = read_path(path_name)
 
  # Convert all input images into a four-dimensional array of size (number of images * IMAGE_SIZE * IMAGE_SIZE * 3)
  # I have 587 images in total and IMAGE_SIZE is 64, so for me the size is 587 * 64 * 64 * 3
  # Each image is 64 * 64 pixels with 3 color values per pixel (RGB)
  images = np.array(images)
  print(images.shape)
 
  # Label the data: the images in the 'my_face_image' folder are all my face, labeled 0;
  # faces of other people (classmates, for example) are labeled 1
  labels = np.array([0 if label.endswith('my_face_image') else 1 for label in labels])
 
  return images, labels
 
 
if __name__ == '__main__':
  if len(sys.argv) != 1:
    print("Usage:%s path_name\r\n" % (sys.argv[0]))
  else:
    images, labels = load_dataset("face_image")
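To convince yourself that the padding logic works, you can run resize_image on a deliberately non-square array; a quick hypothetical check:

import numpy as np
from load_datasets import resize_image

# A 100 x 60 dummy image: resize_image should pad it to 100 x 100 with black
# borders, then scale it down to 64 x 64
img = np.zeros((100, 60, 3), dtype=np.uint8)
print(resize_image(img).shape)  # expected: (64, 64, 3)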

4. Build the model and train it

import random
 
import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.models import load_model
from keras import backend as K
 
from load_datasets import load_dataset, resize_image, IMAGE_SIZE
 
 
class Dataset:
  def __init__(self, path_name):
    # Training set
    self.train_images = None
    self.train_labels = None
 
    # Validation sets
    self.valid_images = None
    self.valid_labels = None
 
    # Test sets
    self.test_images = None
    self.test_labels = None
 
    # dataset load path
    self.path_name = path_name
 
    # The order of dimensions used by the current library
    self.input_shape = None
 
  # Load the dataset and divide the dataset according to the principle of cross-validation and perform the related pre-processing work
  def load(self, img_rows=IMAGE_SIZE, img_cols=IMAGE_SIZE,
       img_channels=3, nb_classes=2):
    # Load dataset into memory
    images, labels = load_dataset(self.path_name)
 
    train_images, valid_images, train_labels, valid_labels = train_test_split(images, labels, test_size=0.3,
                                         random_state=random.randint(0, 100))
    _, test_images, _, test_labels = train_test_split(images, labels, test_size=0.5,
                             random_state=random.randint(0, 100))
 
    # If the current dimension order is 'th', image data is fed as channels,rows,cols; otherwise as rows,cols,channels
    # This part of the code reorganizes the training data into the dimension order required by the Keras backend
    if K.image_dim_ordering() == 'th':
      train_images = train_images.reshape(train_images.shape[0], img_channels, img_rows, img_cols)
      valid_images = valid_images.reshape(valid_images.shape[0], img_channels, img_rows, img_cols)
      test_images = test_images.reshape(test_images.shape[0], img_channels, img_rows, img_cols)
      self.input_shape = (img_channels, img_rows, img_cols)
    else:
      train_images = train_images.reshape(train_images.shape[0], img_rows, img_cols, img_channels)
      valid_images = valid_images.reshape(valid_images.shape[0], img_rows, img_cols, img_channels)
      test_images = test_images.reshape(test_images.shape[0], img_rows, img_cols, img_channels)
      self.input_shape = (img_rows, img_cols, img_channels)
 
    # Print the sizes of the training, validation and test sets
    print(train_images.shape[0], 'train samples')
    print(valid_images.shape[0], 'valid samples')
    print(test_images.shape[0], 'test samples')
 
    # Our model uses categorical_crossentropy as its loss function, so the labels must be
    # one-hot encoded according to the number of classes nb_classes; with only two classes
    # here, the label data becomes two-dimensional after the transformation
    train_labels = np_utils.to_categorical(train_labels, nb_classes)
    valid_labels = np_utils.to_categorical(valid_labels, nb_classes)
    test_labels = np_utils.to_categorical(test_labels, nb_classes)
 
    # Convert pixel data to float so it can be normalized
    train_images = train_images.astype('float32')
    valid_images = valid_images.astype('float32')
    test_images = test_images.astype('float32')
 
    # Normalize: each pixel value of the images is scaled to the interval 0~1
    train_images /= 255
    valid_images /= 255
    test_images /= 255
 
    self.train_images = train_images
    self.valid_images = valid_images
    self.test_images = test_images
    self.train_labels = train_labels
    self.valid_labels = valid_labels
    self.test_labels = test_labels
 
 
# CNN network model classes
class Model:
  def __init__(self):
    self.model = None
 
    # Modeling
 
  def build_model(self, dataset, nb_classes=2):
    # Construct an empty network model; it is a linear stacked model in which the individual
    # neural network layers are added sequentially, formally known as a sequential model
    self.model = Sequential()
 
    # The following code adds the layers needed for the CNN one by one; each add is one network layer
    self.model.add(Convolution2D(32, 3, 3, border_mode='same',
                   input_shape=dataset.input_shape)) # 1 2-dimensional convolutional layer
    self.model.add(Activation('relu')) # 2 Activation function layer
 
    self.model.add(Convolution2D(32, 3, 3)) # 3 2-dimensional convolutional layer
    self.model.add(Activation('relu')) # 4 Activation function layer
 
    self.model.add(MaxPooling2D(pool_size=(2, 2))) # 5 Pooling layer
    self.model.add(Dropout(0.25)) # 6 Dropout layer
 
    self.model.add(Convolution2D(64, 3, 3, border_mode='same')) # 7 2-dimensional convolutional layer
    self.model.add(Activation('relu')) # 8 Activation function layer
 
    self.model.add(Convolution2D(64, 3, 3)) # 9 2-dimensional convolutional layer
    self.model.add(Activation('relu')) # 10 Activation function layer
 
    self.model.add(MaxPooling2D(pool_size=(2, 2))) # 11 Pooling layer
    self.model.add(Dropout(0.25)) # 12 Dropout layer
 
    self.model.add(Flatten()) # 13 Flatten layer
    self.model.add(Dense(512)) # 14 Dense layer, also known as a fully connected layer
    self.model.add(Activation('relu')) # 15 Activation function layer
    self.model.add(Dropout(0.5)) # 16 Dropout layer
    self.model.add(Dense(nb_classes)) # 17 Dense layer
    self.model.add(Activation('softmax')) # 18 Classification layer, outputs the final result
 
    # Print a summary of the model
    self.model.summary()
 
  # Train the model
  def train(self, dataset, batch_size=20, nb_epoch=10, data_augmentation=True):
    sgd = SGD(lr=0.01, decay=1e-6,
         momentum=0.9, nesterov=True) # Train with an SGD+momentum optimizer; first create the optimizer object
    self.model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy']) # The actual model configuration work is done here
 
    # Without data augmentation: so-called augmentation creates new training data from the data
    # we supply, by rotating, flipping, adding noise and so on, deliberately increasing the size
    # of the training set and the amount of training the model receives
    if not data_augmentation:
      self.model.fit(dataset.train_images,
              dataset.train_labels,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(dataset.valid_images, dataset.valid_labels),
              shuffle=True)
    # Use real-time data augmentation
    else:
      # Define a data generator for data augmentation; it returns a generator object datagen which,
      # each time it is invoked, yields one batch of data (generated sequentially), saving memory;
      # it is in fact a Python data generator
      datagen = ImageDataGenerator(
        featurewise_center=False, # Whether to center the input data (make its mean 0)
        samplewise_center=False, # Whether to make the mean of each individual sample 0
        featurewise_std_normalization=False, # Whether to standardize the data (divide inputs by the dataset's standard deviation)
        samplewise_std_normalization=False, # Whether to divide each sample by its own standard deviation
        zca_whitening=False, # Whether to apply ZCA whitening to the input data
        rotation_range=20, # Angle by which images are randomly rotated during augmentation (range 0 to 180)
        width_shift_range=0.2, # Magnitude of horizontal shifts during augmentation (as a fraction of image width, a float between 0 and 1)
        height_shift_range=0.2, # Same as above, but vertical
        horizontal_flip=True, # Whether to perform random horizontal flips
        vertical_flip=False) # Whether to perform random vertical flips
 
      # Compute statistics over the whole training set, needed for feature-wise normalization, ZCA whitening, etc.
      datagen.fit(dataset.train_images)
 
      # Start training the model with the generator
      self.model.fit_generator(datagen.flow(dataset.train_images, dataset.train_labels,
                         batch_size=batch_size),
                   samples_per_epoch=dataset.train_images.shape[0],
                   nb_epoch=nb_epoch,
                   validation_data=(dataset.valid_images, dataset.valid_labels))
 
  MODEL_PATH = '.h5'
 
  def save_model(self, file_path=MODEL_PATH):
    self.model.save(file_path)
 
  def load_model(self, file_path=MODEL_PATH):
    self.model = load_model(file_path)
 
  def evaluate(self, dataset):
    score = self.model.evaluate(dataset.test_images, dataset.test_labels, verbose=1)
    print("%s: %.2f%%" % (self.model.metrics_names[1], score[1] * 100))
 
  # Recognize a face
  def face_predict(self, image):
    # Again determine the dimension order according to the backend in use
    if K.image_dim_ordering() == 'th' and image.shape != (1, 3, IMAGE_SIZE, IMAGE_SIZE):
      image = resize_image(image) # The size must match the training set: IMAGE_SIZE x IMAGE_SIZE
      image = image.reshape((1, 3, IMAGE_SIZE, IMAGE_SIZE)) # Unlike during training, this prediction is for a single image
    elif K.image_dim_ordering() == 'tf' and image.shape != (1, IMAGE_SIZE, IMAGE_SIZE, 3):
      image = resize_image(image)
      image = image.reshape((1, IMAGE_SIZE, IMAGE_SIZE, 3))
 
    # Convert to float and normalize
    image = image.astype('float32')
    image /= 255
 
    # Output the probability that the input belongs to each class; since this is binary
    # classification, this gives the probability of the image belonging to class 0 and to class 1
    result = self.model.predict_proba(image)
    print('result:', result)
 
    # Output the class prediction: 0 or 1
    result = self.model.predict_classes(image)
 
    # Return the predicted class
    return result[0]
 
 
# The three __main__ blocks below are alternative entry points; keep only one of them
# in the file at a time.
if __name__ == '__main__': # Train the model
  dataset = Dataset('face_image')
  dataset.load()
 
  model = Model()
  model.build_model(dataset)
 
  # Code to test the training function
  model.train(dataset)
 
if __name__ == '__main__': # Train and save the model
  dataset = Dataset('face_image')
  dataset.load()
 
  model = Model()
  model.build_model(dataset)
  model.train(dataset)
  model.save_model(file_path='model/.h5')
 
if __name__ == '__main__':  # Load the model and evaluate it
  dataset = Dataset('face_image')
  dataset.load()
 
  # Evaluate the model
  model = Model()
  model.load_model(file_path='model/.h5')
  model.evaluate(dataset)
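Before wiring the model into the live video loop, it can help to sanity-check face_predict on a single saved image. A minimal sketch, assuming a previously saved crop such as face_image/0.jpg (a hypothetical file name) and the model path used above:

import cv2
from face_train import Model

model = Model()
model.load_model(file_path='model/.h5')

# One previously captured face crop (hypothetical file name)
img = cv2.imread('face_image/0.jpg')
print(model.face_predict(img))  # 0 means "me", 1 means anyone else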

5. After training the model, use OpenCV to read the video stream in real time, detect where the face is, and feed that face into the model for prediction.

# -*- coding: utf-8 -*-
 
import cv2
import sys
from face_train import Model
 
if __name__ == '__main__':
  if len(sys.argv) != 1:
    print("Usage:%s camera_id\r\n" % (sys.argv[0]))
    sys.exit(0)
 
  # Load the model
  model = Model()
  model.load_model(file_path='model/.h5')
 
  # Color of the rectangular border that frames the face
  color = (0, 255, 0)
 
  # Capture a live video stream from the specified camera
  cap = cv2.VideoCapture(0)
 
  # Local path of the face detection classifier
  cascade_path = "haarcascade_frontalface_alt2.xml"
 
  # Loop: detect and recognize faces
  while True:
    ret, frame = cap.read() # Read one frame of video
 
    if ret is True:
 
      # Convert the image to grayscale to reduce computational complexity
      frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    else:
      continue
    # Load the face detection classifier
    cascade = cv2.CascadeClassifier(cascade_path)
 
    # Use the classifier to determine which regions are faces
    faceRects = cascade.detectMultiScale(frame_gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))
    if len(faceRects) > 0:
      for faceRect in faceRects:
        x, y, w, h = faceRect
 
        # Crop the face image and hand it to the model to recognize who it is
        image = frame[y - 10: y + h + 10, x - 10: x + w + 10]
        faceID = model.face_predict(image)
 
        # If it is "me"
        if faceID == 0:
          cv2.rectangle(frame, (x - 10, y - 10), (x + w + 10, y + h + 10), color, thickness=2)
 
          # Text label saying who it is
          cv2.putText(frame, 'zhuhaipeng',
                (x + 30, y + 30), # Coordinates
                cv2.FONT_HERSHEY_SIMPLEX, # Font
                1, # Font size
                (255, 0, 255), # Color
                2) # Line width of the text
        else: # If it is not me
          # Text label saying unknown
          cv2.putText(frame, 'Unknown people ',
                (x + 30, y + 30), # Coordinates
                cv2.FONT_HERSHEY_SIMPLEX, # Font
                1, # Font size
                (255, 0, 255), # Color
                2) # Line width of the text
 
    cv2.imshow("identify me", frame)
 
    # Wait 10 milliseconds to see whether a key was pressed
    k = cv2.waitKey(10)
    # Exit the loop if q was pressed
    if k & 0xFF == ord('q'):
      break
 
  # Release the camera and destroy all windows
  cap.release()
  cv2.destroyAllWindows()
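One caveat about the frame[y - 10: y + h + 10, x - 10: x + w + 10] crop used above and in step 2: near the image border, y - 10 or x - 10 can go negative, and a negative start index makes NumPy slice from the end of the array, silently producing an empty or wrong crop. A minimal sketch of a clamped crop (a hypothetical helper, not part of the original code):

# Hypothetical helper: clamp the crop window so it stays inside the frame
def safe_crop(frame, x, y, w, h, margin=10):
  top = max(y - margin, 0)
  left = max(x - margin, 0)
  bottom = min(y + h + margin, frame.shape[0])
  right = min(x + w + margin, frame.shape[1])
  return frame[top:bottom, left:right]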

At this point a simple face recognition project is complete. This project only does simple binary classification; you can extend it to multi-class classification on the same foundation. If the recognition accuracy is low, you can try changing the network architecture, preprocessing the data differently, and so on. If you are interested in improving it, give it a try.
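As a starting point for the multi-class extension, the labeling in load_datasets.py could map each sub-folder of face_image to its own class index instead of the hard-coded 0/1 split; a sketch (the folder names are hypothetical):

import os
import numpy as np

# One sub-folder per person, e.g. face_image/alice, face_image/bob, ...
people = sorted(os.listdir('face_image'))
label_map = {name: idx for idx, name in enumerate(people)}

# In load_dataset, replace the 0/1 rule with a folder-name lookup:
# labels = np.array([label_map[os.path.basename(label)] for label in labels])
# and pass nb_classes=len(people) to Dataset.load() and Model.build_model().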

This is the whole content of this article.