As something of a beginner, it was quite rewarding to get hands-on experience with facial recognition.
Operating environment
- python3.7
- tensorflow 2.2.0
- opencv-python 4.4.0.40
- Keras 2.4.3
- numpy 1.18.5

The specific installation process and environment setup are omitted here; they are easy to find online.
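For reference, assuming pip is used for package management, the pinned versions above can be installed in one line:

```
pip install tensorflow==2.2.0 opencv-python==4.4.0.40 Keras==2.4.3 numpy==1.18.5
```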
Specific goal
Train a convolutional neural network on your own data so that it can successfully recognize you.
The implementation steps are shown below.
1. Face data acquisition and reading
1.1 Data acquisition
This dataset was collected by opening the camera with OpenCV and grabbing frames from it, supplemented with pictures of certain people gathered from external sources; information on 5 people was collected in total. (Faces are not cropped directly from the camera stream here, which makes it easier to process the externally collected pictures the same way. There are 800 pictures of each person, and each person's pictures are placed in the same folder, with filenames beginning with that person's initials.)
The relevant code is listed below:
```python
import cv2
import time

# Only still-image capture is implemented here
path = './images/'

# Face sampling, wrapped in a function; path is where the images are saved
def cy(path):
    # Open the laptop's built-in camera (parameter 0); for other cameras try 1, 2, ...
    cap = cv2.VideoCapture(0)
    # Tag an id for the face that is about to be recorded
    face_id = input('\n User face information entry, enter user name (preferably in English): \n')
    # count is used to count the number of samples
    count = 0
    while True:
        # Read a frame from the camera
        success, img = cap.read()
        count += 1
        # Save the frame to the target folder
        cv2.imwrite(path + str(face_id) + '.' + str(count) + '.jpg', img)
        # Display the frame
        cv2.imshow('image', img)
        # waitKey keeps the window responsive; press Esc (27) to exit early
        k = cv2.waitKey(1)
        if k == 27:
            break
        # Or exit once enough samples are collected; adjust the amount to the
        # actual situation (in practice 800 per person gave satisfactory results)
        elif count >= 500:
            time.sleep(2)
            success, img = cap.read()
            break
    # Turn off the camera and free up resources
    cap.release()
    cv2.destroyAllWindows()

# Call the function to sample faces
cy(path)
```
Next, extract the face region of each person; a face-detection cascade classifier is used here, and the cropped images are saved to a specific folder.
```python
import cv2
import os

# Process the images; the input is not converted to grayscale, to make it
# easier to handle the externally collected images the same way
CASE_PATH = "haarcascade_frontalface_default.xml"
RAW_IMAGE_DIR = 'images/'
DATASET_DIR = 'hh/'
path = 'D:\\pythonlx\\test\\images\\'

# Face classifier
face_cascade = cv2.CascadeClassifier(CASE_PATH)

# Crop the face region out of the image and save it
def save_faces(img, name, x, y, width, height):
    image = img[y:y + height, x:x + width]
    cv2.imwrite(name, image)

# List all files in the raw image folder
image_list = os.listdir(RAW_IMAGE_DIR)
for image_path in range(len(image_list)):
    gh = path + image_list[image_path]
    image = cv2.imread(gh)
    # gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(image,
                                          scaleFactor=1.2,
                                          minNeighbors=5,
                                          minSize=(5, 5))
    for (x, y, width, height) in faces:
        # Extend the crop upwards a little to keep more of the face
        save_faces(image, '%ss%d.jpg' % (DATASET_DIR, image_path + 1),
                   x, y - 30, width, height + 30)
```
1.2 Data reading
The image dataset is converted to a four-dimensional array and normalized, the labels are vectorized using one-hot encoding, and the data is randomly split into an 80% training set and a 20% test set.
```python
import os
import cv2
import numpy as np
from keras.utils import np_utils
from sklearn.model_selection import train_test_split

# Read the pictures
def read_image():
    data_x, data_y = [], []
    image_list = os.listdir('mine/')
    for i in range(len(image_list)):
        try:
            im = cv2.imread('mine/{}'.format(image_list[i]))
            im = resize_without_deformation(im)
            # uint8 keeps pixel values 0-255 intact
            data_x.append(np.asarray(im, dtype=np.uint8))
            # Derive the label from the filename prefix
            a = image_list[i].split('.')[0]
            if a == 's2':
                data_y.append(0)
            elif a == 's4':
                data_y.append(1)
            elif a == 's5':
                data_y.append(2)
            elif a == 's6':
                data_y.append(3)
            elif a == 's7':
                data_y.append(4)
        except IOError as e:
            print(e)
        except:
            print('Unknown Error!')
    return data_x, data_y

# Read all images and labels
raw_images, raw_labels = read_image()
# Samples per label:
# raw_labels.count(0)  # 583
# raw_labels.count(1)  # 621
# raw_labels.count(2)  # 717
# raw_labels.count(3)  # 765
# raw_labels.count(4)  # 698

# Convert to floating-point / integer arrays
raw_images = np.asarray(raw_images, dtype=np.float32)
raw_labels = np.asarray(raw_labels, dtype=np.int32)
# Convert the labels to one-hot encoding
ont_hot_labels = np_utils.to_categorical(raw_labels)
# Split the dataset: 80% training set, 20% test set
train_input, valid_input, train_output, valid_output = train_test_split(
    raw_images, ont_hot_labels, test_size=0.2)
# Normalize the data
train_input /= 255.0
valid_input /= 255.0
```
2. Image pre-processing
The collected picture samples may vary in size, so every picture must be resized to 100 x 100. To prevent deformation, the shorter side of the picture is padded with black so that the picture first matches the aspect ratio of the target size and is only then resized; this preserves the face information of the original image while avoiding distortion. Finally, histogram equalization is applied to the grayscale image to enhance detail and contrast and improve the recognition rate.
```python
import cv2

def resize_without_deformation(image, size=(100, 100)):
    height, width, _ = image.shape
    # Find the longest edge when the edges are of unequal length
    longest_edge = max(height, width)
    # Pad the border with 0 (black)
    top, bottom, left, right = 0, 0, 0, 0
    # Work out how many pixels to add to the short side to make it equal to the long side
    if height < longest_edge:
        height_diff = longest_edge - height
        top = int(height_diff // 2)
        bottom = height_diff - top
    elif width < longest_edge:
        width_diff = longest_edge - width
        left = int(width_diff // 2)
        right = width_diff - left
    # Add a border to the image; cv2.BORDER_CONSTANT fills it with the colour given by value
    image_with_border = cv2.copyMakeBorder(image, top, bottom, left, right,
                                           cv2.BORDER_CONSTANT, value=[0, 0, 0])
    resized_image = cv2.resize(image_with_border, size)
    # Convert the padded image to grayscale
    resize_image = cv2.cvtColor(resized_image, cv2.COLOR_BGR2GRAY)
    # Histogram equalization
    hist = cv2.equalizeHist(resize_image)
    img2 = hist.reshape((100, 100, 1))
    return img2
```
3. Model building and training
The convolutional network model is built according to the role each constituent layer plays in a convolutional neural network, and the parameters are then tuned to optimize the model and improve the training results.
Modeling framework:
Model parameters:
Building Convolutional Networks and Training:
Since the images in the dataset may be too homogeneous and show little variation, data augmentation is added and the model is trained using a generator.
```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Build the convolutional neural network as a Sequential model
model = Sequential()
# Convolutional layer: 32 kernels of size 3x3, stride 1,
# input shape (100, 100, 1) where 1 is the channel count, ReLU activation
model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='valid',
                 strides=(1, 1), input_shape=(100, 100, 1),
                 activation='relu'))                                    # 1
model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='valid',
                 strides=(1, 1), activation='relu'))                    # 2
# Pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))                               # 3
# Dropout layer
model.add(Dropout(0.25))                                                # 4
# Convolutional layers
model.add(Conv2D(64, (3, 3), padding='valid', strides=(1, 1),
                 activation='relu'))                                    # 5
model.add(Conv2D(64, (3, 3), padding='valid', strides=(1, 1),
                 activation='relu'))                                    # 6
# Pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))                               # 7
model.add(Dropout(0.25))                                                # 8
# Fully connected layers
model.add(Flatten())                                                    # 9
model.add(Dense(512, activation='relu'))                                # 10
model.add(Dropout(0.25))                                                # 11
# Output layer: one neuron per label class, sigmoid activation
model.add(Dense(len(ont_hot_labels[0]), activation='sigmoid'))          # 12

# Optimize the model: Adam optimizer, cross-entropy loss
# (an SGD optimizer with learning rate 1, decay 1e-6, momentum 0.8 and
# Nesterov momentum was also tried)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# Print the model parameters
model.summary()

# Define a data generator for augmentation; each call yields one batch
# (sequential generation, a Python generator), which saves memory
datagen = ImageDataGenerator(
    featurewise_center=False,             # set input mean to 0 over the dataset
    samplewise_center=False,              # set each sample's mean to 0
    featurewise_std_normalization=False,  # divide inputs by the dataset std
    samplewise_std_normalization=False,   # divide each sample by its own std
    zca_whitening=False,                  # apply ZCA whitening
    rotation_range=20,                    # random rotation angle (0 to 180)
    width_shift_range=0.2,                # horizontal shift (fraction of width, 0 to 1)
    height_shift_range=0.2,               # same as above, but vertical
    horizontal_flip=True,                 # random horizontal flips
    vertical_flip=False)                  # random vertical flips

# Compute statistics over the whole training set (needed for featurewise
# normalization, ZCA whitening, etc.)
datagen.fit(train_input)
# Train the model with the generator
history = model.fit_generator(
    datagen.flow(train_input, train_output, batch_size=50),
    epochs=10,
    validation_data=(valid_input, valid_output))
# Model validation
print(model.evaluate(valid_input, valid_output, verbose=2))

# Plot the accuracy curves
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()

# Save the model
MODEL_PATH = 'face_model.h5'
model.save(MODEL_PATH)
```
4. Identification and verification
The saved model is reloaded and the camera is turned on for recognition; it recognizes me as well as other people very accurately (the prediction output was changed to "self" versus "not self", since the identities of the other people cannot easily be disclosed). When running prediction, each frame obtained from the camera must be processed and converted to the appropriate format, otherwise the input shape will not match.
The prediction is the confidence for each label, and returning the column index of the maximum value gives the corresponding label, which tells you who it is.
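As a reference, here is a minimal sketch of such a recognition loop. It assumes the model was saved as face_model.h5 as above and reuses resize_without_deformation from section 2; the labels list is hypothetical and must match the order of the labels used during training.

```python
import cv2
import numpy as np
from keras.models import load_model

# Reload the saved model and the face detector
model = load_model('face_model.h5')
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Hypothetical mapping of the 5 trained classes to "me" / "not me";
# which index is "me" depends on your own training labels
labels = ['not me', 'not me', 'me', 'not me', 'not me']

cap = cv2.VideoCapture(0)
while True:
    success, frame = cap.read()
    if not success:
        break
    faces = face_cascade.detectMultiScale(frame, scaleFactor=1.2, minNeighbors=5)
    for (x, y, w, h) in faces:
        face = frame[y:y + h, x:x + w]
        # Same preprocessing as training, plus a batch dimension,
        # otherwise the shape will not match (None, 100, 100, 1)
        face = resize_without_deformation(face)
        face = np.asarray(face, dtype=np.float32).reshape(1, 100, 100, 1) / 255.0
        probabilities = model.predict(face)
        # Column index of the maximum confidence is the predicted label
        name = labels[np.argmax(probabilities)]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, name, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow('recognition', frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```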
The results are shown below.
- Recognizing yourself:
- Recognizing yourself versus someone who is not you:
5. Summary
In the first training attempt there was no detailed image processing and no data augmentation. Although the model accuracy was very high, recognition through the camera was sometimes inaccurate, so detailed image processing and data augmentation were added later; the prediction performance then met expectations, and the model reached a maximum accuracy of 99.4% with a loss of 0.015.
Regarding the dataset: because it was shot directly with the camera, some angles were not captured completely, or the shots were affected by environmental factors. During face recognition the features can therefore only be matched at the captured angles and will not be recognized from a different angle.
In this example, face localization directly uses the cascade classifier that ships with OpenCV, which can be found in the installed OpenCV folder; we did not write our own detection algorithm.
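As a side note, if copying the XML file around is inconvenient, opencv-python also exposes the bundled cascades through the cv2.data module; a small sketch (the print is just a load check):

```python
import cv2

# Load the bundled frontal-face cascade directly from the opencv-python package
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
face_cascade = cv2.CascadeClassifier(cascade_path)
print(face_cascade.empty())  # False means the cascade loaded successfully
```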
The above is based on my personal experience; I hope it can serve as a useful reference.