Handwriting OCR with kNN in OpenCV

Goal

In this chapter, you will learn to:

Use kNN to build a basic OCR application
Work with the digit and letter datasets that ship with OpenCV

OCR of handwritten digits

The goal is to build an application that can read handwritten digits. To do this, we need some train_data and test_data. The OpenCV git project includes an image (in opencv/samples/data/) that contains 5000 handwritten digits (500 per digit), each of which is a 20x20 image.

So, the first step is to split this image into 5000 (500*10) individual digits. Each digit is then flattened into a single row of 400 pixels. These rows, i.e. the intensity values of all the pixels, are the feature set. This is the simplest feature set that can be created. The first 250 samples of each digit are used as the training set train_data, and the remaining 250 samples as the test set test_data.

import cv2
import numpy as np
# File name assumed: the digits image in opencv/samples/data/; adjust the path as needed
img = cv2.imread('digits.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Now we split the image into 5000 cells, each 20x20 in size
cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
# Make it into a NumPy array: its shape will be (50, 100, 20, 20)
x = np.array(cells)
# Now we prepare the training data and test data
train = x[:, :50].reshape(-1, 400).astype(np.float32)  # Shape = (2500, 400)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)  # Shape = (2500, 400)
# Create labels for train and test data
k = np.arange(10)
train_labels = np.repeat(k, 250)[:, np.newaxis]
test_labels = train_labels.copy()
# Initiate kNN, train it on the training data, then test it on the test data with k=5
knn = cv2.ml.KNearest_create()
knn.train(train, cv2.ml.ROW_SAMPLE, train_labels)
ret, result, neighbours, dist = knn.findNearest(test, k=5)
# Now we check the accuracy of classification
# For that, compare the result with test_labels and count the matches
matches = result == test_labels
correct = np.count_nonzero(matches)
accuracy = correct * 100.0 / result.size
print(accuracy)  # 91.76

As can be seen, a basic OCR application for handwritten digits is now ready. The accuracy in this particular example is 91.76%.
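
The splitting and labelling steps can be checked without the image itself. The following is a minimal sketch, assuming the layout implied by the text: 50 rows of 100 cells, with each digit filling 5 consecutive rows of cells. The synthetic array stands in for the real grayscale image, and every pixel is set to the digit it belongs to so that the alignment between samples and labels can be verified.

```python
import numpy as np

# Synthetic stand-in for the grayscale image (assumption: 1000x2000 pixels,
# i.e. 50 rows x 100 columns of 20x20 cells, 5 rows of cells per digit).
gray = np.zeros((1000, 2000), dtype=np.uint8)
for d in range(10):
    gray[100 * d:100 * (d + 1), :] = d  # pixel value = digit of that band

# Same split as in the digit example
cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
x = np.array(cells)

train = x[:, :50].reshape(-1, 400).astype(np.float32)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)
train_labels = np.repeat(np.arange(10), 250)[:, np.newaxis]

print(x.shape)                        # (50, 100, 20, 20)
print(train.shape, test.shape)        # (2500, 400) (2500, 400)
print(train_labels.shape)             # (2500, 1)
# Each flattened sample lines up with the label generated for it
print(np.array_equal(train[:, 0], train_labels[:, 0]))  # True
```

Because the reshape walks the cells row by row, the first 250 training rows come from the first 5 rows of cells, which is why repeating each label 250 times gives the right order.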

Ways to improve accuracy:

One option is to add more training data, especially samples that were misclassified.
The other is to replace the algorithm with a better one.
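
To add misclassified samples to the training data, they first have to be located. A possible sketch is shown below. The small arrays are made-up stand-ins for the prediction and label arrays of the digit example, not real results.

```python
import numpy as np

# Made-up stand-ins: predictions from the classifier and the true labels
result = np.array([[0.], [1.], [7.], [3.]], dtype=np.float32)
test_labels = np.array([[0], [1], [1], [3]])

# Indices of the samples where prediction and label disagree
wrong = np.flatnonzero(result.ravel() != test_labels.ravel())

for i in wrong:
    print(f"sample {i}: predicted {int(result[i, 0])}, actual {int(test_labels[i, 0])}")
# sample 2: predicted 7, actual 1
```

The indices in wrong can then be used to pick the corresponding rows out of the test features for inspection or retraining.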

In this article, the training data is rebuilt every time the application starts. It is better to save it, so that next time the data can be read directly from a file and classification can start right away. This can be done with NumPy functions such as np.savez and np.load.

# Save the data
np.savez('knn_dight_data.npz', train=train, train_labels=train_labels)
# Now load the data (from the same file that was just saved)
with np.load('knn_dight_data.npz') as data:
    print(data.files)
    train = data['train']
    train_labels = data['train_labels']

On Windows, the data takes about 3.82 MB. Since only intensity values (uint8 data) are used as features, if memory is a concern you can convert the data to np.uint8 before saving it. In that case it takes only about 0.98 MB. After loading, it can be converted back to float32.

train_uint8 = train.astype(np.uint8)
train_labels_uint8 = train_labels.astype(np.uint8)
np.savez('knn_dight_data_int8.npz', train=train_uint8, train_labels=train_labels_uint8)
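
The size difference follows from the element width: float32 uses 4 bytes per value and uint8 uses 1. The sketch below illustrates this, together with the round trip back to float32. It uses random values in the 0 to 255 range as a stand-in for the real pixel data and an in-memory buffer in place of a file; both are assumptions made so the example is self-contained.

```python
import io
import numpy as np

# Random stand-in for the (2500, 400) training matrix of pixel intensities
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(2500, 400)).astype(np.float32)
train_uint8 = train.astype(np.uint8)

print(train.nbytes)        # 4000000  (2500 * 400 * 4 bytes)
print(train_uint8.nbytes)  # 1000000  (2500 * 400 * 1 byte)

# Save to an in-memory buffer, load again, and convert back to float32
buf = io.BytesIO()
np.savez(buf, train=train_uint8)
buf.seek(0)
with np.load(buf) as data:
    restored = data['train'].astype(np.float32)

# Integer intensities in 0..255 survive the uint8 round trip unchanged
print(np.array_equal(restored, train))  # True
```

The saved file is slightly larger than the raw array size because the .npz container adds headers.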

The trained model can also be used to predict a single digit:

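# Take one sample from the test set of the digit example
single_data = test[0].reshape(-1, 400)
single_label = test_labels[0]
ret, result, neighbours, dist = knn.findNearest(single_data, k=5)
print(result)  # predicted label (0 in the original run)
print(single_label)  # true label (0 in the original run)
print(result == single_label)  # True when the prediction matches the label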
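
For intuition about what a k-nearest-neighbour query does for one sample, here is a minimal brute-force sketch in NumPy. It is not OpenCV's implementation. The helper name knn_predict and the toy 2-D points are made up for illustration, and ties between classes are not handled in any special way.

```python
from collections import Counter

import numpy as np


def knn_predict(train, train_labels, sample, k=5):
    """Return the majority label among the k training rows closest to sample."""
    # Euclidean distance from the sample to every training row
    dists = np.linalg.norm(train - sample, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority vote over the labels of those neighbours
    votes = Counter(train_labels[nearest].ravel().tolist())
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters with labels 0 and 1
train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=np.float32)
train_labels = np.array([[0], [0], [0], [1], [1], [1]])

print(knn_predict(train, train_labels, np.array([0.5, 0.5], dtype=np.float32), k=3))  # 0
print(knn_predict(train, train_labels, np.array([5.5, 5.5], dtype=np.float32), k=3))  # 1
```

For the digit data the same idea applies, with 400-dimensional rows in place of the 2-D points.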

OCR of English letters

Next, we do the same for the English alphabet, but with slightly different data and features. Instead of an image, OpenCV provides a data file ( /data/samples/data/ ). If you open it, you will see 20,000 rows, which at first glance may look like garbage.

In fact, in each row the first column is a letter, which is the label. The next 16 numbers are its features, which were obtained from the UCI Machine Learning Repository. Detailed information about these features is available on the dataset's page in that repository.

There are 20,000 samples available. The first 10,000 are used as training samples and the remaining 10,000 as test samples. The letters must be converted to numbers (via their ASCII codes), because they cannot be used directly as labels.
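
The conversion can be sketched as a pair of small helpers. The names letter_to_index and index_to_letter are hypothetical and used only for illustration. The mapping subtracts the ASCII code of 'A', so 'A' becomes 0 and 'Z' becomes 25.

```python
def letter_to_index(ch):
    """Map an uppercase letter to 0..25 using its ASCII code."""
    return ord(ch) - ord('A')


def index_to_letter(idx):
    """Map a class index (int or float, as a classifier may return) back to a letter."""
    return chr(int(idx) + ord('A'))


print(letter_to_index('A'))   # 0
print(letter_to_index('T'))   # 19
print(index_to_letter(25.0))  # Z
```

The inverse helper is useful for turning predicted class numbers back into readable letters.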

import numpy as np
import cv2
# Load the data and convert the letters to numbers
# File name assumed: the letter data file that ships with the OpenCV samples; adjust the path as needed
data = np.loadtxt('letter-recognition.data', dtype='float32', delimiter=',', converters={0: lambda ch: ord(ch) - ord('A')})
# Split the dataset in two, with 10000 samples each for training and test sets
train, test = np.vsplit(data, 2)
# Split trainData and testData into features and responses
responses, trainData = np.hsplit(train, [1])
labels, testData = np.hsplit(test, [1])
# Initiate the kNN, classify, measure accuracy
knn = cv2.ml.KNearest_create()
knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)
ret, result, neighbours, dist = knn.findNearest(testData, k=5)
correct = np.count_nonzero(result == labels)
accuracy = correct * 100.0 / result.size
print(accuracy)  # 93.06

This gives an accuracy of 93.06%. Again, to improve accuracy, you can iteratively add misclassified samples to each class.

