Refresher
(1) Gradient: the gradient is a vector pointing in the direction in which the directional derivative of a function at a point is maximized; in other words, the function changes fastest in that direction, and the magnitude (modulus) of the gradient gives that greatest rate of change (the formulas are written out just after this refresher).
(2) Linear filtering: linear filtering is arguably the most basic method of image processing; it lets us process an image to produce many different effects.
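As a compact restatement (standard notation, not taken from the original figures): for an image $I(x, y)$, the gradient, its magnitude, and its direction are

\[
\nabla I = \left(\frac{\partial I}{\partial x},\ \frac{\partial I}{\partial y}\right) = (I_x,\ I_y),
\qquad
|\nabla I| = \sqrt{I_x^2 + I_y^2},
\qquad
\theta = \arctan\frac{I_y}{I_x}
\]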
I. Edge Extraction
1、What is an edge
An edge is a part of an image where the local brightness changes significantly. The gray-scale profile of such a region can generally be viewed as a step: within a very small buffer zone, one gray value changes sharply into another, very different gray value.
Edges can be positive or negative, just as derivatives can: a dark-to-light transition is positive, and a light-to-dark transition is negative.
Algorithms for finding edge magnitude include the Sobel, Roberts, Prewitt, Laplacian, and Canny operators. The Canny operator outperforms the others, but it is somewhat more troublesome to implement.
2、What is edge extraction
Edge extraction is a stage of digital image processing that deals with the contours of a picture. A boundary is defined wherever the gray value changes sharply. Such points are also described as inflection points, i.e., points where a function changes concavity and where the second-order derivative is zero. The first-order derivative is not used for this, because a zero first-order derivative indicates an extremum instead.
Edge extraction: the basic idea of edge detection is first to use an edge-enhancement operator to highlight the local edges in the image, then define an "edge strength" for each pixel and extract the set of edge points by thresholding. Because of noise and blurring, the detected boundaries may be widened or broken at some points. Boundary detection therefore consists of two basic elements:
(1) Use an edge operator to extract the set of edge points that reflect the gray-scale changes.
(2) Eliminate spurious boundary points from the set, fill in boundary discontinuities, and connect the edges into complete lines.
Edge definition: the place where the rate of change of the image gray level is greatest (where the gray value changes most drastically); such edges arise from discontinuities in the image gray level. It is generally accepted that edge extraction preserves the regions of the image where the gray scale changes drastically. Mathematically, the most intuitive tool is the derivative (for digital images, the difference); in signal-processing terms it can also be viewed as a high-pass filter, i.e., one that retains the high-frequency signal.
Edge information consists of two aspects:
1. Coordinates of pixels
2. Orientation of the edges
(1) Edge detection
Edge detection: the measurement, detection, and localization of gray-scale changes in an image. The next two images show the effect of edge detection.
Applications of edge detection: semantic segmentation and instance segmentation (each illustrated with an example figure in the original post).
(2) High-frequency signals & low-frequency signals
The low-frequency and high-frequency signals in an image are also called the low-frequency component and the high-frequency component.
In simpler terms, the high-frequency component of an image refers to the places where the intensity (luminance/grayscale) of the image varies dramatically, i.e., the edges (contours);
The low-frequency component of an image refers to where the intensity (luminance/gray scale) varies smoothly, i.e., in large blocks of uniform color.
The human eye is more sensitive to high frequency signals in images.
(3) Principles and steps of edge detection
1) Filtering: edge-detection algorithms are mainly based on the first- and second-order derivatives of image intensity, but derivatives are sensitive to noise, so a filter must be used to improve the noise robustness of the edge detector. Gaussian filtering is the most common choice.
2) Enhancement: edges are enhanced by determining how much the intensity changes in the neighborhood of each point, which highlights points whose neighborhood intensity values change significantly. In a concrete implementation, this is done by computing the gradient magnitude.
3) Detection: after enhancement, many points in the image still have large gradient magnitudes, and in a particular application not all of them are the edge points we want, so some method is needed to screen them out. In practical engineering, the common method is thresholding.
Detection principle: edge detection rests on the fact that pixel values "jump", i.e., change significantly, at edges. If the first-order derivative is taken across an edge, an extremum appears; and where the first-order derivative has an extremum, the second-order derivative is zero. Edge detection is possible on this basis.
Take the first-order derivative across the edge and you will see an extremum appear:
What happens if you take the second-order derivative across the edge?
From the above example we can infer that edges can be detected by locating pixels whose gradient magnitude is larger than that of their neighbors (or, more generally, larger than a threshold). The analysis also shows that second-order derivatives can be used to detect edges. Since an image is two-dimensional, we need to take the derivative in both directions.
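A minimal numeric sketch of this behavior (the toy 1-D signal and variable names are mine, not from the article):

import numpy as np

# A step edge in a 1-D signal: the first-order difference peaks at the edge,
# and the second-order difference crosses zero exactly there.
signal = np.array([10., 10., 10., 80., 80., 80.])
print(np.diff(signal))       # [ 0.  0. 70.  0.  0.]  -> extremum at the step
print(np.diff(signal, n=2))  # [ 0. 70. -70.  0.]     -> zero-crossing at the step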
(4) Image Sharpening
Image sharpening is the process of compensating the contours of an image: it enhances the edges and the parts of the image where the gray level jumps, making the image appear crisper.
Image sharpening is used to highlight the edges and contours of objects in an image, or the features of certain linear targets. Because this kind of filtering increases the contrast between a feature's edge and the surrounding pixels, it is also called edge enhancement.
Image sharpening often uses a Laplacian kernel:
The template on the right of the figure computes the second-order derivative from the pixel values in the four 90-degree directions: up, down, left, and right. To include the diagonal pixels in the calculation as well, use the template on the left.
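The figure itself is not reproduced here; as a sketch of the two templates it most likely showed (these are the standard Laplacian kernels, so treat the exact sign convention as an assumption on my part):

import numpy as np

# 4-neighborhood Laplacian: uses only the up/down/left/right pixels
laplace_4 = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

# 8-neighborhood Laplacian: the diagonal pixels are included as well
laplace_8 = np.array([[1,  1, 1],
                      [1, -8, 1],
                      [1,  1, 1]])

# Typical sharpening then subtracts the Laplacian response from the image:
# sharpened = image - convolve(image, laplace_4)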
(5) Image Smoothing
Image smoothing is an image-processing method for highlighting the wide areas, low-frequency components, and main body of an image, or for suppressing image noise and interfering high-frequency components. Its purpose is to make the image brightness vary gradually and reduce abrupt gradients, improving image quality.
If you use Gx to convolve the image below, you will get large values at the central black/white boundary.
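The figure is not reproduced, but the effect is easy to recreate on a toy image (a sketch; the 5x6 test image is mine, and SciPy is assumed to be available):

import numpy as np
from scipy.ndimage import convolve

gx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]])         # the Sobel Gx template

img = np.zeros((5, 6))
img[:, 3:] = 255                    # left half black, right half white
response = convolve(img, gx, mode='nearest')
print(response)                     # large magnitudes only around the central boundary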
II. Sobel operator
The Sobel operator is a classic edge-detection operator based on first-order derivatives. Because it incorporates an operation similar to local averaging, it smooths noise and eliminates its influence well. The Sobel operator also weights pixels by their position relative to the center, so it performs better than the Prewitt operator.
The Sobel operator consists of two 3x3 matrices, a horizontal and a vertical template, which are convolved with the image to obtain approximations of the luminance differences in the horizontal and vertical directions, respectively. In practice, the following two templates are commonly used to detect image edges.
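The two templates are not reproduced in the text above, but they are the standard Sobel kernels, and they match the ones used in canny_detail.py later in this article:

import numpy as np

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])    # horizontal differences (responds to vertical edges)

sobel_y = np.array([[ 1,  2,  1],
                    [ 0,  0,  0],
                    [-1, -2, -1]])  # vertical differences (responds to horizontal edges)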
The disadvantage is that the Sobel operator does not strictly separate the subject of the image from the background; in other words, it does not process the image based on image gray-scale characteristics. And since the Sobel operator does not strictly simulate the physiological features of human vision, the extracted contours are sometimes unsatisfactory.
Noise and edges are both places where the gray scale changes drastically, so they look essentially the same to a derivative operator; filtering therefore cannot be overdone, or the edges themselves will be weakened.
The Prewitt operator is a first-order differential operator for edge detection. It uses the gray-level differences between a pixel's upper/lower and left/right neighbors, which reach an extremum at edges, to detect edges; it removes some pseudo-edges and has a smoothing effect on noise. It works by convolving the image with two directional templates in the image space, one detecting horizontal edges and the other detecting vertical edges. The principle is the same as the Sobel operator's.
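For comparison, the standard Prewitt templates (a sketch; note the uniform weights where Sobel uses a weight of 2 for the center row or column):

import numpy as np

prewitt_x = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]])   # responds to vertical edges

prewitt_y = np.array([[ 1,  1,  1],
                      [ 0,  0,  0],
                      [-1, -1, -1]])  # responds to horizontal edges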
III. Canny Edge Detection Algorithm
Canny is currently the best edge-detection algorithm among traditional methods (deep-learning approaches excluded). Its goal is to find an optimal edge, where "optimal" means:
1、Good detection: the algorithm should mark as many of the actual edges in the image as possible.
2、Good localization: the marked edges should be as close as possible to the actual edges in the image.
3、Minimal response: each edge in the image should be marked only once.
1、Algorithm steps
1. Convert the image to grayscale.
2. Apply Gaussian filtering to the image: the gray values of the pixel being filtered and of its neighbors are weighted and averaged according to a parameterized rule, which effectively filters out high-frequency noise superimposed on the ideal image.
3. Detect the horizontal, vertical and diagonal edges in the image (e.g., with the Prewitt or Sobel operator).
4. Apply non-maximum suppression to the gradient magnitudes.
5. Detect and connect edges with a double-threshold algorithm.
2、Gaussian smoothing
A Gaussian kernel falls off as a Gaussian distribution both horizontally and vertically, placing more weight on the center pixel being smoothed; compared to mean filtering, this gives a better smoothing effect.
It is important to understand that the choice of Gaussian kernel size affects the performance of the Canny detector: the larger the kernel, the less sensitive the detector is to noise, but the localization error of the edges also increases slightly. A 5x5 kernel is generally a good trade-off.
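A minimal sketch of building such a kernel (the function name and default values are mine; canny_detail.py below derives the kernel size from sigma instead of fixing it):

import numpy as np

def gaussian_kernel(size=5, sigma=1.4):
    # offsets of each cell from the kernel center
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return kernel / kernel.sum()   # normalize so the weights sum to 1

print(gaussian_kernel(5, 1.4).round(3))  # weights peak at the center pixel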
3、Non-maximum suppression
Non-maximum suppression, abbreviated NMS, searches for local maxima and suppresses everything that is not a maximum. The concrete implementation of NMS differs between applications, but the idea is the same.
Why do we need non-maximum suppression? Take object detection as an example: detection produces a large number of candidate boxes around the same target, and these boxes may overlap one another. Non-maximum suppression is then used to keep the best bounding box for each target and eliminate the redundant ones.
For overlapping candidate boxes, compute their overlap and delete a box if the overlap exceeds a specified threshold; keep it if the overlap is below the threshold. Non-overlapping candidate boxes are always kept.
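A generic greedy-NMS sketch for detection boxes (a common formulation, not code from this article; boxes are [x1, y1, x2, y2] arrays and overlap is measured by IoU):

import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    order = np.argsort(scores)[::-1]        # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]                        # the current best-scoring box
        keep.append(i)
        # intersection of the best box with every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                 * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_threshold]  # suppress boxes that overlap too much
    return keep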
In edge detection, non-maximum suppression means finding the local maximum of the gradient magnitude at each pixel and setting the gray value of every non-maximum point to 0, which eliminates a large fraction of non-edge points.
1) Compare the gradient strength of the current pixel with two pixels along the positive and negative gradient directions.
2) If the current pixel has the largest gradient strength compared to the other two pixels, the pixel point is retained as an edge point, otherwise the pixel point is suppressed (gray value is set to 0).
dTmp1 and dTmp2 are two virtual pixels, also called sub-pixel points, whose values are derived from the real pixels by bilinear interpolation.
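A sketch of the interpolation for one case, a gradient direction between 0 and 45 degrees (it mirrors the `angle[i, j] > 0` branch of canny_detail.py below; the dummy data is mine):

import numpy as np

temp = np.random.rand(3, 3) * 100  # 3x3 neighborhood of gradient magnitudes (dummy data)
t = 0.4                            # tan(theta) = gy / gx, here with 0 < t < 1

# each virtual pixel lies between two real neighbors along the gradient line
dTmp1 = temp[0, 2] * t + temp[1, 2] * (1 - t)  # between the right and top-right neighbors
dTmp2 = temp[2, 0] * t + temp[1, 0] * (1 - t)  # between the left and bottom-left neighbors

is_edge = temp[1, 1] > dTmp1 and temp[1, 1] > dTmp2  # keep the center only if it is the maximum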
4、Double-threshold detection
Detection with a double-threshold algorithm (hysteresis thresholding): after non-maximum suppression, an image is obtained in which all non-edge points have a gray value of 0; local gray-level maxima that may be edges can be set to a gray value of 128 (or some other value). This detection result still contains many false edges caused by noise and other factors, so further processing is needed.
- If the gradient value of an edge pixel is higher than the high threshold, it is labeled a strong edge pixel;
- If the gradient value of an edge pixel is less than the high threshold but greater than the low threshold, it is labeled a weak edge pixel;
- If the gradient value of an edge pixel is less than the low threshold, it is suppressed.
Above the high threshold is a strong edge; below the low threshold is not an edge; in between are weak edges. The choice of thresholds depends on the content of the input image: there is no universally correct value, and good thresholds can only be found by tuning.
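One common heuristic, and exactly what canny_detail.py below does, is to derive both thresholds from the gradient statistics of the image itself (`grad_mag` here stands in for the real gradient-magnitude image):

import numpy as np

grad_mag = np.random.rand(64, 64) * 255    # placeholder for the real gradient magnitudes
low_threshold = grad_mag.mean() * 0.5      # low threshold from the mean gradient magnitude
high_threshold = low_threshold * 3         # high threshold = 3x the low one, a frequent rule of thumb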
Suppressing isolated low-threshold points: so far, pixels classified as strong edges have been accepted as edges because they were extracted from real edges in the image. Weak edge pixels are more debatable, since they may come from real edges but may also be caused by noise or color changes.
Weak edges caused by the latter should be suppressed in order to obtain accurate results:
- Typically, weak edge pixels caused by true edges are connected to strong edge pixels, while noise responses are unconnected.
- To track the edge connections, look at each weak edge pixel and its 8 neighboring pixels: as long as one of them is a strong edge pixel, the weak edge point is retained as a true edge, as sketched below.
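A minimal version of that check (a sketch with names of my choosing; the full stack-based edge tracking is in canny_detail.py below):

import numpy as np

def has_strong_neighbor(grad_img, i, j, high_threshold):
    # look at the 3x3 window around the weak pixel (i, j); the weak pixel
    # survives only if at least one neighbor reaches the high threshold
    window = grad_img[i - 1:i + 2, j - 1:j + 2]
    return bool((window >= high_threshold).any())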
IV. Relevant code
1、canny_detail.py
import numpy as np
import matplotlib.pyplot as plt
import math

if __name__ == '__main__':
    pic_path = ''  # path to the input image (left blank in the original)
    img = plt.imread(pic_path)
    if pic_path[-4:] == '.png':  # .png images are stored as floats from 0 to 1, so scale to 255 before calculating
        img = img * 255  # still a floating-point array
    img = img.mean(axis=-1)  # taking the mean over the channels gives a grayscale image

    # 1. Gaussian smoothing
    # sigma = 1.52  # standard deviation of the Gaussian kernel, adjustable
    sigma = 0.5  # standard deviation of the Gaussian kernel, adjustable
    dim = int(np.round(6 * sigma + 1))  # derive the kernel dimension from the standard deviation
    if dim % 2 == 0:  # the dimension is best odd; add one if it is not
        dim += 1
    Gaussian_filter = np.zeros([dim, dim])  # stores the Gaussian kernel; an array, not a list
    tmp = [i - dim // 2 for i in range(dim)]  # generate the sequence of offsets from the center
    n1 = 1 / (2 * math.pi * sigma ** 2)  # compute the Gaussian kernel
    n2 = -1 / (2 * sigma ** 2)
    for i in range(dim):
        for j in range(dim):
            Gaussian_filter[i, j] = n1 * math.exp(n2 * (tmp[i] ** 2 + tmp[j] ** 2))
    Gaussian_filter = Gaussian_filter / Gaussian_filter.sum()
    dx, dy = img.shape
    img_new = np.zeros(img.shape)  # stores the smoothed image; zeros yields floating-point data
    tmp = dim // 2
    img_pad = np.pad(img, ((tmp, tmp), (tmp, tmp)), 'constant')  # edge padding
    for i in range(dx):
        for j in range(dy):
            img_new[i, j] = np.sum(img_pad[i:i + dim, j:j + dim] * Gaussian_filter)
    plt.figure(1)
    plt.imshow(img_new.astype(np.uint8), cmap='gray')  # img_new holds floats up to 255, so cast to uint8; gray colormap
    plt.axis('off')

    # 2. Finding the gradient. These two Sobel matrices detect horizontal, vertical and diagonal edges
    sobel_kernel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    sobel_kernel_y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])
    img_tidu_x = np.zeros(img_new.shape)  # stores the gradient images
    img_tidu_y = np.zeros([dx, dy])
    img_tidu = np.zeros(img_new.shape)
    img_pad = np.pad(img_new, ((1, 1), (1, 1)), 'constant')  # pad by 1 to match the 3x3 kernels above
    for i in range(dx):
        for j in range(dy):
            img_tidu_x[i, j] = np.sum(img_pad[i:i + 3, j:j + 3] * sobel_kernel_x)  # x direction
            img_tidu_y[i, j] = np.sum(img_pad[i:i + 3, j:j + 3] * sobel_kernel_y)  # y direction
            img_tidu[i, j] = np.sqrt(img_tidu_x[i, j] ** 2 + img_tidu_y[i, j] ** 2)
    img_tidu_x[img_tidu_x == 0] = 0.00000001  # avoid division by zero
    angle = img_tidu_y / img_tidu_x
    plt.figure(2)
    plt.imshow(img_tidu.astype(np.uint8), cmap='gray')
    plt.axis('off')

    # 3. Non-maximum suppression
    img_yizhi = np.zeros(img_tidu.shape)
    for i in range(1, dx - 1):
        for j in range(1, dy - 1):
            flag = True  # marks whether the pixel survives within its 8-neighborhood
            temp = img_tidu[i - 1:i + 2, j - 1:j + 2]  # 8-neighborhood matrix of gradient magnitudes
            if angle[i, j] <= -1:  # use linear interpolation to decide suppression
                num_1 = (temp[0, 1] - temp[0, 0]) / angle[i, j] + temp[0, 1]
                num_2 = (temp[2, 1] - temp[2, 2]) / angle[i, j] + temp[2, 1]
                if not (img_tidu[i, j] > num_1 and img_tidu[i, j] > num_2):
                    flag = False
            elif angle[i, j] >= 1:
                num_1 = (temp[0, 2] - temp[0, 1]) / angle[i, j] + temp[0, 1]
                num_2 = (temp[2, 0] - temp[2, 1]) / angle[i, j] + temp[2, 1]
                if not (img_tidu[i, j] > num_1 and img_tidu[i, j] > num_2):
                    flag = False
            elif angle[i, j] > 0:
                num_1 = (temp[0, 2] - temp[1, 2]) * angle[i, j] + temp[1, 2]
                num_2 = (temp[2, 0] - temp[1, 0]) * angle[i, j] + temp[1, 0]
                if not (img_tidu[i, j] > num_1 and img_tidu[i, j] > num_2):
                    flag = False
            elif angle[i, j] < 0:
                num_1 = (temp[1, 0] - temp[0, 0]) * angle[i, j] + temp[1, 0]
                num_2 = (temp[1, 2] - temp[2, 2]) * angle[i, j] + temp[1, 2]
                if not (img_tidu[i, j] > num_1 and img_tidu[i, j] > num_2):
                    flag = False
            if flag:
                img_yizhi[i, j] = img_tidu[i, j]
    plt.figure(3)
    plt.imshow(img_yizhi.astype(np.uint8), cmap='gray')
    plt.axis('off')

    # 4. Double-threshold detection and edge connection. Traverse all definite edge points
    # and push any 8-neighbors that might be edges onto the stack
    lower_boundary = img_tidu.mean() * 0.5
    high_boundary = lower_boundary * 3  # here the high threshold is set to three times the low threshold
    zhan = []  # the stack
    for i in range(1, img_yizhi.shape[0] - 1):  # skip the outer rim
        for j in range(1, img_yizhi.shape[1] - 1):
            if img_yizhi[i, j] >= high_boundary:  # keep: certainly an edge point
                img_yizhi[i, j] = 255
                zhan.append([i, j])
            elif img_yizhi[i, j] <= lower_boundary:  # discard
                img_yizhi[i, j] = 0

    while not len(zhan) == 0:
        temp_1, temp_2 = zhan.pop()  # pop off the stack
        a = img_yizhi[temp_1 - 1:temp_1 + 2, temp_2 - 1:temp_2 + 2]
        if (a[0, 0] < high_boundary) and (a[0, 0] > lower_boundary):
            img_yizhi[temp_1 - 1, temp_2 - 1] = 255  # mark this pixel as an edge
            zhan.append([temp_1 - 1, temp_2 - 1])  # push onto the stack
        if (a[0, 1] < high_boundary) and (a[0, 1] > lower_boundary):
            img_yizhi[temp_1 - 1, temp_2] = 255
            zhan.append([temp_1 - 1, temp_2])
        if (a[0, 2] < high_boundary) and (a[0, 2] > lower_boundary):
            img_yizhi[temp_1 - 1, temp_2 + 1] = 255
            zhan.append([temp_1 - 1, temp_2 + 1])
        if (a[1, 0] < high_boundary) and (a[1, 0] > lower_boundary):
            img_yizhi[temp_1, temp_2 - 1] = 255
            zhan.append([temp_1, temp_2 - 1])
        if (a[1, 2] < high_boundary) and (a[1, 2] > lower_boundary):
            img_yizhi[temp_1, temp_2 + 1] = 255
            zhan.append([temp_1, temp_2 + 1])
        if (a[2, 0] < high_boundary) and (a[2, 0] > lower_boundary):
            img_yizhi[temp_1 + 1, temp_2 - 1] = 255
            zhan.append([temp_1 + 1, temp_2 - 1])
        if (a[2, 1] < high_boundary) and (a[2, 1] > lower_boundary):
            img_yizhi[temp_1 + 1, temp_2] = 255
            zhan.append([temp_1 + 1, temp_2])
        if (a[2, 2] < high_boundary) and (a[2, 2] > lower_boundary):
            img_yizhi[temp_1 + 1, temp_2 + 1] = 255
            zhan.append([temp_1 + 1, temp_2 + 1])

    for i in range(img_yizhi.shape[0]):
        for j in range(img_yizhi.shape[1]):
            if img_yizhi[i, j] != 0 and img_yizhi[i, j] != 255:
                img_yizhi[i, j] = 0

    # Plot
    plt.figure(4)
    plt.imshow(img_yizhi.astype(np.uint8), cmap='gray')
    plt.axis('off')  # hide the axis ticks
    plt.show()
2、Implemented with the OpenCV library:
#!/usr/bin/env python
# encoding=gbk

import cv2

'''
cv2.Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]])
Required parameters:
the first parameter is the original image to be processed, which must be a single-channel grayscale image;
the second parameter is hysteresis threshold 1;
the third parameter is hysteresis threshold 2.
'''

img = cv2.imread("", 1)  # filename left blank in the original
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imshow("canny", cv2.Canny(gray, 200, 300))
cv2.waitKey()
cv2.destroyAllWindows()
3、Optimized program: canny_track.py
#!/usr/bin/env python
# encoding=gbk

'''
Canny edge detection: an optimized program
'''

import cv2

def CannyThreshold(lowThreshold):
    detected_edges = cv2.GaussianBlur(gray, (3, 3), 0)  # Gaussian filtering
    detected_edges = cv2.Canny(detected_edges,
                               lowThreshold,
                               lowThreshold * ratio,
                               apertureSize=kernel_size)  # edge detection
    # just add some colours to edges from original image.
    dst = cv2.bitwise_and(img, img, mask=detected_edges)  # mask the original color image with the detected edges
    cv2.imshow('canny demo', dst)

lowThreshold = 0
max_lowThreshold = 100
ratio = 3
kernel_size = 3

img = cv2.imread('')  # filename left blank in the original
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert the color image to grayscale

cv2.namedWindow('canny demo')  # set up the window that will hold the adjustment bar

'''
cv2.createTrackbar() takes five parameters, and the variable names make their meaning fairly clear:
the first parameter is the name of the trackbar object;
the second parameter is the name of the window the trackbar lives in;
the third parameter is the default value of the trackbar, which is also the value being adjusted;
the fourth parameter is the upper end of the trackbar's range (0 ~ count);
the fifth parameter is the callback invoked whenever the trackbar moves.
'''
cv2.createTrackbar('Min threshold', 'canny demo', lowThreshold, max_lowThreshold, CannyThreshold)

CannyThreshold(0)  # initialization
if cv2.waitKey(0) == 27:  # wait for ESC key to exit
    cv2.destroyAllWindows()
This concludes this article's sample code for implementing edge extraction in Python. For more on edge extraction in Python, please search my earlier articles or continue browsing the related articles below, and I hope you will keep supporting me!