SoFunction
Updated on 2025-05-21

Essential techniques for Python image processing

Here are 15 basic techniques to master when processing images in Python:

1. Image reading and saving

The OpenCV, Pillow (PIL), and Matplotlib libraries can read and save image files in a variety of formats.

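import cv2
from PIL import Image
import matplotlib.pyplot as plt

# The file names are empty in the source; substitute your own paths.

# OpenCV: read and save
img_cv = cv2.imread('')  # pixels are stored in BGR order
cv2.imwrite('', img_cv)

# Pillow: read and save
img_pil = Image.open('')
img_pil.save('')

# Matplotlib: read and display
img_plt = plt.imread('')
plt.imshow(img_plt)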

2. Image color space conversion

Images can be converted between color spaces such as RGB, BGR, HSV, and grayscale.

import numpy as np

# BGR to RGB
img_rgb = cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB)

# BGR to grayscale
img_gray = cv2.cvtColor(img_cv, cv2.COLOR_BGR2GRAY)

# BGR to HSV (img_cv is in BGR order, so the BGR2HSV code is used)
hsv_img = cv2.cvtColor(img_cv, cv2.COLOR_BGR2HSV)
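
To make the channel order concrete, here is a minimal plain-Python sketch of the grayscale conversion. It assumes the standard luma weights (0.299, 0.587, 0.114 for red, green, blue) that OpenCV documents for its gray conversion; the helper names `bgr_to_gray` and `image_to_gray` are made up for this illustration, and OpenCV's own fixed-point rounding may differ by one gray level.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a gray level using the luma weights."""
    b, g, r = pixel  # OpenCV channel order is blue, green, red
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def image_to_gray(image):
    """Apply bgr_to_gray to every pixel of a nested-list image."""
    return [[bgr_to_gray(px) for px in row] for row in image]


# A 1x3 image: pure blue, pure green, pure red (written in BGR order)
tiny = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
gray = image_to_gray(tiny)
```

Green contributes the most to the gray level and blue the least, which is why passing an RGB image to a BGR conversion code (or the reverse) gives noticeably wrong brightness.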

3. Image cropping and resizing

You can crop, resize, scale, and rotate an image.

# Crop
cropped = img_cv[100:300, 200:400]  # slice as [y1:y2, x1:x2]

# Resize
resized = cv2.resize(img_cv, (500, 300))  # target size as (width, height)
resized = cv2.resize(img_cv, None, fx=0.5, fy=0.5)  # scale by a factor instead

# Rotate 90 degrees about the image center
rows, cols = img_cv.shape[:2]
M = cv2.getRotationMatrix2D((cols/2, rows/2), 90, 1)
rotated = cv2.warpAffine(img_cv, M, (cols, rows))
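
The 2x3 matrix used for rotation can be written out by hand. The sketch below follows the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D`; the function names `rotation_matrix_2d` and `apply_affine` are hypothetical helpers for illustration only.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 matrix documented for cv2.getRotationMatrix2D."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


M = rotation_matrix_2d((100, 50), 90, 1)
center_after = apply_affine(M, (100, 50))  # the rotation center stays fixed
right_after = apply_affine(M, (110, 50))   # a point right of center moves up
```

Because image coordinates have y pointing down, a positive angle moves the point to a smaller y value, that is, a counter-clockwise rotation on screen.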

4. Image filtering and smoothing

Various filters can be applied to reduce noise or smooth the image.

# Gaussian blur
blur = cv2.GaussianBlur(img_cv, (5, 5), 0)

# Median filtering (suitable for salt-and-pepper noise)
median = cv2.medianBlur(img_cv, 5)

# Bilateral filtering (preserves edges)
bilateral = cv2.bilateralFilter(img_cv, 9, 75, 75)
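
Why a median filter suits salt-and-pepper noise is easiest to see on a tiny grid. This is a plain-Python sketch of a 3x3 median filter, not OpenCV's implementation: the name `median_filter_3x3` is made up, and border pixels are simply copied, whereas `cv2.medianBlur` handles borders itself.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighborhood.

    Border pixels are copied unchanged to keep the sketch short.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median(window)
    return result


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # one "salt" pixel
    [10, 10, 0, 10],    # one "pepper" pixel
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

The outliers never reach the middle of the sorted window, so they vanish entirely, whereas an averaging filter would smear them into their neighbors.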

5. Edge detection

Edges in an image can be detected with common methods such as Canny edge detection and the Sobel operator.

# Canny edge detection
edges = cv2.Canny(img_gray, 100, 200)

# Sobel edge detection
sobelx = cv2.Sobel(img_gray, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(img_gray, cv2.CV_64F, 0, 1, ksize=3)
edges = np.sqrt(sobelx**2 + sobely**2)  # gradient magnitude
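
The Sobel step can be reproduced on a small nested list to show where the gradient magnitude comes from. This is an illustrative sketch using the standard 3x3 Sobel kernels; `correlate_at` and `sobel_magnitude` are hypothetical helper names, and borders are left at zero rather than padded.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate_at(image, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered on (y, x)."""
    return sum(
        kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; borders are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = correlate_at(image, SOBEL_X, y, x)
            gy = correlate_at(image, SOBEL_Y, y, x)
            out[y][x] = math.sqrt(gx ** 2 + gy ** 2)
    return out


# Dark left half, bright right half: a vertical edge between columns 1 and 2
step = [[0, 0, 100, 100] for _ in range(4)]
magnitude = sobel_magnitude(step)
```

Only the horizontal kernel responds to this vertical edge; the vertical kernel sums to zero because every row is identical.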

6. Threshold processing

Thresholding converts a grayscale image into a binary image by comparing each pixel against a threshold value.

# Simple threshold
ret, thresh = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)

# Adaptive threshold
thresh = cv2.adaptiveThreshold(img_gray, 255,
                               cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 11, 2)

# Otsu threshold (the threshold value is chosen automatically and returned in ret)
ret, thresh = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
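
Otsu's method picks the threshold that best separates the histogram into two classes. The following plain-Python sketch shows the idea by maximizing the between-class variance; `otsu_threshold` is a hypothetical name, and the value OpenCV returns for the same data may differ slightly when several thresholds tie.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += hist[level]       # pixels with value <= level
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg  # pixels with value > level
        if weight_fg == 0:
            break
        sum_bg += level * hist[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


# Two clear groups of gray levels: dark pixels at 10, bright pixels at 200
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
```

Any threshold between the two groups separates them equally well here, so the sketch returns the first such level it encounters.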

7. Morphological operation

Morphological operations include dilation, erosion, opening, and closing.

# Define the structuring element
kernel = np.ones((5, 5), np.uint8)

# Erosion
erosion = cv2.erode(img_gray, kernel, iterations=1)

# Dilation
dilation = cv2.dilate(img_gray, kernel, iterations=1)

# Opening (erosion followed by dilation)
opening = cv2.morphologyEx(img_gray, cv2.MORPH_OPEN, kernel)

# Closing (dilation followed by erosion)
closing = cv2.morphologyEx(img_gray, cv2.MORPH_CLOSE, kernel)
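
What opening actually does to a binary image can be sketched without any library. The helpers below (`erode`, `dilate`, `opening`, all hypothetical names) use a fixed 3x3 square neighborhood and skip neighbors that fall outside the image, which is an assumption of this sketch.

```python
def _morph(image, reducer):
    """Apply min (erosion) or max (dilation) over each 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbors outside the image are simply skipped
            window = [
                image[ny][nx]
                for ny in range(max(y - 1, 0), min(y + 2, height))
                for nx in range(max(x - 1, 0), min(x + 2, width))
            ]
            out[y][x] = reducer(window)
    return out


def erode(image):
    """A pixel stays 1 only if its whole neighborhood is 1."""
    return _morph(image, min)


def dilate(image):
    """A pixel becomes 1 if any pixel in its neighborhood is 1."""
    return _morph(image, max)


def opening(image):
    """Erosion followed by dilation removes specks smaller than the kernel."""
    return dilate(erode(image))


speckled = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],  # isolated noise pixel
]
opened = opening(speckled)
```

The 3x3 block survives because erosion leaves its center pixel, from which dilation restores the block; the isolated pixel is erased by the erosion and never comes back.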

8. Histogram processing

The histogram of the image can be calculated and displayed, and histogram equalization can be performed to enhance contrast.

# Calculate the histogram
hist = cv2.calcHist([img_gray], [0], None, [256], [0, 256])

# Histogram equalization
equ = cv2.equalizeHist(img_gray)

# Adaptive histogram equalization (CLAHE)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
cl1 = clahe.apply(img_gray)
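
Histogram equalization remaps gray levels through the cumulative distribution of the histogram. Here is a minimal sketch of the textbook formula on a flat list of pixels; `equalize` is a hypothetical name, and OpenCV's result for the same input may differ in rounding.

```python
def equalize(pixels):
    """Histogram-equalize a flat list of 8-bit gray levels."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution: how many pixels are <= each level
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return list(pixels)

    lookup = [
        round(max(c - cdf_min, 0) / (total - cdf_min) * 255) for c in cdf
    ]
    return [lookup[v] for v in pixels]


low_contrast = [100, 100, 110, 110, 120, 120, 130, 130]
stretched = equalize(low_contrast)
```

The input only spans levels 100 to 130, while the output uses the full 0 to 255 range, which is the contrast enhancement described above.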

9. Feature detection and description

Keypoints can be detected in an image and described with feature descriptors such as SIFT, SURF, and ORB.

# ORB feature detection
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(img_gray, None)

# Draw the keypoints
img_kp = cv2.drawKeypoints(img_gray, keypoints, None, color=(0, 255, 0), flags=0)

# SIFT feature detection (in the main opencv-python package since OpenCV 4.4.0;
# older versions need opencv-contrib-python)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img_gray, None)

10. Image registration and feature matching

Feature points can be matched between different images in order to align them.

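# Feature matching (kp1/des1 and kp2/des2 are the ORB keypoints and
# descriptors of the two images img1 and img2)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)

# Homography estimation and image registration
src_pts = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
# Output size is (width, height); img2 is assumed to be the reference image
aligned = cv2.warpPerspective(img1, H, (img2.shape[1], img2.shape[0]))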
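
Brute-force matching with Hamming distance and cross-checking is simple enough to sketch directly. The code below is an illustration rather than OpenCV's matcher: `hamming` and `cross_check_match` are hypothetical names, and the 2-byte descriptors are invented for the example (real ORB descriptors are 32 bytes long).

```python
def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def cross_check_match(des1, des2):
    """Brute-force matching that keeps only mutual nearest neighbors.

    Returns (query_index, train_index, distance) tuples sorted by distance.
    """
    def nearest(desc, candidates):
        return min(range(len(candidates)), key=lambda i: hamming(desc, candidates[i]))

    matches = []
    for qi, d1 in enumerate(des1):
        ti = nearest(d1, des2)
        if nearest(des2[ti], des1) == qi:  # the cross-check step
            matches.append((qi, ti, hamming(d1, des2[ti])))
    return sorted(matches, key=lambda m: m[2])


# Invented 2-byte descriptors for two images
des1 = [bytes([0b11110000, 0b00001111]), bytes([0b10101010, 0b10101010])]
des2 = [bytes([0b10101010, 0b10101011]), bytes([0b11110000, 0b00001111])]
matches = cross_check_match(des1, des2)
```

A match is kept only when each descriptor is the other's nearest neighbor, which discards many one-sided false matches before the homography is estimated.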

11. Contour detection and analysis

Contours can be detected in an image, and properties such as their area and perimeter can be calculated.

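# Contour detection (thresh is a binary image; OpenCV 4.x returns two values)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours
img_contours = img_cv.copy()
cv2.drawContours(img_contours, contours, -1, (0, 255, 0), 3)

# Contour analysis
cnt = contours[0]
area = cv2.contourArea(cnt)
perimeter = cv2.arcLength(cnt, True)  # True = treat the contour as closed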
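
Area and perimeter of a contour are polygon measurements over its points. A minimal sketch, assuming the contour is given as a list of (x, y) vertices in order; `polygon_area` and `polygon_perimeter` are hypothetical names, and the area uses the shoelace formula.

```python
import math


def polygon_area(points):
    """Area of a closed polygon via the shoelace formula."""
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def polygon_perimeter(points):
    """Length of the closed outline through the points, in order."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
area = polygon_area(rectangle)
perimeter = polygon_perimeter(rectangle)
```

Because the measurement is taken over the polygon through the contour points, the reported area can differ from the number of non-zero pixels inside the drawn contour.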

12. Image segmentation

The image can be segmented into different regions, such as using GrabCut or watershed algorithms.

# GrabCut segmentation
mask = np.zeros(img_cv.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (50, 50, 450, 290)  # ROI as (x, y, width, height)
cv2.grabCut(img_cv, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
img_seg = img_cv * mask2[:, :, np.newaxis]

# Watershed segmentation (preparing the regions used to build the markers)
ret, thresh = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
sure_bg = cv2.dilate(opening, kernel, iterations=3)
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, 0)
sure_fg = np.uint8(sure_fg)  # the distance transform is float32; match sure_bg's type
unknown = cv2.subtract(sure_bg, sure_fg)

13. Template matching

A template image can be located within a larger image.

template = cv2.imread('', 0)  # 0 = load as grayscale; file name is empty in the source
h, w = template.shape[:2]

# Template matching
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

# Take the best match position and draw a rectangle
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(img_cv, top_left, bottom_right, 255, 2)
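
The sliding-window idea behind template matching can be sketched in plain Python. Note that this sketch scores each position with the sum of squared differences (the idea behind `cv2.TM_SQDIFF`), not the normalized correlation coefficient used above, so the best match is the minimum score rather than the maximum. `match_template_sqdiff` is a hypothetical name.

```python
def match_template_sqdiff(image, template):
    """Slide the template over the image; return the best (x, y) and its score.

    The score is the sum of squared differences, so lower is better.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_loc, best_score = None, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_loc, best_score = (x, y), score
    return best_loc, best_score


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
location, score = match_template_sqdiff(image, template)
```

The returned location is the top-left corner of the match in (x, y) order, the same convention as the location returned by `cv2.minMaxLoc`.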

14. Perspective transformation and affine transformation

Perspective correction and affine transformations can be applied to an image.

# Perspective transformation (four point pairs)
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
pts2 = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])
M = cv2.getPerspectiveTransform(pts1, pts2)
dst = cv2.warpPerspective(img_cv, M, (300, 300))

# Affine transformation (three point pairs)
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])
M = cv2.getAffineTransform(pts1, pts2)
dst = cv2.warpAffine(img_cv, M, (cols, rows))
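
The difference between the two transforms is the division by a third coordinate. The sketch below maps a single point through a 3x3 matrix; `apply_homography` is a hypothetical helper, and both matrices are invented for illustration rather than computed from the points above.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 matrix with perspective division."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Affine case: the last row is (0, 0, 1), so w is always 1
affine = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]
p_affine = apply_homography(affine, (1, 1))

# Perspective case: the last row makes w grow with x, so points
# further to the right are pulled in more strongly
perspective = [[1, 0, 0], [0, 1, 0], [0.01, 0, 1]]
p_persp = apply_homography(perspective, (100, 50))
```

An affine matrix keeps parallel lines parallel, which is why three point pairs are enough to define it, while a perspective matrix needs four.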

15. Fourier transform

The Fourier transform can be used for frequency-domain analysis and filtering.

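# Fourier transform
f = np.fft.fft2(img_gray)
fshift = np.fft.fftshift(f)  # move the zero-frequency term to the center
magnitude_spectrum = 20 * np.log(np.abs(fshift))

# Inverse Fourier transform
rows, cols = img_gray.shape
crow, ccol = rows // 2, cols // 2
# Zeroing the center removes the low frequencies, so this is a high-pass filter
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
f_ishift = np.fft.ifftshift(fshift)
img_back = np.fft.ifft2(f_ishift)
img_back = np.abs(img_back)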
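
Frequency-domain filtering is easier to follow in one dimension. This sketch uses a naive O(n^2) discrete Fourier transform written with `cmath`; the `dft` and `idft` functions are hypothetical helpers for illustration, not a substitute for NumPy's FFT.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft: rebuild the sequence from its spectrum."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


signal = [5, 7, 5, 7, 5, 7, 5, 7]  # constant level 6 plus a fast alternation
spectrum = dft(signal)
dc_term = spectrum[0]              # zero-frequency term: the sum of all samples
spectrum[0] = 0                    # remove it: a crude high-pass filter
filtered = [round(v.real) for v in idft(spectrum)]
```

Removing the zero-frequency term strips the constant brightness and leaves only the fast variation, which is the 1-D counterpart of zeroing the center of a shifted 2-D spectrum.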

These techniques form the foundation of Python image processing, and you can extend and combine them to suit your specific needs.
