As mentioned in a previous post, the images crawled with scrapy were collected so that they could later be categorized.
This post uses those previously crawled images to do a simple classification based on their color characteristics.
The implementation steps are as follows:
1: Collecting the image paths
2: Contrast processing
3: Gaussian filtering
4: Data extraction and feature vectorization
5: Clustering the pictures
6: Saving each picture into a folder according to its cluster (a minimal single-image sketch of steps 2 to 4 follows this list)
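Before the full class-based version below, here is a minimal sketch (not from the original script) of what steps 2 to 4 do to a single image. The file name sample.jpg is an assumption for illustration; the rescale range (20, 220), sigma=3 and the 10-bin histogram follow the full code further down.

import numpy as np
from skimage import io, exposure, img_as_float
from skimage.filters import gaussian

# Step 2: stretch the contrast, clipping pixel values to the 20-220 range
img = io.imread('sample.jpg')                       # hypothetical input image
high_contrast = exposure.rescale_intensity(img, in_range=(20, 220))

# Step 3: smooth the image with a Gaussian filter (sigma=3, as in the full code)
smoothed = gaussian(high_contrast, sigma=3)

# Step 4: build a 10-bin intensity histogram and turn it into a feature vector
img_float = img_as_float(smoothed)
hist, bin_centers = exposure.histogram(img_float, nbins=10)
feature = hist * bin_centers                        # weight each bin count by its centre value
print(feature.shape)                                # one 10-dimensional vector per image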
The amount of code is medium; it could be shorter, but I wrapped each step in its own class in order to practice using classes. There are some class-inheritance issues in it, and the problems I ran into are explained in a previous article. The content may be a bit cumbersome, especially the handling of files and paths (adjust them to your own setup); I have tried to optimize the code.
The raw data crawled is below:
Straight to the code:
import os
import numpy as np
from skimage import io                              # read and save images
from skimage import exposure                        # rescale_intensity / equalize_hist for contrast
from skimage.filters import gaussian                # Gaussian filter
from skimage import img_as_float, img_as_ubyte      # convert images between uint8 and float
from scipy.cluster.vq import kmeans, vq, whiten     # clustering algorithm
import shutil                                       # delete folder contents


class Path(object):
    def __init__(self):
        self.path = r"D:\PYscrapy\get_lixiaoran\picture"
        self.pathlist = []                  # list of original image paths
        self.page = 0

    def append(self):                       # load the path of each image into the list
        much = os.listdir(self.path)
        for i in range(len(much)):
            repath = os.path.join(self.path, str(self.page) + '.jpg')
            self.page += 1
            self.pathlist.append(repath)
        return self.pathlist


class Contrast(object):
    def __init__(self, pathlist):
        self.pathlist = pathlist
        self.imglist = []                   # list of images after changing the contrast
        self.path2 = r"D:\PYscrapy\get_lixiaoran\picture2"
        self.page2 = 0

    def balance(self):
        # two ways to handle the contrast of each image:
        # 1: histogram equalization  2: clip the intensities at chosen extremes
        if os.path.exists(self.path2) == False:
            os.makedirs(self.path2)
        # for lis in self.pathlist:
        #     data = io.imread(lis)
        #     equalized = exposure.equalize_hist(data)   # method 1: equalization
        #     self.imglist.append(equalized)
        for lis in self.pathlist:
            data = io.imread(lis)
            high_contrast = exposure.rescale_intensity(data, in_range=(20, 220))  # method 2: clip at 20 and 220
            self.imglist.append(high_contrast)
        for img in self.imglist:            # save the modified images
            repath = os.path.join(self.path2, str(self.page2) + '.jpg')
            io.imsave(repath, img)
            self.page2 += 1


class Filter(Contrast):
    def __init__(self, pathlist):
        super().__init__(pathlist)
        self.path31 = self.path2
        self.path32 = r"D:\PYscrapy\get_lixiaoran\picture3"
        self.page3 = 0
        self.gaslist = []

    def filte_r(self):
        files = os.listdir(self.path31)     # read the contents of the folder
        if os.path.exists(self.path32) == False:
            os.makedirs(self.path32)
        for lis in range(len(files)):       # Gaussian-filter each image
            path = os.path.join(self.path31, str(lis) + r'.jpg')
            img = io.imread(path)
            gas = gaussian(img, sigma=3)    # Gaussian blur with sigma=3 (multichannel handling left at the default)
            self.gaslist.append(gas)
            path_gas = os.path.join(self.path32, str(self.page3) + r'.jpg')
            io.imsave(path_gas, img_as_ubyte(gas))   # convert back to uint8 before saving
            self.page3 += 1
        return self.path32


class Vectoring(object):
    def __init__(self, filter_path):
        self.path41 = filter_path
        self.diff = []
        self.calculate = []

    def vector(self):
        numbers = os.listdir(self.path41)   # get the folder contents
        os.chdir(self.path41)               # switch the working directory
        for i in range(len(numbers)):
            self.diff.append([])
            for j in range(4):
                self.diff[i].append([])     # diff = [[number], [img_float], [bin_centers], [hist]]
        for cnt, number in enumerate(numbers):
            img_float = img_as_float(io.imread(number))                   # image ndarray uint8 -> float
            hist, bin_centers = exposure.histogram(img_float, nbins=10)   # pixel counts per intensity interval
            self.diff[cnt][0] = number
            self.diff[cnt][1] = img_float
            self.diff[cnt][2] = bin_centers                               # add the data to diff
            self.diff[cnt][3] = hist
        for i, j in enumerate(self.diff):
            # multiply hist by bin_centers for dimensionality reduction and vectorization;
            # this is the line that takes some reading, there are a few indices involved
            self.calculate.append([y * self.diff[i][3][x] for x, y in enumerate(self.diff[i][2])])
        for i in range(len(self.calculate)):
            self.diff[i].append(self.calculate[i])                        # add the feature vector to diff as well
        return self.diff                    # diff = [[number], [img_float], [bin_centers], [hist], [calculate]]


class Modeling(Vectoring):
    def __init__(self, filter_path, K):
        super().__init__(filter_path)
        self.K = K

    def model(self):
        diff = self.vector()
        calculate = []
        for i in range(len(diff)):
            calculate.append(diff[i][4])
        spot = whiten(calculate)
        # scipy's k-means is used here to cluster the images;
        # if you are not familiar with k-means in scipy, there is a dedicated section on it up front
        center, _ = kmeans(spot, self.K)
        cluster, _ = vq(spot, center)       # obtain the predicted labels
        return diff, cluster


class Predicting(object):
    def __init__(self, predicted_diff, predicted_cluster, K):
        self.diff = predicted_diff
        self.cluster = predicted_cluster
        self.path42 = r'D:\PYscrapy\get_lixiaoran\picture4'
        self.K = K

    def predicted(self):
        if os.path.exists(self.path42) == True:
            shutil.rmtree(self.path42)      # delete the old folder contents
            os.makedirs(self.path42)
        else:
            os.makedirs(self.path42)
        os.chdir(self.path42)
        for i in range(self.K):             # create K folders
            os.makedirs('classify{}'.format(i))
        for i, j in enumerate(self.cluster):
            # save each image into the folder of its cluster
            io.imsave('classify{}\\{}'.format(j, self.diff[i][0]), img_as_ubyte(self.diff[i][1]))


if __name__ == "__main__":
    # file path add
    start = Path()
    pathlist = start.append()
    # Contrast class
    second = Contrast(pathlist)
    second.balance()                        # produce the images with changed contrast
    # Gaussian filtering
    filte = Filter(pathlist)
    filter_path = filte.filte_r()
    # data extraction and vectorization
    vectoring = Vectoring(filter_path)
    # K value customization
    K = 3
    # modeling
    modeling = Modeling(filter_path, K)
    predicted_diff, predicted_cluster = modeling.model()
    # predictions
    predicted = Predicting(predicted_diff, predicted_cluster, K)
    predicted.predicted()
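The densest part of the code above is the feature step (hist multiplied by bin_centers) and the scipy clustering calls, so here is a small standalone sketch of just that part. The 12x10 random feature matrix is made up purely for the demo; it only shows what whiten, kmeans and vq return.

import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Pretend we already have one 10-dimensional feature vector per image
# (hist * bin_centers from the Vectoring class); 12 images here, made up for the demo.
features = np.random.rand(12, 10)

# whiten() rescales each column to unit variance so no single bin dominates the distance
spot = whiten(features)

# kmeans() returns the K cluster centres (and the final distortion, discarded here)
K = 3
centers, _ = kmeans(spot, K)

# vq() assigns every feature vector to its nearest centre; these labels are what the
# Predicting class uses to decide which classifyN folder an image goes into
labels, _ = vq(spot, centers)
print(labels)          # an array of 12 integers in {0, 1, 2}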
The saved files are shown below:
With K=3 the results are categorized as follows (picture4):
The white ones basically end up in one class,
and the black ones in another.
The sorted images look blurry because I sorted the processed images, not the originals.
Looking closely, the effect is actually there, it is just not very obvious, and the content of the images is fairly complex. The general framework is in place; it is mostly a matter of tuning and of improving the feature vectorization to get better results. You could also swap in better processing methods; I have only used a few common image-processing techniques here, so the results are average.
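As one concrete example of the "better processing" mentioned above (my own suggestion, not part of the original script): since the goal is to group by color, a per-channel histogram keeps color information that the single flattened intensity histogram throws away. The function name color_feature and the 8 bins per channel are arbitrary choices, and it assumes RGB input.

import numpy as np
from skimage import io, img_as_float

def color_feature(path, nbins=8):
    # A sketch of an alternative feature: one histogram per colour channel,
    # concatenated into a single vector. Assumes an RGB image; nbins=8 is arbitrary.
    img = img_as_float(io.imread(path))
    channels = []
    for c in range(img.shape[-1]):                       # R, G, B
        hist, _ = np.histogram(img[..., c], bins=nbins, range=(0.0, 1.0))
        channels.append(hist / hist.sum())               # normalise so image size does not matter
    return np.concatenate(channels)                      # 3 * nbins values per image

Feeding these vectors into the same whiten/kmeans/vq steps in the Modeling class would be a drop-in change.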
There are quite a few classes here, but they run from top to bottom in order, so it is not hard to follow step by step. If you have any good suggestions, feel free to share them.
That is all I have to share on classifying images by color with Python; I hope it serves as a useful reference, and I appreciate your support.