This article gives an example of data normalization as a data preprocessing step in Python. It is shared here for your reference.
Data normalization
To eliminate the effect of differences in scale and value range between indicators, the data must be standardized (normalized): the values are scaled so that they fall within a specific range, which makes comprehensive analysis easier.
The main methods of data normalization are:
- Minimum-maximum normalization
- Zero-mean normalization
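As a minimal sketch of the two methods listed above: min-max normalization maps each column linearly onto [0, 1] via x* = (x - min) / (max - min), and zero-mean normalization rescales each column to mean 0 and standard deviation 1 via x* = (x - mean) / std. The small DataFrame and its column names below are made up for illustration and are not the article's data.

```python
import pandas as pd

# Hypothetical toy data (not the article's normalization_data.xls)
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 40.0, 50.0]})

# Min-max normalization: each column is mapped linearly onto [0, 1]
min_max = (df - df.min()) / (df.max() - df.min())

# Zero-mean normalization: each column gets mean 0 and standard deviation 1
# (pandas' std() is the sample standard deviation, ddof=1)
z_score = (df - df.mean()) / df.std()

print(min_max)
print(z_score)
```

Both expressions work column by column because pandas aligns the per-column statistics with the DataFrame's columns.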
Example of data
Code implementation
# -*- coding: utf-8 -*-
# Data normalization
import pandas as pd
import numpy as np

datafile = 'normalization_data.xls'  # Parameter initialization
data = pd.read_excel(datafile, header=None)  # Read the data

(data - data.min()) / (data.max() - data.min())  # Min-max normalization
(data - data.mean()) / data.std()  # Zero-mean normalization
Entering the two expressions in the interactive interpreter gives the following output:
>>> (data - data.min())/(data.max() - data.min())
0 1 2 3
0 0.074380 0.937291 0.923520 1.000000
1 0.619835 0.000000 0.000000 0.850941
2 0.214876 0.119565 0.813322 0.000000
3 0.000000 1.000000 1.000000 0.563676
4 1.000000 0.942308 0.996711 0.804149
5 0.264463 0.838629 0.814967 0.909310
6 0.636364 0.846990 0.786184 0.929571
>>> (data - data.mean())/data.std()
0 1 2 3
0 -0.905383 0.635863 0.464531 0.798149
1 0.604678 -1.587675 -2.193167 0.369390
2 -0.516428 -1.304030 0.147406 -2.078279
3 -1.111301 0.784628 0.684625 -0.456906
4 1.657146 0.647765 0.675159 0.234796
5 -0.379150 0.401807 0.152139 0.537286
6 0.650438 0.421642 0.069308 0.595564
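One detail worth noting about the zero-mean result: pandas' std() defaults to the sample standard deviation (ddof=1), whereas NumPy's np.std() defaults to the population standard deviation (ddof=0), so the two give slightly different scores on the same data. The following sketch illustrates the difference on a made-up column; the values are hypothetical and not taken from normalization_data.xls.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column (not the article's data)
s = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sample_std = s.std()                  # pandas default: ddof=1
population_std = np.std(s.to_numpy()) # NumPy default: ddof=0

# The same centering with two different divisors
z_sample = (s - s.mean()) / sample_std
z_population = (s - s.mean()) / population_std

print(sample_std, population_std)
```

If the population form is wanted in pandas, passing ddof=0 to std() selects it.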
Changing the original script to use print statements gives the following:
# -*- coding: utf-8 -*-
# Data normalization
import pandas as pd
import numpy as np

datafile = 'normalization_data.xls'  # Parameter initialization
data = pd.read_excel(datafile, header=None)  # Read the data

print((data - data.min()) / (data.max() - data.min()))  # Min-max normalization
print((data - data.mean()) / data.std())  # Zero-mean normalization
Running it prints the following:
0 1 2 3
0 0.074380 0.937291 0.923520 1.000000
1 0.619835 0.000000 0.000000 0.850941
2 0.214876 0.119565 0.813322 0.000000
3 0.000000 1.000000 1.000000 0.563676
4 1.000000 0.942308 0.996711 0.804149
5 0.264463 0.838629 0.814967 0.909310
6 0.636364 0.846990 0.786184 0.929571
0 1 2 3
0 -0.905383 0.635863 0.464531 0.798149
1 0.604678 -1.587675 -2.193167 0.369390
2 -0.516428 -1.304030 0.147406 -2.078279
3 -1.111301 0.784628 0.684625 -0.456906
4 1.657146 0.647765 0.675159 0.234796
5 -0.379150 0.401807 0.152139 0.537286
6 0.650438 0.421642 0.069308 0.595564
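The printed results can be sanity-checked against the definitions of the two methods. The sketch below re-enters the values printed above by hand (so they carry six-decimal rounding) and checks that every min-max column spans [0, 1] and that every zero-mean column has mean close to 0 and sample standard deviation close to 1. The variable names and the 1e-4 tolerance are choices made for this illustration.

```python
import numpy as np
import pandas as pd

# Values copied from the min-max output above
min_max = pd.DataFrame([
    [0.074380, 0.937291, 0.923520, 1.000000],
    [0.619835, 0.000000, 0.000000, 0.850941],
    [0.214876, 0.119565, 0.813322, 0.000000],
    [0.000000, 1.000000, 1.000000, 0.563676],
    [1.000000, 0.942308, 0.996711, 0.804149],
    [0.264463, 0.838629, 0.814967, 0.909310],
    [0.636364, 0.846990, 0.786184, 0.929571],
])

# Values copied from the zero-mean output above
z_score = pd.DataFrame([
    [-0.905383, 0.635863, 0.464531, 0.798149],
    [0.604678, -1.587675, -2.193167, 0.369390],
    [-0.516428, -1.304030, 0.147406, -2.078279],
    [-1.111301, 0.784628, 0.684625, -0.456906],
    [1.657146, 0.647765, 0.675159, 0.234796],
    [-0.379150, 0.401807, 0.152139, 0.537286],
    [0.650438, 0.421642, 0.069308, 0.595564],
])

# Min-max: each column should have minimum 0 and maximum 1
spans_unit_interval = bool((min_max.min() == 0).all() and (min_max.max() == 1).all())

# Zero-mean: column means near 0, sample std (ddof=1) near 1, up to rounding
mean_is_zero = bool(np.allclose(z_score.mean(), 0, atol=1e-4))
std_is_one = bool(np.allclose(z_score.std(), 1, atol=1e-4))

print(spans_unit_interval, mean_is_zero, std_is_one)
```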
Attachment: the code reads its input from the file normalization_data.xls.
I hope that what I have said in this article will help you in Python programming.