SoFunction
Updated on 2024-11-19

Python data preprocessing: a data normalization example

This article presents an example of data normalization as a data preprocessing step in Python. It is shared for your reference, as follows:

Data normalization

To eliminate the effect of differences in scale and value range between indicators, the data need to be standardized (normalized): the values are scaled proportionally so that they fall into a specific range, which makes comprehensive analysis easier.

The main methods of data normalization are:

- Minimum-maximum normalization
- Zero-mean normalization
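
As a minimal sketch of the two methods listed above: min-max normalization maps each column onto [0, 1] via (x - min)/(max - min), and zero-mean normalization rescales each column to mean 0 and standard deviation 1 via (x - mean)/std. The table `toy` and its column names are hypothetical illustration values, not the article's data set.

```python
import pandas as pd

# Hypothetical toy data; this is NOT the article's normalization_data.xls
toy = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})

# Min-max normalization: each column is mapped onto [0, 1]
min_max = (toy - toy.min()) / (toy.max() - toy.min())

# Zero-mean normalization: each column gets mean 0 and sample std 1
z_score = (toy - toy.mean()) / toy.std()

print(min_max)
print(z_score)
```

Both expressions work column by column, because `min()`, `max()`, `mean()` and `std()` on a DataFrame return one value per column and pandas aligns them with the columns during the arithmetic. Note that a constant column would make the denominator zero in either formula, so such columns need separate handling.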

Example of data

Code implementation

#-*- coding: utf-8 -*-
# Data normalization
import pandas as pd
import numpy as np
datafile = 'normalization_data.xls' # Parameter initialization
data = pd.read_excel(datafile, header = None) #Read the data
(data - data.min())/(data.max() - data.min()) # Min-max normalization
(data - data.mean())/data.std() # Zero-mean normalization

The following output can be seen from the command line:

>>> (data - data.min())/(data.max() - data.min())
          0         1         2         3
0  0.074380  0.937291  0.923520  1.000000
1  0.619835  0.000000  0.000000  0.850941
2  0.214876  0.119565  0.813322  0.000000
3  0.000000  1.000000  1.000000  0.563676
4  1.000000  0.942308  0.996711  0.804149
5  0.264463  0.838629  0.814967  0.909310
6  0.636364  0.846990  0.786184  0.929571

>>> (data - data.mean())/data.std()
          0         1         2         3
0 -0.905383  0.635863  0.464531  0.798149
1  0.604678 -1.587675 -2.193167  0.369390
2 -0.516428 -1.304030  0.147406 -2.078279
3 -1.111301  0.784628  0.684625 -0.456906
4  1.657146  0.647765  0.675159  0.234796
5 -0.379150  0.401807  0.152139  0.537286
6  0.650438  0.421642  0.069308  0.595564
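
A quick way to sanity-check results like these: every column of the min-max table should contain exactly one 0 and one 1 as its extremes (as each column of the first table above does), and every column of the zero-mean table should have mean 0 and standard deviation 1. The sketch below checks both properties on randomly generated stand-in data, since the article's spreadsheet is not reproduced here; the seed, shape and distribution are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's table: 7 rows, 4 numeric columns
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(loc=100.0, scale=50.0, size=(7, 4)))

min_max = (data - data.min()) / (data.max() - data.min())
z_score = (data - data.mean()) / data.std()

# Each min-max column should have minimum 0 and maximum 1
print(min_max.min().tolist(), min_max.max().tolist())

# Each zero-mean column should have mean ~0 and sample std ~1
print(np.allclose(z_score.mean(), 0.0), np.allclose(z_score.std(), 1.0))
```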

When run as a script, the code above can be changed to use print statements, as follows:

#-*- coding: utf-8 -*-
# Data normalization
import pandas as pd
import numpy as np
datafile = 'normalization_data.xls' # Parameter initialization
data = pd.read_excel(datafile, header = None) #Read the data
print((data - data.min())/(data.max() - data.min())) # Min-max normalization
print((data - data.mean())/data.std()) # Zero-mean normalization

This prints the following output:

          0         1         2         3
0  0.074380  0.937291  0.923520  1.000000
1  0.619835  0.000000  0.000000  0.850941
2  0.214876  0.119565  0.813322  0.000000
3  0.000000  1.000000  1.000000  0.563676
4  1.000000  0.942308  0.996711  0.804149
5  0.264463  0.838629  0.814967  0.909310
6  0.636364  0.846990  0.786184  0.929571
          0         1         2         3
0 -0.905383  0.635863  0.464531  0.798149
1  0.604678 -1.587675 -2.193167  0.369390
2 -0.516428 -1.304030  0.147406 -2.078279
3 -1.111301  0.784628  0.684625 -0.456906
4  1.657146  0.647765  0.675159  0.234796
5 -0.379150  0.401807  0.152139  0.537286
6  0.650438  0.421642  0.069308  0.595564
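
One detail worth knowing when comparing zero-mean results across tools: the pandas `std()` method computes the sample standard deviation (dividing by n - 1, i.e. `ddof=1`) by default, while NumPy's `np.std` defaults to the population standard deviation (`ddof=0`). The sketch below illustrates the difference on a hypothetical four-value column; the values are made up for illustration.

```python
import numpy as np
import pandas as pd

col = pd.Series([1.0, 2.0, 3.0, 4.0])  # hypothetical values

sample_std = col.std()              # pandas default: ddof=1 (divide by n - 1)
population_std = col.std(ddof=0)    # matches np.std(col.to_numpy())

# The same column standardized with each convention
z_sample = (col - col.mean()) / sample_std
z_population = (col - col.mean()) / population_std

print(sample_std, population_std)
print(z_sample.tolist())
print(z_population.tolist())
```

With only a handful of rows the two conventions give visibly different z-scores, so it helps to state which one was used when reporting normalized values.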

Attachment: the code above uses the file normalization_data.xls.


I hope that what I have said in this article will help you in Python programming.