SoFunction
Updated on 2024-11-10

Python: determining whether a set of data follows a normal distribution

Normal distribution:

If the random variable x follows a normal distribution with mathematical expectation μ and variance σ², this is written x ~ N(μ, σ²).

The expected value μ determines the location of the density function and the standard deviation σ determines the spread of the distribution. The normal distribution with μ = 0 and σ = 1 is the standard normal distribution.
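To make the roles of the two parameters concrete, here is a minimal sketch that writes out the normal density from its formula and compares it with `scipy.stats.norm.pdf`. The helper name `normal_pdf` and the grid of x values are illustrative assumptions, not part of the original article.

```python
import numpy as np
from scipy import stats

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, written out from the formula (hypothetical helper)."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-4, 4, 9)  # the points -4, -3, ..., 4

standard = normal_pdf(x)            # standard normal: mu = 0, sigma = 1
shifted = normal_pdf(x, mu=1.0)     # same shape, peak moved to x = 1
wider = normal_pdf(x, sigma=2.0)    # larger sigma: lower, more spread-out curve

# Cross-check the hand-written formula against scipy's implementation
assert np.allclose(standard, stats.norm.pdf(x))
assert np.allclose(shifted, stats.norm.pdf(x, loc=1.0))
assert np.allclose(wider, stats.norm.pdf(x, scale=2.0))

print(standard.max(), wider.max())
```

Changing `mu` only moves the peak, while doubling `sigma` halves the height of the peak and stretches the curve.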

Two ways to check for normality are plotting the data and running a K-S (Kolmogorov-Smirnov) test.

Plotting:

# Import modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# Construct a set of random data: 1000 standard normal samples shifted to mean 10
s = pd.DataFrame(np.random.randn(1000) + 10, columns=['value'])

# Draw a scatter plot and a histogram
fig = plt.figure(figsize=(10, 6))
ax1 = fig.add_subplot(2, 1, 1)  # Create subplot 1
ax1.scatter(s.index, s.values)
plt.grid()

ax2 = fig.add_subplot(2, 1, 2)  # Create subplot 2
s.hist(bins=30, alpha=0.5, ax=ax2)
s.plot(kind='kde', secondary_y=True, ax=ax2)
plt.grid()

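The result is a figure with two panels: a scatter plot of the values on top, and a histogram with the kernel density estimate overlaid below.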

Using the K-S test:

# Import the scipy stats module
from scipy import stats

"""
kstest performs the K-S test. Its arguments are the data to be tested, the reference distribution (here 'norm', the normal distribution), and that distribution's parameters (mean and standard deviation).
It returns two values: statistic (the D value) and pvalue (the p-value).
H0: the sample follows the normal distribution
H1: the sample does not follow the normal distribution
If p > 0.05, H0 is not rejected and the data are treated as consistent with a normal distribution; otherwise H0 is rejected.
"""
u = s['value'].mean()   # Calculate the mean
std = s['value'].std()  # Calculate the standard deviation
stats.kstest(s['value'], 'norm', (u, std))

The result is KstestResult(statistic=0.01441344628501079, pvalue=0.9855029319675546). The p-value is greater than 0.05, so H0 is not rejected and the data are consistent with a normal distribution.
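To show how the decision rule behaves on data that is not normal, here is a minimal sketch that wraps the test in a helper and applies it to two samples. The helper name `ks_normality`, the fixed seed, and the exponential sample used for contrast are assumptions made for illustration; only `stats.kstest` and the 0.05 threshold come from the article.

```python
import numpy as np
from scipy import stats

def ks_normality(values, alpha=0.05):
    """K-S test of `values` against a normal distribution fitted to them.

    Returns (statistic, pvalue, consistent); `consistent` is True when
    pvalue > alpha, i.e. H0 (normality) is not rejected.
    (Hypothetical helper, not part of scipy.)
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)  # sample standard deviation, matching pandas' .std()
    result = stats.kstest(values, 'norm', args=(mean, std))
    return result.statistic, result.pvalue, bool(result.pvalue > alpha)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
normal_sample = rng.normal(loc=10, scale=1, size=1000)
skewed_sample = rng.exponential(scale=1.0, size=1000)  # clearly non-normal

for name, sample in [("normal", normal_sample), ("exponential", skewed_sample)]:
    d, p, ok = ks_normality(sample)
    print(f"{name}: D={d:.4f}, p={p:.4g}, consistent with normal: {ok}")
```

The exponential sample should yield a large D and a p-value far below 0.05, so H0 is rejected for it. Note that because the mean and standard deviation are estimated from the same data being tested, the p-value tends to come out larger than it would against a fully specified distribution, so a borderline result deserves a second look.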

That covers how to determine in Python whether a set of data follows a normal distribution.