Python Pandas Aggregate Functions
The previous section covered window functions. A window function is usually combined with an aggregate function, that is, a function that reduces a set of values to a single result such as the sum, maximum, minimum, or mean. This section shows how to apply aggregate functions to a rolling window.
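As a quick illustration of what an aggregate function does, the sketch below reduces each column of a small DataFrame to its sum, maximum, minimum, and mean. The data and the names df_demo and summary are made up for this example, and no window is involved yet.

```python
import pandas as pd

# Hypothetical toy data, chosen so the results are easy to check by hand
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                        "B": [10.0, 20.0, 30.0, 40.0]})

# Each function collapses a whole column into one value;
# the result has one row per function and one column per input column
summary = df_demo.agg(["sum", "max", "min", "mean"])
print(summary)
```

The rest of this section applies the same idea to a rolling window, where the function runs once per window position instead of once per column.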
Applying Aggregate Functions
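First, let's create a DataFrame of random values and a rolling window object over it. The aggregate functions are then applied to this window object. Because the data is random, the numbers you get will differ from the output shown here.
import pandas as pd
import numpy as np

# 5 rows x 4 columns of random values, indexed by consecutive dates
df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
print(df)
# Window size is 3; min_periods (the minimum number of observations) is 1
r = df.rolling(window=3, min_periods=1)
print(r)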
Output results:
                   A         B         C         D
2020-12-14  0.941621  1.205489  0.473771 -0.348169
2020-12-15 -0.276954  0.076387  0.104194  1.537357
2020-12-16  0.582515  0.481999 -0.652332 -1.893678
2020-12-17 -0.286432  0.923514  0.285255 -0.739378
2020-12-18  2.063422 -0.465873 -0.946809  1.590234
Rolling [window=3,min_periods=1,center=False,axis=0]
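The min_periods argument decides how many observations a window needs before it produces a value. The sketch below uses made-up values (1 to 5) so the effect is easy to follow; the names s, loose, and strict are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical toy data: the values 1..5 on five consecutive days
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
              index=pd.date_range("2020-12-14", periods=5))

# min_periods=1: a window holding fewer than 3 rows still yields a result
loose = s.rolling(window=3, min_periods=1).sum()

# No min_periods: for an integer window it defaults to the window size,
# so the first two rows (incomplete windows) are NaN
strict = s.rolling(window=3).sum()

print(loose.tolist())   # [1.0, 3.0, 6.0, 9.0, 12.0]
print(strict.tolist())  # first two entries are nan, then 6.0, 9.0, 12.0
```

This is why the examples in this section set min_periods=1: every row gets a result, including the first two.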
1) Aggregate the whole DataFrame
You can pass an aggregate function to the rolling object to aggregate every column of the DataFrame, as shown in the following example:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
print(df)
# Window size is 3; min_periods (the minimum number of observations) is 1
r = df.rolling(window=3, min_periods=1)
# Use aggregate() to compute the rolling sum of every column
print(r.aggregate('sum'))
Output results:
                   A         B         C         D
2020-12-14  0.133713  0.746781  0.499385  0.589799
2020-12-15 -0.777572  0.531269  0.600577 -0.393623
2020-12-16  0.408115 -0.874079  0.584320  0.507580
2020-12-17 -1.033055 -1.185399 -0.546567  2.094643
2020-12-18  0.469394 -1.110549 -0.856245  0.260827
                   A         B         C         D
2020-12-14  0.133713  0.746781  0.499385  0.589799
2020-12-15 -0.643859  1.278050  1.099962  0.196176
2020-12-16 -0.235744  0.403971  1.684281  0.703756
2020-12-17 -1.402513 -1.528209  0.638330  2.208601
2020-12-18 -0.155546 -3.170027 -0.818492  2.863051
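With random data it is hard to see what the window is doing, so here is a small sketch with fixed, made-up values. It also checks that aggregate('sum') gives the same result as calling sum() directly on the rolling object. The names df_demo, via_aggregate, and via_method are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical toy data so each rolling sum can be verified by hand
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                        "B": [10.0, 20.0, 30.0, 40.0]},
                       index=pd.date_range("2020-12-14", periods=4))

r_demo = df_demo.rolling(window=3, min_periods=1)

via_aggregate = r_demo.aggregate("sum")  # function passed by name
via_method = r_demo.sum()                # dedicated rolling method

print(via_aggregate)
# A: 1, 3, 6, 9      (1, 1+2, 1+2+3, 2+3+4)
# B: 10, 30, 60, 90
print(via_aggregate.equals(via_method))
```

Each row is the sum of the current row and up to two rows before it, which matches the pattern in the output above: the first row is unchanged and the second row is the sum of the first two.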
2) Aggregate a single column
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
# Window size is 3; min_periods (the minimum number of observations) is 1
r = df.rolling(window=3, min_periods=1)
# Aggregate column A only
print(r['A'].aggregate('sum'))
Output results:
2020-12-14 1.051501
2020-12-15 1.354574
2020-12-16 0.896335
2020-12-17 0.508470
2020-12-18 2.333732
Freq: D, Name: A, dtype: float64
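Selecting one column from the rolling object returns a Series rather than a DataFrame, as the output above suggests. The sketch below, on made-up data, shows that selecting the column before aggregating gives the same values as aggregating everything and picking the column afterwards. The names df_demo, col_first, and col_after are illustrative.

```python
import pandas as pd

# Hypothetical toy data
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                        "B": [10.0, 20.0, 30.0, 40.0]},
                       index=pd.date_range("2020-12-14", periods=4))
r_demo = df_demo.rolling(window=3, min_periods=1)

col_first = r_demo["A"].aggregate("sum")   # select, then aggregate
col_after = r_demo.aggregate("sum")["A"]   # aggregate, then select

print(type(col_first).__name__)  # Series
print(col_first.equals(col_after))
```

Selecting first avoids computing results for columns you do not need.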
3) Aggregate multiple columns
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
# Window size is 3; min_periods (the minimum number of observations) is 1
r = df.rolling(window=3, min_periods=1)
# Aggregate columns A and B together (pass the labels as a list)
print(r[['A', 'B']].aggregate('sum'))
Output results:
                   A         B
2020-12-14  0.639867 -0.229990
2020-12-15  0.352028  0.257918
2020-12-16  0.637845  2.643628
2020-12-17  0.432715  2.428604
2020-12-18 -1.575766  0.969600
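When several columns are selected with a list of labels, the result is a DataFrame that contains only those columns. A minimal sketch on made-up data, this time using the rolling mean instead of the sum; the names df_demo and subset are illustrative.

```python
import pandas as pd

# Hypothetical toy data with three columns; only A and B are aggregated
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                        "B": [10.0, 20.0, 30.0, 40.0],
                        "C": [0.0, 0.0, 0.0, 0.0]},
                       index=pd.date_range("2020-12-14", periods=4))
r_demo = df_demo.rolling(window=3, min_periods=1)

# A list of labels selects several columns at once
subset = r_demo[["A", "B"]].aggregate("mean")

print(subset)
# A: 1.0, 1.5, 2.0, 3.0
# B: 10.0, 15.0, 20.0, 30.0
```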
4) Apply multiple functions to a single column
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
# Window size is 3; min_periods (the minimum number of observations) is 1
r = df.rolling(window=3, min_periods=1)
# Apply both sum and mean to column A
print(r['A'].aggregate(['sum', 'mean']))
Output results:
                 sum      mean
2020-12-14 -0.469643 -0.469643
2020-12-15 -0.626856 -0.313428
2020-12-16 -1.820226 -0.606742
2020-12-17 -2.007323 -0.669108
2020-12-18 -0.595736 -0.198579
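Passing a list of functions produces one output column per function. The sketch below repeats the idea on made-up values so the numbers can be checked by hand; the names s and stats are illustrative.

```python
import pandas as pd

# Hypothetical toy data
s = pd.Series([1.0, 2.0, 3.0, 4.0],
              index=pd.date_range("2020-12-14", periods=4),
              name="A")

# One result column per function name in the list
stats = s.rolling(window=3, min_periods=1).aggregate(["sum", "mean"])

print(stats)
# sum:  1.0, 3.0, 6.0, 9.0
# mean: 1.0, 1.5, 2.0, 3.0
```

In the first row the window holds a single observation, so the sum and the mean are equal. The same pattern is visible in the first row of the output above.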
5) Apply multiple functions to multiple columns
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 4),
                  index=pd.date_range('12/14/2020', periods=5),
                  columns=['A', 'B', 'C', 'D'])
r = df.rolling(window=3, min_periods=1)
# Apply both sum and mean to columns A and B
print(r[['A', 'B']].aggregate(['sum', 'mean']))
Output results:
                   A                   B
                 sum      mean       sum      mean
2020-12-14 -1.428882 -1.428882 -0.417241 -0.417241
2020-12-15 -1.315151 -0.657576 -1.580616 -0.790308
2020-12-16 -2.093907 -0.697969 -2.260181 -0.753394
2020-12-17 -1.324490 -0.441497 -1.578467 -0.526156
2020-12-18 -2.400948 -0.800316 -0.452740 -0.150913
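The result above has two header rows because its columns form a MultiIndex: the first level is the source column and the second level is the function. The sketch below, on made-up data, shows how one of these columns can be picked out with a (column, function) tuple. The names df_demo, both, and b_mean are illustrative.

```python
import pandas as pd

# Hypothetical toy data
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                        "B": [10.0, 20.0, 30.0, 40.0]},
                       index=pd.date_range("2020-12-14", periods=4))
r_demo = df_demo.rolling(window=3, min_periods=1)

both = r_demo[["A", "B"]].aggregate(["sum", "mean"])

# Columns are (source column, function) pairs
print(both.columns.tolist())

# Select a single result with a tuple, or a whole source column by its label
b_mean = both[("B", "mean")]
a_only = both["A"]  # DataFrame with columns 'sum' and 'mean'

print(b_mean.tolist())  # 10.0, 15.0, 20.0, 30.0
```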
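6) Apply different functions to different columns
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(3, 4),
                  index=pd.date_range('12/14/2020', periods=3),
                  columns=['A', 'B', 'C', 'D'])
r = df.rolling(window=3, min_periods=1)
# Map each column to its own function: sum for A, mean for B
print(r.aggregate({'A': 'sum', 'B': 'mean'}))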
Output results:
                   A         B
2020-12-14  0.503535 -1.301423
2020-12-15  0.170056 -0.550289
2020-12-16 -0.086081 -0.140532
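When a dictionary is passed, each key names a column and each value names the function for that column; columns that are not listed are left out of the result. A minimal sketch on made-up data, assuming sum for A and mean for B; the names df_demo and mixed are illustrative.

```python
import pandas as pd

# Hypothetical toy data; column C is not mentioned in the mapping
df_demo = pd.DataFrame({"A": [1.0, 2.0, 3.0],
                        "B": [10.0, 20.0, 30.0],
                        "C": [7.0, 8.0, 9.0]},
                       index=pd.date_range("2020-12-14", periods=3))
r_demo = df_demo.rolling(window=3, min_periods=1)

# Rolling sum for A, rolling mean for B
mixed = r_demo.aggregate({"A": "sum", "B": "mean"})

print(mixed)
# A: 1.0, 3.0, 6.0
# B: 10.0, 15.0, 20.0
```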
Summary
This article introduced aggregate functions in Python pandas: aggregating a whole rolling window, aggregating one or several columns, applying several functions at once, and applying different functions to different columns.