Pandas is an important tool for data analysis in Python and provides a rich set of data manipulation methods. Data analysis often requires grouping and aggregating data. This article introduces the grouping methods and aggregation operations in Pandas and illustrates them with code samples.
Complete Excel data
Read data and group it simply
First, we read the Excel file with Pandas, group it by a single column, and apply an aggregation function. The sample code is as follows:
import pandas as pd

df1 = pd.read_excel('C:\\Users\\liuchunlin2\\Desktop\\data')
df = df1.groupby('Store Name', as_index=False).sum()
print(df)
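Since the Excel file itself is not available here, the same single-column grouping can be tried on a small stand-in table. This is a minimal sketch: the `sales` DataFrame and its values are made up, and only the column names follow the article. Selecting the numeric columns explicitly keeps identifier columns such as 'Order number' from being summed.

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are invented.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Order number": [1001, 1002, 1003, 1003, 1004],
    "Number of sales": [2, 5, 1, 4, 3],
    "Sales amount": [100.0, 250.0, 40.0, 160.0, 90.0],
})

# Group by one column and sum only the columns that are meaningful to add up.
totals = (
    sales.groupby("Store Name", as_index=False)[["Number of sales", "Sales amount"]]
    .sum()
)
print(totals)
```

With `as_index=False`, 'Store Name' stays an ordinary column instead of becoming the index of the result.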
Application of multi-column grouping and aggregation functions
Next, we show how to group by multiple columns and apply an aggregation function:
df2 = df1.groupby(['Store Name', 'Order number'], as_index=False).sum()
print(df2)
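Grouping by a list of columns produces one output row per distinct combination of the keys. A minimal sketch, again assuming a made-up `sales` table that only borrows the article's column names:

```python
import pandas as pd

# Hypothetical sample data; order 1003 appears on two rows for store B.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Order number": [1001, 1002, 1003, 1003, 1004],
    "Number of sales": [2, 5, 1, 4, 3],
    "Sales amount": [100.0, 250.0, 40.0, 160.0, 90.0],
})

# One row per (store, order) pair; the two rows of order 1003 are merged.
per_order = (
    sales.groupby(["Store Name", "Order number"], as_index=False)
    [["Number of sales", "Sales amount"]]
    .sum()
)
print(per_order)
```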
Application of Custom Aggregate Functions
In this example, we define a custom aggregation function, custom_agg, and apply it to a grouped aggregation:
def custom_agg(x):
    # Assumption: the original expression was lost in extraction; max - min is shown as an example
    return x.max() - x.min()

result = df1.groupby('Store Name', as_index=False)['Number of sales'].agg(custom_agg)
print(result)
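Any callable that receives a group's values as a Series and returns a single scalar can be passed to `agg`. The sketch below uses a different, hypothetical aggregator (`peak_share`, the share of a store's sales that came from its largest row) on made-up data, simply to show the shape of such a function:

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are invented.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Number of sales": [2, 5, 1, 4, 3],
})

def peak_share(x):
    """Fraction of the group's total contributed by its largest value."""
    return x.max() / x.sum()

# agg calls peak_share once per store and collects one scalar per group.
result = sales.groupby("Store Name")["Number of sales"].agg(peak_share)
print(result)
```

Built-in names such as 'sum' or 'mean' are usually faster than a Python function, so a custom aggregator is mainly worthwhile when no built-in covers the calculation.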
Applying multiple aggregation functions at the same time
We can also apply multiple aggregate functions at the same time, as shown in the following example:
df3 = df1.groupby('Store Name', as_index=False).agg({'Number of sales': 'sum', 'Sales amount': 'mean'})
print(df3)
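The dictionary form keeps the original column names in the result. When the output columns should say which statistic they hold, pandas also accepts named aggregation, where each keyword is an output column and each value is a (column, function) pair. A minimal sketch on made-up data; the output names `total_sold` and `avg_amount` are illustrative choices:

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are invented.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Number of sales": [2, 5, 1, 4, 3],
    "Sales amount": [100.0, 250.0, 40.0, 160.0, 90.0],
})

# Named aggregation: output_name=(input_column, aggregation)
summary = sales.groupby("Store Name", as_index=False).agg(
    total_sold=("Number of sales", "sum"),
    avg_amount=("Sales amount", "mean"),
)
print(summary)
```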
Iterative grouping
Pandas also supports iterating over groups. The following example prints each group key together with the rows that belong to it:
for group, data in df1.groupby('Store Name'):
    print(group)  # Key value of the group
    print(data)   # All rows belonging to the group
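Each iteration yields the group key and a DataFrame holding that group's rows, so ordinary Python can be used inside the loop. A minimal sketch on made-up data that counts the rows per store and then fetches a single group directly with `get_group`:

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are invented.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Number of sales": [2, 5, 1, 4, 3],
})

grouped = sales.groupby("Store Name")

# Iterate over (key, sub-DataFrame) pairs and record the size of each group.
row_counts = {}
for store, rows in grouped:
    row_counts[store] = len(rows)
print(row_counts)

# get_group returns the rows of one group without looping over all of them.
store_b = grouped.get_group("B")
print(store_b)
```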
Conditional filtering
Filter groups based on conditions:
df4 = df1.groupby('Store Name').filter(lambda x: x['Sales amount'].sum() > 300)
print(df4)
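Note that `filter` keeps or drops whole groups and returns the original rows, not an aggregated table. A minimal sketch on made-up data, using the same threshold of 300; it also shows an equivalent boolean mask built with `transform`, which is a common alternative on larger tables:

```python
import pandas as pd

# Hypothetical sample data; only store A has a total sales amount above 300.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Sales amount": [100.0, 250.0, 40.0, 160.0, 90.0],
})

# The lambda is evaluated once per group and must return True or False.
kept = sales.groupby("Store Name").filter(lambda g: g["Sales amount"].sum() > 300)
print(kept)

# Same selection via a per-row mask: broadcast each store's total, then compare.
store_total = sales.groupby("Store Name")["Sales amount"].transform("sum")
kept_mask = sales[store_total > 300]
print(kept_mask)
```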
Converting Groups and Sorting Groups
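Finally, we demonstrate transforming grouped data and sorting groups:
# Assumption: the lambda body was lost in extraction; x.sum() is used to match the transform('sum') variant in the full code
df1['NewColumn'] = df1.groupby('Store Name')['Number of sales'].transform(lambda x: x.sum())
print(df1)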
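Unlike `agg`, `transform` returns a result with the same length as the input, so a group-level value can be placed next to every row. A minimal sketch on made-up data that broadcasts each store's total and derives each row's share of it; the column names `Store total` and `Share of store` are illustrative:

```python
import pandas as pd

# Hypothetical sample data; column names follow the article, values are invented.
sales = pd.DataFrame({
    "Store Name": ["A", "A", "B", "B", "C"],
    "Number of sales": [2, 5, 1, 4, 3],
})

# One value per original row: the total of the store that row belongs to.
store_total = sales.groupby("Store Name")["Number of sales"].transform("sum")

with_share = sales.assign(**{
    "Store total": store_total,
    "Share of store": sales["Number of sales"] / store_total,
})
print(with_share)
```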
Sorting groups
df5 = df1.groupby('Store Name').sum().sort_values('Number of sales', ascending=True)
print(df5)
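Sorting happens after aggregation: the grouped totals form an ordinary Series or DataFrame, and `sort_values` orders it. A minimal sketch on made-up data, chosen so that the sorted order differs from the alphabetical order of the store names:

```python
import pandas as pd

# Hypothetical sample data; totals are A=3, B=9, C=6.
sales = pd.DataFrame({
    "Store Name": ["A", "B", "A", "B", "C"],
    "Number of sales": [2, 4, 1, 5, 6],
})

totals = sales.groupby("Store Name")["Number of sales"].sum()

# ascending=True puts the smallest total first; ascending=False the largest.
lowest_first = totals.sort_values(ascending=True)
highest_first = totals.sort_values(ascending=False)
print(lowest_first)
print(highest_first)
```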
The above is a detailed introduction to grouping and aggregation in Pandas. The sample code and explanations should give the reader a deeper understanding of these operations.
Summary: Grouping and aggregation are common and important operations in data analysis. Pandas provides a rich set of features for them, including single-column grouping, multi-column grouping, custom aggregation functions, iterating over groups, data export, conditional filtering, group transformation, and group sorting, which together cover most data analysis needs.
Full Code
import pandas as pd

# Read the Excel file
df1 = pd.read_excel('C:\\Users\\liuchunlin2\\Desktop\\data')

# Group by a single column and apply an aggregation function
df = df1.groupby('Store Name', as_index=False).sum()
# df = df1.groupby('Store Name', as_index=False).aggregate({'Number of sales': 'sum'})
print(df)

# Group by multiple columns and apply an aggregation function
df2 = df1.groupby(['Store Name', 'Order number'], as_index=False).sum()
print(df2)

# Define a custom aggregation function
def custom_agg(x):
    # Assumption: the original expression was lost in extraction; max - min is shown as an example
    return x.max() - x.min()

# Aggregate 'Number of sales' with the custom aggregation function
result = df1.groupby('Store Name', as_index=False)['Number of sales'].agg(custom_agg)
print(result)

# Apply multiple aggregation functions at the same time
df3 = df1.groupby('Store Name', as_index=False).agg({'Number of sales': 'sum', 'Sales amount': 'mean'})
print(df3)

# Iterate over the groups
for group, data in df1.groupby('Store Name'):
    print(group)  # Key value of the group
    print(data)   # All rows belonging to the group

# Export the aggregated result (the output path is empty in the source)
df3.to_excel('', index=False)
print('This is a data segmentation line')

# Filter groups based on a condition
df4 = df1.groupby('Store Name').filter(lambda x: x['Sales amount'].sum() > 300)
print(df4)

# Transform 'Number of sales' within each group
# Assumption: the lambda body was lost in extraction; x.sum() matches the transform('sum') variant below
df1['NewColumn'] = df1.groupby('Store Name')['Number of sales'].transform(lambda x: x.sum())
# df = df1.groupby('Store Name', as_index=False)['Number of sales'].transform('sum')
print(df1)

# Sort groups: ascending=True sorts in ascending order, ascending=False in descending order
df5 = df1.groupby('Store Name').sum().sort_values('Number of sales', ascending=True)
print(df5)