SoFunction
Updated on 2024-11-18

Grouping and Aggregation Operations in Python Pandas Explained

Pandas is an important tool for data analysis in Python, and it provides a rich set of data manipulation methods. In the process of data analysis, you often need to group and aggregate data. In this article, we will introduce the data grouping methods and different aggregation operations in Pandas, and illustrate them with code samples.

Complete Excel data

Reading data and simple grouping

First, we read the Excel file with Pandas, group the data by a single column, and apply an aggregation function. The sample code is as follows:

import pandas as pd

df1 = pd.read_excel('C:\\Users\\liuchunlin2\\Desktop\\data')
df = df1.groupby('Store Name', as_index=False).sum()
print(df)
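
To try this without the Excel file, here is a minimal sketch that builds a small in-memory table. The store names and figures are made up for illustration; only the column names follow the article. It also shows what as_index=False changes: the grouping column stays an ordinary column instead of becoming the index.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'B', 'B', 'C'],
    'Number of sales': [2, 3, 1, 4, 5],
    'Sales amount': [100.0, 150.0, 80.0, 300.0, 250.0],
})

# as_index=False keeps 'Store Name' as a regular column
totals = sales.groupby('Store Name', as_index=False).sum()

# With the default as_index=True, 'Store Name' becomes the index
indexed = sales.groupby('Store Name').sum()

print(totals)
print(indexed)
```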

Application of multi-column grouping and aggregation functions

Next, we show how to group by multiple columns and apply an aggregation function:

df2 = df1.groupby(['Store Name', 'Order number'], as_index=False).sum()
print(df2)
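
As an illustration, the sketch below uses made-up data in which one order spans several rows. The order numbers and values are assumptions, not taken from the original spreadsheet. Grouping by two columns produces one output row per distinct (store, order) pair.

```python
import pandas as pd

# Hypothetical sample data; one order can span several rows
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'A', 'B', 'B'],
    'Order number': ['N001', 'N001', 'N002', 'N003', 'N003'],
    'Number of sales': [1, 2, 3, 4, 5],
    'Sales amount': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# One output row per distinct (store, order) pair
per_order = sales.groupby(['Store Name', 'Order number'], as_index=False).sum()
print(per_order)
```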

Application of Custom Aggregate Functions

In this example, we define a custom aggregation function, custom_agg, and apply it in a grouped aggregation:

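def custom_agg(x):
    # The original expression was garbled; max minus min (the range) is an assumed reconstruction
    return x.max() - x.min()

result = df1.groupby('Store Name', as_index=False)['Number of sales'].agg(custom_agg)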
print(result)
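
To make the mechanics concrete, here is a small sketch with made-up data. The function name value_range and the choice of max minus min are assumptions for illustration. The custom function receives one group's values as a Series and must return a single value; the same result can often be built from built-in reductions, which are usually faster.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'B', 'B', 'B'],
    'Number of sales': [2, 7, 4, 4, 10],
})

def value_range(x):
    # x is a Series holding one group's 'Number of sales' values
    return x.max() - x.min()

grouped = sales.groupby('Store Name')['Number of sales']
custom = grouped.agg(value_range)

# Same result from built-in reductions
builtin = grouped.max() - grouped.min()
print(custom)
```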

Applying multiple aggregation functions at the same time

We can also apply multiple aggregate functions at the same time, as shown in the following example:

df3 = df1.groupby('Store Name', as_index=False).agg({'Number of sales': 'sum', 'Sales amount': 'mean'})
print(df3)
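
A dictionary keeps the original column names in the output. If you would rather name the result columns yourself, pandas also offers named aggregation. The sketch below uses made-up data, and the output names total_units and mean_amount are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'B', 'B'],
    'Number of sales': [2, 3, 1, 4],
    'Sales amount': [100.0, 150.0, 80.0, 300.0],
})

# Named aggregation: output_column=(input_column, function)
summary = sales.groupby('Store Name', as_index=False).agg(
    total_units=('Number of sales', 'sum'),
    mean_amount=('Sales amount', 'mean'),
)
print(summary)
```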

Iterating over groups

A GroupBy object can be iterated over directly. Each iteration yields the group key and the rows that belong to that group, as the following example shows:

for group, data in df1.groupby('Store Name'):
    print(group)  # Key values for the grouping
    print(data)  # All data belonging to the group
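
The following sketch, again on made-up data, collects the number of rows per group during iteration and then uses get_group to fetch a single group without looping. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'B', 'A', 'B', 'B'],
    'Number of sales': [2, 1, 3, 4, 5],
})

grouped = sales.groupby('Store Name')

# Each iteration yields the group key and the sub-DataFrame for that key
row_counts = {}
for name, rows in grouped:
    row_counts[name] = len(rows)

# get_group fetches a single group without looping
store_a = grouped.get_group('A')
print(row_counts)
print(store_a)
```

Note that each sub-DataFrame keeps the original row index, so the rows for store A still carry their positions from the full table.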

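Conditional filtering

filter keeps only the groups for which the given function returns True. Here, only stores whose total sales amount exceeds 300 are kept:

df4 = df1.groupby('Store Name').filter(lambda x: x['Sales amount'].sum() > 300)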
print(df4)
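
A point that is easy to miss is that filter returns the original rows of the surviving groups, not one row per group. The sketch below shows this on made-up data and compares it with an equivalent boolean mask built with transform; treat it as one possible way to express the same condition.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'B', 'A', 'B', 'C'],
    'Sales amount': [100.0, 80.0, 150.0, 300.0, 250.0],
})

# Store totals: A = 250, B = 380, C = 250, so only B passes the test
kept = sales.groupby('Store Name').filter(lambda g: g['Sales amount'].sum() > 300)

# Equivalent boolean mask built with transform
store_total = sales.groupby('Store Name')['Sales amount'].transform('sum')
masked = sales[store_total > 300]
print(kept)
```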

Transforming and sorting groups

Finally, we show how to transform grouped data and how to sort the grouped results:

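# The lambda body was lost in the source; x.sum() is assumed here, matching the 'sum' variant in the full code
df1['NewColumn'] = df1.groupby('Store Name')['Number of sales'].transform(lambda x: x.sum())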
print(df1)
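
Unlike an aggregation, transform returns one value per original row, so the result can be assigned straight back as a new column. As a sketch on made-up data, the example below adds each store's total to every row and then derives a hypothetical Share column from it.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'B', 'B'],
    'Number of sales': [2, 3, 1, 4],
})

# transform returns one value per original row, aligned on the index
sales['Store total'] = sales.groupby('Store Name')['Number of sales'].transform('sum')

# Hypothetical derived column: each row's share of its store's total
sales['Share'] = sales['Number of sales'] / sales['Store total']
print(sales)
```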

Sorting groups

df5 = df1.groupby('Store Name').sum().sort_values('Number of sales', ascending=True)
print(df5)
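
As a quick check of the sort direction, this sketch on made-up data sorts the per-store totals both ways. The variable names are illustrative.

```python
import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    'Store Name': ['A', 'B', 'C', 'A', 'B'],
    'Number of sales': [2, 4, 2, 3, 5],
})

totals = sales.groupby('Store Name').sum()

# Totals: A = 5, B = 9, C = 2
ascending = totals.sort_values('Number of sales', ascending=True)
descending = totals.sort_values('Number of sales', ascending=False)
print(ascending)
print(descending)
```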

That covers the main grouping and aggregation operations in Pandas. With these code samples and explanations, readers should have a clearer understanding of how grouping and aggregation work in Pandas.

Summary: Grouping and aggregation are common and important operations in data analysis. Pandas provides a wide range of features for them, including single-column grouping, multi-column grouping, custom aggregation functions, iterating over groups, data export, conditional filtering, group transformation, and group sorting, which together cover most data analysis needs.

Full Code

import pandas as pd
import numpy as np

# Read the Excel file
df1 = pd.read_excel('C:\\Users\\liuchunlin2\\Desktop\\data')

# Group by a single column and apply an aggregation function
df = df1.groupby('Store Name', as_index=False).sum()
# df = df1.groupby('Store Name', as_index=False).aggregate({'Number of sales': 'sum'})
print(df)

# Group by multiple columns and apply an aggregation function
df2 = df1.groupby(['Store Name', 'Order number'], as_index=False).sum()
print(df2)

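# Define a custom aggregation function
def custom_agg(x):
    # The original expression was garbled; max minus min (the range) is an assumed reconstruction
    return x.max() - x.min()

# Aggregate 'Number of sales' with the custom aggregation function
result = df1.groupby('Store Name', as_index=False)['Number of sales'].agg(custom_agg)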
print(result)

# Apply multiple aggregation functions at the same time
df3 = df1.groupby('Store Name', as_index=False).agg({'Number of sales': 'sum', 'Sales amount': 'mean'})
print(df3)

# Iterate over the groups
for group, data in df1.groupby('Store Name'):
    print(group)  # Key values for the grouping
    print(data)  # All data belonging to the group

df3.to_excel('', index=False)
print('This is a separator line')

# Filter groups based on a condition
df4 = df1.groupby('Store Name').filter(lambda x: x['Sales amount'].sum() > 300)
print(df4)

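# Transform groups
# The lambda body was lost in the source; x.sum() is assumed here, matching the 'sum' variant below
df1['NewColumn'] = df1.groupby('Store Name')['Number of sales'].transform(lambda x: x.sum())  # Transform 'Number of sales' within each group
# df = df1.groupby('Store Name', as_index=False)['Number of sales'].transform('sum')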
print(df1)

# Sort the grouped results
df5 = df1.groupby('Store Name').sum().sort_values('Number of sales', ascending=True)  # ascending=True sorts ascending; ascending=False sorts descending
print(df5)
