SoFunction
Updated on 2025-05-04

Project Practice for Implementing Pivot Tables in Excel with Pandas

introduction

In data analysis, pivot tables are a very powerful tool that helps us quickly summarize, analyze and visualize large amounts of data. While Excel provides built-in pivot table functionality, the Pandas library in Python provides a flexible and powerful alternative for scenarios where more complex operations or automation is required. This article will use specific examples to show how to use Pandas to implement pivot table functions similar to those in Excel.

Preparation

First make sure that the Pandas library is installed in your environment. If it has not been installed, you can quickly install it through the pip command:

pip install pandas openpyxl

Next, we will create a simulated real sales data form filesales_data.xlsx, and use Pandas to read the file, and then generate a pivot table.

Create simulated sales data

Assume ourssales_data.xlsxThe file contains the following data:

date Product Name Sales area Salesperson
2021-01-01 Product A 8000 East China Zhang San
2021-01-02 Product B 12000 North China Li Si
2021-01-03 Product C 5000 South China Wang Wu
2021-01-04 Product D 15000 East China Zhang San
2021-01-05 Product E 7000 Central China Li Si
2021-01-06 Product F 20000 North China Wang Wu
2021-01-07 Product G 9000 East China Zhang San
2021-01-08 Product H 3000 South China Li Si
2021-01-09 Product I 6000 Central China Wang Wu
2021-01-10 Product J 11000 East China Zhang San

Code implementation and output

First, we need to create this simulated data and save it to an Excel file, and then use Pandas to read and generate a pivot table.

import pandas as pd

# Create simulated datadata = {
    'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05',
            '2021-01-06', '2021-01-07', '2021-01-08', '2021-01-09', '2021-01-10'],
    'Product Name': ['Product A', 'Product B', 'Product C', 'Product D', 'Product E', 'Product F', 'Product G', 'Product H', 'Product I', 'Product J'],
    'Sales': [8000, 12000, 5000, 15000, 7000, 20000, 9000, 3000, 6000, 11000],
    'area': ['East China', 'North China', 'South China', 'East China', 'Central China', 'North China', 'East China', 'South China', 'Central China', 'East China'],
    'Salesperson': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhang San', 'Li Si', 'Wang Wu', 'Zhang San', 'Li Si', 'Wang Wu', 'Zhang San']
}

# Save data to Excel filedf = (data)
df.to_excel('sales_data.xlsx', index=False)

# Reload datadf = pd.read_excel('sales_data.xlsx')

# Show the first few lines to check the dataprint("Raw Data:")
print(())

After running the above code, you will see the following output:

Raw data

Date Product Name Sales Area Sales Person
0  2021-01-01    Product A   8000   East China     Zhang San
1  2021-01-02    Product B  12000   North China     Li Si
2  2021-01-03    Product C   5000   South China      Wang Wu
3  2021-01-04    Product D  15000   East China    Zhang San
4  2021-01-05    Product E   7000   Central China     Li Si

Example 1: Aggregate sales by region and salesperson

Suppose we want to aggregate sales by region and salesperson and calculate the total sales for each combination.

# Generate a Pivot Tablepivot_table1 = pd.pivot_table(df, values='Sales', index=['area', 'Salesperson'], aggfunc='sum')

print("\nSummary sales by region and salesperson:")
print(pivot_table1)

Sales by region and salesperson:

Region Salesperson
East China  Zhang San     33000
North China Li Si     12000
Wang Wu   20000
South China Wang Wu       8000
Central China Li Si     13000
Wang Wu       6000
Name: Sales, dtype: int64

Example 2: Aggregate sales by region and display total sales for each region

Suppose we want to aggregate sales by region and show the total sales for each region.

# Generate a Pivot Tablepivot_table2 = pd.pivot_table(df, values='Sales', index=['area'], aggfunc='sum')

print("\nSummary sales by region:")
print(pivot_table2)

Sales by region:

area
North China    32000
East China    33000
South China     8000
Central China    13000
Name: Sales, dtype: int64

Example 3: Aggregate sales by region and salesperson and display average sales for each combination

Suppose we want to aggregate sales by region and salesperson and show the average sales for each combination.

# Generate a Pivot Tablepivot_table3 = pd.pivot_table(df, values='Sales', index=['area', 'Salesperson'], aggfunc='mean')

print("\nSummary sales by region and salesperson (average):")
print(pivot_table3)

Sales by region and salesperson (average):

Region Salesperson
East China  Zhang San    11000.0
North China Li Si    12000.0
Wang Wu    20000.0
South China Wang Wu     5500.0
Central China Li Si     9333.333333
Wang Wu   6000.0
Name: Sales, dtype: float64

Example 4: Aggregate sales by region and salesperson and display the maximum sales for each combination

Suppose we want to aggregate sales by region and salesperson and show the maximum sales for each combination.

# Generate a Pivot Tablepivot_table4 = pd.pivot_table(df, values='Sales', index=['area', 'Salesperson'], aggfunc='max')

print("\nSummary sales by region and salesperson:")
print(pivot_table4)Output result

Sales by region and salesperson (maximum):

Region Salesperson
East China  Zhang San    15000
North China Li Si    12000
Wang Wu     20000
South China Wang Wu     5000
Central China Li Si     7000
Wang Wu   6000
Name: Sales, dtype: int64

in conclusion

Through the above examples, we can see that the Pandas library provides very powerful and flexible functionality in handling pivot tables. Whether it is simple summary or complex multi-condition aggregation, Pandas can handle it easily. With these basic skills mastered, you will be able to manage and analyze your dataset more efficiently, thus gaining valuable insights faster. Hope this article helps you make better use of Pandas in your daily work!

This is the article about the project practice of using Pandas to implement pivot tables in Excel. For more related Pandas Excel pivot table content, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!