Preface
A pivot table is used to reshape and summarize data during data analysis and processing.
What is a pivot table?
Pivot tables are data aggregation tools that rearrange raw data into a user-defined layout so that it is easier to analyze and visualize. Typically, the goal of a pivot table is to aggregate, summarize, and cross-tabulate the data to gain insight into the dataset.
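To make the idea concrete, here is a minimal sketch that uses a small made-up sales table. The column names `region`, `product`, and `amount` are illustrative assumptions, not part of any real dataset. It shows one-row-per-sale data being rearranged into a region-by-product grid of totals.

```python
import pandas as pd

# Hypothetical raw data: one row per sale (long format).
raw = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East"],
    "product": ["A", "B", "A", "B", "A"],
    "amount": [10, 20, 30, 40, 5],
})

# One row per region, one column per product; each cell holds the summed amount.
summary = pd.pivot_table(
    raw, values="amount", index="region", columns="product", aggfunc="sum"
)
print(summary)
```

The two East/A sales collapse into a single cell, which is the aggregation step that distinguishes a pivot table from a plain reshape.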
Steps to use
1. Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
2. Read data
# Read the dataset
data = pd.read_csv('your_dataset.csv')
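Before pivoting, it usually helps to confirm that the data loaded as expected. The sketch below is one way to do that. It assumes a tiny in-memory CSV in place of `your_dataset.csv` so that it runs without a file, and it reuses the placeholder column names from the steps in this article.

```python
import io

import pandas as pd

# Stand-in for 'your_dataset.csv'; the contents are made up for illustration.
csv_text = """row_index_column,column_index_column,value_to_summarize
a,x,1
a,y,2
b,x,3
"""

data = pd.read_csv(io.StringIO(csv_text))

print(data.head())   # first rows, to check that columns were parsed correctly
print(data.dtypes)   # the column to summarize should be numeric
```

If the column you plan to summarize shows up as `object` rather than a numeric type, numeric aggregations such as 'sum' or 'mean' may not behave as intended.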
3. Create a Pivot Table
Use the pandas pivot_table() function to create a pivot table. This function accepts several parameters, including the dataset, the column to be summarized, the row index, the column index, and the summary method.
# Create a pivot table
pivot_table = pd.pivot_table(data, values='value_to_summarize', index='row_index_column', columns='column_index_column', aggfunc='sum')
where:
- values is the column that needs to be summarized.
- index is the row index, which determines the rows of the pivot table.
- columns is the column index, which determines the columns of the pivot table.
- aggfunc is the function used for summarizing, such as 'sum', 'mean', or 'count'. If it is omitted, pandas uses 'mean'.
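The effect of these parameters is easiest to see side by side. The following sketch applies three different `aggfunc` values to the same made-up table; the column names `store`, `item`, and `qty` are hypothetical. It also uses `fill_value`, an additional `pd.pivot_table()` parameter not covered above, to replace the empty cell that appears when a row/column combination has no data.

```python
import pandas as pd

# Hypothetical data: store S never sold ink.
data = pd.DataFrame({
    "store": ["N", "N", "S", "S", "S"],
    "item": ["pen", "ink", "pen", "pen", "pen"],
    "qty": [1, 2, 3, 4, 5],
})

# Same layout, three different summary functions.
by_sum = pd.pivot_table(data, values="qty", index="store", columns="item", aggfunc="sum")
by_mean = pd.pivot_table(data, values="qty", index="store", columns="item", aggfunc="mean")
by_count = pd.pivot_table(data, values="qty", index="store", columns="item", aggfunc="count")

# Without fill_value the (S, ink) cell is NaN; here it becomes 0.
filled = pd.pivot_table(
    data, values="qty", index="store", columns="item", aggfunc="sum", fill_value=0
)

print(by_sum)
print(by_mean)
print(by_count)
print(filled)
```

Whether a missing combination should be shown as 0 or left as NaN depends on the analysis: 0 is natural for totals, while NaN is usually safer for averages.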
4. View the Pivot Table
print(pivot_table)
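The result of `pd.pivot_table()` is an ordinary DataFrame, so it can be inspected with the usual pandas tools as well as with `print()`. The sketch below assumes a small made-up table that reuses the placeholder column names from step 3.

```python
import pandas as pd

# Made-up data using the placeholder column names from this article.
data = pd.DataFrame({
    "row_index_column": ["a", "a", "b", "b"],
    "column_index_column": ["x", "y", "x", "y"],
    "value_to_summarize": [1.0, 2.0, 3.0, 4.0],
})
pivot_table = pd.pivot_table(
    data,
    values="value_to_summarize",
    index="row_index_column",
    columns="column_index_column",
    aggfunc="sum",
)

print(pivot_table)                   # the whole table
print(pivot_table.index.tolist())    # row labels
print(pivot_table.columns.tolist())  # column labels
print(pivot_table.loc["b", "y"])     # a single cell, by row and column label

# Turn the row index back into a regular column, e.g. before exporting.
flat = pivot_table.reset_index()
print(flat)
```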
Sample code
import pandas as pd

# Read the sample dataset
data = pd.read_csv('/datasciencedojo/datasets/master/')

# Create a pivot table
pivot_table = pd.pivot_table(data, values='Fare', index='Pclass', columns='Sex', aggfunc='mean')

# Print the pivot table
print(pivot_table)
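The sample reads its data from a file path, so here is a stand-in that runs offline. The rows are made up for illustration and are not taken from the dataset in the sample. Only the column names `Fare`, `Pclass`, and `Sex` mirror it.

```python
import pandas as pd

# Made-up passenger rows; NOT the real dataset used in the sample.
data = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Sex": ["female", "male", "male", "female", "male", "female", "male", "male"],
    "Fare": [80.0, 50.0, 70.0, 20.0, 14.0, 10.0, 8.0, 6.0],
})

# Average fare for each passenger class (rows) and sex (columns).
pivot_table = pd.pivot_table(
    data, values="Fare", index="Pclass", columns="Sex", aggfunc="mean"
)
print(pivot_table)
```

Each cell is the mean of the `Fare` values that share a class and sex. For example, the two first-class male fares (50.0 and 70.0) average to 60.0.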
Summary
By choosing appropriate row indexes, column indexes, and summary methods, you can quickly generate pivot tables suited to different data analysis needs.