In daily office automation, especially when processing large amounts of data, merging multiple Excel tables is a common and tedious task. Fortunately, with the powerful libraries in the Python language, we can easily automate this process. This article will show you how to use Python to merge multiple Excel tables, saving time and improving productivity.
Why choose Python automation
Python has strong data processing capabilities, especially in data analysis and file operation. With libraries such as pandas and openpyxl, we can read, process and merge Excel files very efficiently. The advantages of using Python automation over manual operations include:
Higher efficiency: large numbers of Excel files are processed in a batch, with no manual steps.
Lower error rate: mistakes caused by human negligence are avoided.
Reusability: once written, the code can be reused to merge different files or tables.
Flexibility: the same script can also clean, filter, and sort the data, or perform other more complex operations.
Goal
Our goal is to merge the data from multiple Excel files into one new Excel file, with all rows appended to a single worksheet. The specific steps are as follows:
Read multiple Excel files: Read data from multiple Excel files into Python.
Merge data: Merge this data into a new DataFrame.
Save the result: Save the merged data to a new Excel file.
Merge multiple Excel files using Python
We will use the pandas and openpyxl libraries to accomplish this. pandas handles reading and processing the data, while openpyxl works with the .xlsx format itself; pandas uses it behind the scenes when reading .xlsx files.
Install the required libraries
First, make sure you have the following Python libraries installed:
pip install pandas openpyxl
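To confirm that the installation worked before going further, a quick check along these lines may help. It is only a sketch: the helper name installed_versions is made up for this example, and it relies on the standard-library importlib.metadata module (Python 3.8 or later).

```python
from importlib import metadata


def installed_versions(packages=("pandas", "openpyxl")):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report what was found for the two libraries this article needs
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

If either line reports "not installed", run the pip command above again in the same Python environment.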
Sample code
Suppose you have multiple Excel files with the following structure:
Each file contains one worksheet, and the worksheets all share the same structure (identical column names).
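If you do not have such files at hand, a few small test workbooks can be generated first. The sketch below is one way to do it; the function name make_sample_files, the file names sample_1.xlsx and so on, and the columns "name" and "quantity" are all invented for illustration and are not part of the original example.

```python
from pathlib import Path

import pandas as pd


def make_sample_files(folder, n_files=3):
    """Write n_files small .xlsx workbooks that all share the same columns."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(1, n_files + 1):
        # Two rows per file; the column names are identical in every file
        df = pd.DataFrame({
            "name": [f"item_{i}_a", f"item_{i}_b"],
            "quantity": [i * 10, i * 10 + 1],
        })
        path = folder / f"sample_{i}.xlsx"
        df.to_excel(path, index=False)  # single worksheet, no index column
        paths.append(path)
    return paths
```

Calling make_sample_files('path_to_your_excel_files') would create three workbooks of two rows each in that folder, which is enough to try out the merge.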
1. Import the libraries
import os
import pandas as pd
2. Read multiple Excel files and merge
We use the os module to iterate through all Excel files in the specified directory and read data through pandas. Merge the data from each file into a large DataFrame.
def merge_excel_files(input_folder, output_file):
    # Get all Excel files in the folder
    all_files = [f for f in os.listdir(input_folder) if f.endswith('.xlsx')]
    # Initialize an empty DataFrame to store merged data
    combined_df = pd.DataFrame()
    # Traverse all files, read and merge one by one
    for file in all_files:
        file_path = os.path.join(input_folder, file)
        print(f"Processing file: {file_path}")
        # Read the Excel file
        df = pd.read_excel(file_path)
        # Merge data
        combined_df = pd.concat([combined_df, df], ignore_index=True)
    # Save the merged data to a new Excel file
    combined_df.to_excel(output_file, index=False)
    print(f"The merge is completed, and the result has been saved to: {output_file}")
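Concatenating inside the loop copies the accumulated data on every pass, which can become slow with many files. A possible variant, shown here only as a sketch, reads every file into a list and calls pd.concat once. The name merge_excel_files_once is hypothetical, and the extra filters (sorting the file names, skipping Excel's "~$" temporary lock files, skipping an earlier output file saved in the same folder) are assumptions about what is usually wanted, not part of the function above.

```python
import os

import pandas as pd


def merge_excel_files_once(input_folder, output_file):
    """Variant of merge_excel_files that concatenates a single time."""
    output_path = os.path.abspath(output_file)
    all_files = sorted(
        f for f in os.listdir(input_folder)
        if f.endswith(".xlsx")
        and not f.startswith("~$")  # temporary lock files of open workbooks
        and os.path.abspath(os.path.join(input_folder, f)) != output_path
    )
    if not all_files:
        raise FileNotFoundError(f"No .xlsx files found in {input_folder}")

    # Read every workbook first, then merge all frames in one call
    frames = [pd.read_excel(os.path.join(input_folder, f)) for f in all_files]
    combined_df = pd.concat(frames, ignore_index=True)

    combined_df.to_excel(output_file, index=False)
    return combined_df
```

Sorting the file names makes the row order of the result predictable, since os.listdir returns entries in arbitrary order.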
3. Call the function and run it
Call the merge_excel_files function above and pass in the folder path and the output file path:
# Specify the input folder path and the output file path
input_folder = 'path_to_your_excel_files'  # Replace with your folder path
output_file = 'merged_output.xlsx'  # Output file path

# Call the merge function
merge_excel_files(input_folder, output_file)
Code description
Get the file list: os.listdir returns every entry in the specified directory, and the list comprehension keeps only the names that end with .xlsx.
Read and merge data: Use pandas.read_excel to read the data of each Excel file and use pd.concat to merge the data into one large DataFrame. ignore_index=True discards the original row indexes and renumbers the merged rows from 0, so that no index value is repeated.
Save the merge result: Finally, save the merged data to a new Excel file, using the to_excel method.
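The effect of ignore_index=True is easiest to see on two tiny DataFrames. The following is a small standalone illustration; the column name and values are invented.

```python
import pandas as pd

first = pd.DataFrame({"name": ["a", "b"]})
second = pd.DataFrame({"name": ["c", "d"]})

# Without ignore_index, each frame keeps its own 0..n-1 row labels
kept = pd.concat([first, second])
# With ignore_index=True, the rows are renumbered continuously
renumbered = pd.concat([first, second], ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```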
Execution results
After executing the above code, you will see the following output:
Processing file: path_to_your_excel_files/
Processing file: path_to_your_excel_files/
Processing file: path_to_your_excel_files/
The merge is completed, and the result has been saved to: merged_output.xlsx
The merged data will be saved to the merged_output.xlsx file.
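As a sanity check, the number of rows in the merged file can be compared with the total number of rows in the source files. This is an optional sketch; the helper name count_rows is hypothetical, and it assumes every source file has a single worksheet with one header row, as described earlier.

```python
import os

import pandas as pd


def count_rows(input_folder, merged_file):
    """Return (total rows in the source files, rows in the merged file)."""
    merged_path = os.path.abspath(merged_file)
    source_rows = 0
    for name in os.listdir(input_folder):
        path = os.path.abspath(os.path.join(input_folder, name))
        # Count only .xlsx sources, never the merged file itself
        if name.endswith(".xlsx") and path != merged_path:
            source_rows += len(pd.read_excel(path))
    merged_rows = len(pd.read_excel(merged_file))
    return source_rows, merged_rows
```

If the two numbers returned for your input folder and merged_output.xlsx differ, some file was probably skipped or read twice.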
Summary
With Python's pandas library, we can easily automate the task of merging multiple Excel files. With just a small amount of code, you can combine the data from many files into a single worksheet in one file, greatly improving work efficiency.
Using Python for office automation will not only reduce repetitive work, but also allow you to focus on more valuable work.