SoFunction
Updated on 2025-04-08

Merging multiple Excel files with Python for office automation

In day-to-day office work, especially when processing large amounts of data, merging multiple Excel workbooks is a common but tedious task. Python's data libraries make this process straightforward to automate. This article shows how to use Python to merge multiple Excel files, saving time and improving productivity.

Why choose Python automation

Python has strong data processing capabilities, especially for data analysis and file handling. With libraries such as pandas and openpyxl, we can read, process and merge Excel files efficiently. Compared with manual work, automating with Python has these advantages:

Higher efficiency: Large numbers of Excel files can be processed in a batch, with no manual steps.

Lower error rate: Mistakes caused by human oversight are avoided.

Reusability: Once written, the code can be reused to merge other files or tables.

Flexibility: The data can also be cleaned, filtered, sorted, or transformed in more complex ways as part of the same script.

Goal

Our goal is to merge data from multiple Excel files into one new Excel file, with all rows appended to a single worksheet. The steps are as follows:

Read multiple Excel files: Read data from multiple Excel files into Python.

Merge data: Merge this data into a new DataFrame.

Save the result: Save the merged data to a new Excel file.

Merge multiple Excel files using Python

We will use the pandas and openpyxl libraries to accomplish this. pandas reads and processes the data, while openpyxl is the engine that pandas relies on to read and write .xlsx files.

Install the required libraries

First, make sure you have the following Python libraries installed:

pip install pandas openpyxl
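
Before running the script, it can help to confirm that both libraries are importable from the interpreter you plan to use. The following is a minimal sketch that uses only the standard library; the helper name missing_packages is hypothetical and is not part of pandas or openpyxl.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two libraries this article depends on
missing = missing_packages(["pandas", "openpyxl"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("pandas and openpyxl are available.")
```

If either package is reported as missing, run the pip command above in the same environment.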

Sample code

Suppose you have multiple Excel files in one folder. Each file has a single worksheet that contains data with the same structure (the column names are the same).

1. Import the library

import pandas as pd 
import os

2. Read multiple Excel files and merge

We use the os module to iterate over all Excel files in the specified directory, read each one with pandas, and merge the data from every file into one large DataFrame.

def merge_excel_files(input_folder, output_file):
    # Get all Excel files in the folder
    all_files = [f for f in os.listdir(input_folder) if f.endswith('.xlsx')]

    # Initialize an empty DataFrame to store the merged data
    combined_df = pd.DataFrame()

    # Traverse all files, reading and merging them one by one
    for file in all_files:
        file_path = os.path.join(input_folder, file)
        print(f"Processing file: {file_path}")

        # Read the Excel file
        df = pd.read_excel(file_path)

        # Append this file's rows to the merged data
        combined_df = pd.concat([combined_df, df], ignore_index=True)

    # Save the merged data to a new Excel file
    combined_df.to_excel(output_file, index=False)
    print(f"The merge is completed, and the result has been saved to: {output_file}")
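
The file-listing step in the function relies on a plain extension check. In practice a folder may also contain the temporary lock files that Excel creates while a workbook is open (their names start with "~$"), or files whose extension is written in upper case. The sketch below shows one way to handle both cases with pathlib; list_excel_files is a hypothetical helper name, and this is a variation on the approach above rather than a required change.

```python
from pathlib import Path


def list_excel_files(input_folder):
    """Return sorted paths of .xlsx files, skipping Excel's '~$' lock files."""
    folder = Path(input_folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".xlsx"      # accept .xlsx and .XLSX alike
        and not p.name.startswith("~$")      # ignore lock files of open workbooks
    )
```

Sorting the paths makes the order of the merged rows reproducible between runs. The returned Path objects can be passed to pd.read_excel directly.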

3. Call the function and run it

Call the merge_excel_files function above and pass in the folder path and the output file path:

# Specify the input folder path and the output file path
input_folder = 'path_to_your_excel_files'  # Replace with your folder path
output_file = 'merged_output.xlsx'         # Output file path

# Call the merge function
merge_excel_files(input_folder, output_file)

Code description

Get the file list: Use os.listdir to list the specified directory and keep only the names that end in .xlsx.

Read and merge data: Use pandas.read_excel to read the data from each Excel file, and use pd.concat to append it to one large DataFrame. ignore_index=True gives the merged DataFrame a fresh, continuous index instead of repeating each file's own row numbers.

Save the merge result: Finally, save the merged data to a new Excel file, using the to_excel method.
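
To make the effect of ignore_index=True concrete, here is a small sketch with two in-memory DataFrames standing in for the contents of two Excel files. The column names and values are made up for illustration.

```python
import pandas as pd

# Two small frames standing in for two Excel files (made-up data)
df_a = pd.DataFrame({"name": ["Alice", "Bob"], "amount": [10, 20]})
df_b = pd.DataFrame({"name": ["Carol"], "amount": [30]})

# Without ignore_index, each frame keeps its own row labels
kept = pd.concat([df_a, df_b])

# With ignore_index=True, the rows are renumbered from 0
renumbered = pd.concat([df_a, df_b], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0]
print(renumbered.index.tolist())  # [0, 1, 2]
```

The duplicated label 0 in the first result is what the description above means by a repeated index.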

Execution results

After executing the above code, you will see the following output:

Processing file: path_to_your_excel_files/
Processing file: path_to_your_excel_files/
Processing file: path_to_your_excel_files/
The merge is completed, and the result has been saved to: merged_output.xlsx

The merged data will be saved to the merged_output.xlsx file.
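
Growing a DataFrame inside the loop creates a new copy of the accumulated data on every iteration. A common alternative is to collect the per-file DataFrames first and call pd.concat once. The sketch below is a variation on the function above, not a replacement for it; combine_frames, merge_excel_files_once and the source_file column are hypothetical names introduced here so that each merged row can be traced back to the file it came from.

```python
import os

import pandas as pd


def combine_frames(frames_by_name):
    """Concatenate DataFrames once, tagging each row with its source name."""
    tagged = [
        df.assign(source_file=name)  # hypothetical extra column for traceability
        for name, df in frames_by_name.items()
    ]
    if not tagged:
        # No input files: return an empty DataFrame instead of failing in concat
        return pd.DataFrame()
    return pd.concat(tagged, ignore_index=True)


def merge_excel_files_once(input_folder, output_file):
    """Read every .xlsx file in input_folder and write one merged workbook."""
    frames = {}
    for file in sorted(os.listdir(input_folder)):
        if file.endswith(".xlsx"):
            frames[file] = pd.read_excel(os.path.join(input_folder, file))
    combine_frames(frames).to_excel(output_file, index=False)
```

Keeping the concatenation in its own function means it can be tried on in-memory DataFrames without touching any files.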

Summary

With Python's pandas library, merging multiple Excel files is easy to automate. A small amount of code is enough to combine the worksheets from many files into a single workbook, which greatly improves work efficiency.

Using Python for office automation will not only reduce repetitive work, but also allow you to focus on more valuable work.
