SoFunction
Updated on 2025-05-15

Complete process of weekly report automation based on Python + Word

1. Overview of technical solutions

An automated reporting solution is based on the following technical components:

  1. Python as the core programming language
  2. The python-docx library is used to handle Word documents
  3. pandas library is used for data processing and analysis
  4. matplotlib or plotly library is used for data visualization
  5. Word templates as the basic format of reports
    The advantage of this solution is that it retains the flexibility of the layout of Word documents and has nothing to do with it, and at the same time, it uses Python's powerful data processing capabilities to achieve automated generation of report content.

2. Environment preparation and dependency installation

You need to configure the Python environment and install the necessary libraries:

# Install the required libraries# It is recommended to install in a virtual environmentpip install python-docx pandas matplotlib plotly openpyxl

python-docx is a Python library for creating and updating Microsoft Word(.docx) files

3. Word template design principles

Designing a good Word template is the basis for automated reporting. The template should consider the following points:

  • Clear structure: including clear structures such as title, abstract, text, icon position, etc.
  • Reserved placeholder: Set specific placeholder marks where dynamic fill is required
  • Consistent format: use uniform font, color, paragraph style
  • Consider scalability: Some parts may need to be dynamically increased or decreased based on data

A typical weekly report template may contain the following sections:

  • Report title and time range
  • Summary of main indicators
  • Detailed data of each business line
  • Description of abnormal situations
  • Data Trend Icons
  • Work plan for next week

Manipulating Word Documents with python-docx

The python-docx library provides rich APIs to manipulate Word documents. Here are some basic operations:

from docx import Document
from  import Inches, Pt, RGBColor
from  import WD_ALIGN_PARAGRAPH

# Create a new Word documentdoc = Document()

# Add a titledoc.add_heading('Weekly Report: 2025-04-21', 0)

# Add paragraphp = doc.add_paragraph("Overall business operation this week:")
p.add_run('good').bold = True
p.add_run(', All main tables have grown steadily.  ')

# Add a tabletable = doc.add_table(rows=3, cols=3)

# Set the table headerheader_cells = [0].cells
header_cells[0].text = 'Indicator Name'
header_cells[1].text = 'Values ​​of this week'
header_cells[2].text = 'Momentum change'

# Fill in datadata_cells = [1].cells
data_cells[0].text = 'Sales'
data_cells[1].text = '¥1234567'
data_cells[2].text = '+12.3'

# Add an imagedoc.add_picture("", width=Inches(6), height=Inches(2))

# Save the document("weekly_report.docx")

Building data processing and acquisition modules

In practical applications, report data may come from multiple sources, such as databases, APIs, Excel files, etc. A flexible data acquisition and processing module is needed

#! /usr/bin/env/python3
# -*- coding=utf-8 -*-
# @Author: jack
# @Date  : 2025/04/21/17:16
from docx import Document
from  import Inches, Pt, RGBColor
from  import WD_ALIGN_PARAGRAPH
import pandas as pd
import  as plt
from datetime import datetime, timedelta


def get_report_period():
    """Determine the time frame for reporting"""
    today = ()
    # Assume that weekly reports cover last Monday to Sunday    last_month = today - timedelta(days=() + 7)
    last_sunday = last_month + timedelta(days=6)
    return last_month, last_sunday


def fetch_sales_data(start_date, end_date):
    """Get sales data from data sources"""
    # In actual application, this is a database query or API call    # Use simulation data as an example here    dates = pd.date_range(start=start_date, end=end_date)
    sales = [round(100000 + i * 5000 + i * i * 100) for i in range(len(dates))]
    return ({
        "date": dates,
        "sales": sales})


def calculate_kpi(df):
    """Calculate Key Performance Indicators"""
    total_sales = df["sales"].sum()
    avg_sales = df["sales"].mean()
    max_sales = df["sales"].max()
    max_sales_day = [df["sales"].idxmax(), "date"]
    # Calculate month-on-month change    # Suppose we have data from last week    last_week_sales = total_sales * 0.9  # Simulate data    sales_change = (total_sales - last_week_sales) / last_week_sales
    return {
        "total_sales": total_sales,
        "avg_sales": avg_sales,
        "max_sales": max_sales,
        "max_sales_day": max_sales_day,
        "sales_change": sales_change
    }


def generate_charts(df, output_path):
    """Generate data visualization chart"""
    (figsize=(10, 6))
    (df['date'], df['sales'], marker='o')
    ('Daily Sales Trend')
    ('date')
    ('Sales')
    (True)
    plt.tight_layout()
    (output_path)
    ()
    return output_path

Implement template filling logic

#! /usr/bin/env/python3
# -*- coding=utf-8 -*-
# @Author: jack
# @Date  : 2025/04/21/17:16
import os

from docx import Document
from  import Inches, Pt, RGBColor
from  import WD_ALIGN_PARAGRAPH
import pandas as pd
import  as plt
from datetime import datetime, timedelta


def get_report_period():
    """Determine the time frame for reporting"""
    today = ()
    # Assume that weekly reports cover last Monday to Sunday    last_month = today - timedelta(days=() + 7)
    last_sunday = last_month + timedelta(days=6)
    return last_month, last_sunday


def fetch_sales_data(start_date, end_date):
    """Get sales data from data sources"""
    # In actual application, this is a database query or API call    # Use simulation data as an example here    dates = pd.date_range(start=start_date, end=end_date)
    sales = [round(100000 + i * 5000 + i * i * 100) for i in range(len(dates))]
    return ({
        "date": dates,
        "sales": sales})


def calculate_kpis(df):
    """Calculate Key Performance Indicators"""
    total_sales = df["sales"].sum()
    avg_sales = df["sales"].mean()
    max_sales = df["sales"].max()
    max_sales_day = [df["sales"].idxmax(), "date"]
    # Calculate month-on-month change    # Suppose we have data from last week    last_week_sales = total_sales * 0.9  # Simulate data    sales_change = (total_sales - last_week_sales) / last_week_sales
    return {
        "total_sales": total_sales,
        "avg_sales": avg_sales,
        "max_sales": max_sales,
        "max_sales_day": max_sales_day,
        "sales_change": sales_change
    }


def generate_charts(df, output_path):
    """Generate data visualization chart"""
    (figsize=(10, 6))
    (df['date'], df['sales'], marker='o')
    ('Daily Sales Trend')
    ('date')
    ('Sales')
    (True)
    plt.tight_layout()
    (output_path)
    ()
    return output_path


def generate_report(template_path, output_path):
    """Main function to generate weekly reports"""
    # Get report time range    start_date, end_date = get_report_period()
    period_str = f"{start_date.strftime('%Y year %m month %d day')} to {end_date.strftime('%Y year %m month %d day')}"

    # Get and process data    sales_data = fetch_sales_data(start_date, end_date)
    kpis = calculate_kpis(sales_data)

    # Generate a chart    chart_path = generate_charts(sales_data, 'sales_trend.png')

    # Load Word templates    doc = Document(template_path)

    # Replace date in title    for paragraph in :
        if '{{report_period}}' in :
             = ('{{report_period}}', period_str)

    # Fill in KPI data    for paragraph in :
        if '{{total_sales}}' in :
             = ('{{total_sales}}', f"¥{kpis['total_sales']:,.2f}")
        if '{{sales_change}}' in :
            change_text = f"+{kpis['sales_change']:.2%}" if kpis['sales_change'] >= 0 else f"{kpis['sales_change']:.2%}"
             = ('{{sales_change}}', change_text)

    # Fill in table data    for table in :
        for row in :
            for cell in :
                for paragraph in :
                    if '{{avg_sales}}' in :
                         = ('{{avg_sales}}', f"¥{kpis['avg_sales']:,.2f}")
                    if '{{max_sales}}' in :
                         = ('{{max_sales}}', f"¥{kpis['max_sales']:,.2f}")
                    if '{{max_sales_day}}' in :
                        day_str = kpis['max_sales_day'].strftime('%Y year %m month %d day')
                         = ('{{max_sales_day}}', day_str)

    # Add a chart    for paragraph in :
        if '{{sales_chart}}' in :
            # Save the reference for the current paragraph            p = paragraph
            # Clear the placeholder text             = ""
            # Add an image in the same location            run = p.add_run()
            run.add_picture(chart_path, width=Inches(6))

    # Save the generated report    (output_path)
    print(f"Weekly report has been generated:{output_path}")
    return output_path


def main():
    # Template and output file paths    template_path = "weekly_report.docx"
    start_date, end_date = get_report_period()
    output_filename = f"Sales Weekly_{start_date.strftime('%Y%m%d')}_{end_date.strftime('%Y%m%d')}.docx"
    output_path = ("reports", output_filename)

    # Make sure the output directory exists    ("reports", exist_ok=True)

    # Generate a report    generate_report(template_path, output_path)


if __name__ == "__main__":
    main()

Advanced: Dynamic report content generation

In practical applications, the content of the report may need to be dynamically adjusted according to changes in the data. For example, when abnormal data is detected, additional instructions or warnings need to be added to the report. Here is an example of an extension to deal with dynamic content:

def add_dynamic_sections(doc, sales_data, kpis):
    """Dynamic addition of report content based on data situation"""
    # For example: When the sales growth rate exceeds 20%, add special instructions    if kpis['sales_change'] > 0.2:
        doc.add_heading('Statement of Significant Sales', level=2)
        p = doc.add_paragraph()
        p.add_run(f"Sales this week increased compared to last week{kpis['sales_change']:.2%},Significantly higher than expected。")
        p.add_run("The main growth points come from the following aspects:").bold = True
        
        # Add bullet list        doc.add_paragraph("Sales growth brought about by new product line online", style='List Bullet')
        doc.add_paragraph("The marketing campaign is effective", style='List Bullet')
        doc.add_paragraph("Increased orders for key customers", style='List Bullet')
    
    # Detect sales abnormal days    daily_avg = sales_data['sales'].mean()
    std_dev = sales_data['sales'].std()
    anomaly_days = sales_data[abs(sales_data['sales'] - daily_avg) > 2 * std_dev]
    
    ifnot anomaly_days.empty:
        doc.add_heading('Abnormal Sales Day Analysis', level=2)
        p = doc.add_paragraph("There was a significant abnormality in sales data on the following dates detected this week:")
        
        # Add exception day table        table = doc.add_table(rows=1, cols=3)
         = 'Table Grid'
        
        # Set the table header        header_cells = [0].cells
        header_cells[0].text = 'date'
        header_cells[1].text = 'Sales'
        header_cells[2].text = 'Difference from the mean'
        
        # Add data rows        for _, row in anomaly_days.iterrows():
            cells = table.add_row().cells
            cells[0].text = row['date'].strftime('%Y-%m-%d')
            cells[1].text = f"¥{row['sales']:,.2f}"
            deviation = (row['sales'] - daily_avg) / daily_avg
            cells[2].text = f"{deviation:.2%}"
            
        doc.add_paragraph("It is recommended to further investigate the causes of these abnormal situations so that appropriate business measures are taken.")

The above is the detailed content of the complete process of implementing weekly report automation based on Python + Word. For more information about Python Word weekly report automation, please pay attention to my other related articles!