SoFunction
Updated on 2024-11-13

Example of converting Excel to Word in python

In the daily work, Python in the field of office automation is widely used, such as batch will be more than one Excel data to calculate and generate charts, batch will be more than one Excel in a fixed format into Word, or timed to generate files and send mail and other scenarios. In this article, a simple example of a small, brief description of Python in Excel and Word to convert each other's knowledge, for learning and sharing the use of, if there are deficiencies, please correct.

Relevant Knowledge Points

In this paper, the main Excel file through certain rules to convert Word documents, involving the following knowledge points:

xlrd module: mainly used for reading Excel files, the relevant content is as follows:

  • xlrd.open_workbook(self.excel_file) opens an Excel file and returns the document object, with the full path to Excel as the argument
  • book.sheet_by_name(self.sheet_name) gets the corresponding sheet page by name and returns the sheet object
  • Effective number of rows in the sheet page
  • Effective number of columns on sheet pages
  • sheet.row_values(0) return to Excel corresponding to the value of the first row of the sheet page to array return
  • sheet.cell_value(row, col) return the value of a cell

python-docx module: the main operation of Word documents, such as: tables, paragraphs and other related, related to the following:

  • Document word of the document object, on behalf of the entire word document
  • [0] Get Chapter
  • doc.add_section(start_type=WD_SECTION_START.CONTINUOUS) Add consecutive sections
  • doc.add_heading(third, level=2) add the title, level indicates the level, such as the second level of the title, return the title object
  • doc.add_paragraph(text='', style=None) add paragraph, return paragraph object
  • doc.add_table(rows=4, cols=5) adds a table and returns the table object
  • doc_table.style = "Table Grid" Sets the table style.
  • doc_table.rows[0].cells[1].merge(doc_table.rows[0].cells[4]) Merge cells
  • doc_table.rows[3].cells Get all the cells in a row of a table, returned as an array
  • head_cells[0].width = Cm(1.9) set column width in cm
  • doc_table.rows[i].cells[j].vertical_alignment = WD_CELL_VERTICAL_ALIGNMENT.
  • doc_table.add_row() adds a new row and returns the row object

Plug-in Installation

The plugin can be installed under the terminal panel of pycharm. python-docx install command is:pip install python-docx

The xlrd install command is:pip install xlrd As shown below:

Data source files

The data source is a series of Excel files with the same format, a total of seven columns, of which the first column to be intercepted by [/] split, the format is as follows:

core code

The core source code of this article, which is divided into three main parts:

Import the relevant module packages as shown below:

import xlrd
from docx import Document
from  import WD_ORIENTATION
from  import WD_PARAGRAPH_ALIGNMENT
from  import Pt, Cm, RGBColor
from  import qn
from  import WD_CELL_VERTICAL_ALIGNMENT

Read Excel as shown below:

def read_excel(self):
    """Read Excel"""
    book = xlrd.open_workbook(self.excel_file)
    sheet = book.sheet_by_name(self.sheet_name)
    nrows =  # of rows
    ncols =  # of columns
    datas = [] # Storing data
    # First column Title
    keys = sheet.row_values(0)
    for row in range(1, nrows):
      data = {} # Each line of data
      for col in range(0, ncols):
        value = sheet.cell_value(row, col) # Fetch the data for each cell
        # Replacement to special characters
        value = ('<', '').replace('>', '').replace('$', '')
        data[keys[col]] = value
        # Intercept the first column element
        if col == 0:
          first = '' # Intercepting elements 1st
          second = '' # Intercepting elements 2nd
          third = '' # Intercepting elements 3rd
          arrs = ('/').split('/') # Remove the first / and group by /
          if len(arrs) > 0:
            if len(arrs) == 1:
              first = arrs[0]
              second = first
              third = second
            elif len(arrs) == 2:
              first = arrs[0]
              second = arrs[1]
              third = second
            elif len(arrs) == 3:
              first = arrs[0]
              second = arrs[1]
              third = arrs[2]
            else:
              first = arrs[0]
              second = arrs[1]
              third = arrs[2]
          else:
            first = ('/')
            second = first
            third = second
          data['first'] = first
          data['second'] = second
          data['third'] = third
        # End of the first column of the intercept
      (data)
    return datas

Generate Word sections:

def write_word(self, datas):
    """Generate word file"""
    if len(datas) < 1:
      print('Excel has no content')
      return
    # Define word document objects
    doc = Document()
    # Add horizontal
    section = [0] # doc.add_section(start_type=WD_SECTION_START.CONTINUOUS) # add horizontal page of consecutive sections
     = WD_ORIENTATION.LANDSCAPE
    page_h, page_w = section.page_width, section.page_height
    section.page_width = page_w # Set the width of the horizontal paper
    section.page_height = page_h # Set the height of the horizontal paper
    # Set the font
    ['Normal']. = u'Song Style'
    ['Normal']._element.(qn('w:eastAsia'), u'Song Style')
    # Obtain part 3 (sector) and de-emphasize it
    data_third = []
    for data in datas:
      third = data['third']
      if data_third.count(third) == 0:
        data_third.append(third)
    for third in data_third:
      h2 = doc.add_heading(third, level=2) # Write department, secondary headings
      run = [0] # Text can be set via add_run or obtained via array
       = RGBColor(0, 0, 0)
       = u'Song Style'
      doc.add_paragraph(text='', style=None) # Add blank lines Line feeds
      # Start getting templates
      data_template = []
      for data in datas:
        if data['third'] == third:
          template = {'first': data['first'], 'Template name': data['Template name']}
          if data_template.count(template) == 0:
            data_template.append(template)
      # Get the template done
      # Traversing templates
      for template in data_template:
        h3 = doc.add_heading(template['Template name'], level=3) # Insert template name, level 3 title
        run = [0] # Text can be set via add_run or obtained via array
         = RGBColor(0, 0, 0)
         = u'Song Style'
        doc.add_paragraph(text='', style=None) # Row line feeds
        data_table = filter(
          lambda data: data['third'] == third and data['Template name'] == template['Template name'] and data['first'] ==
                 template['first'], datas)
        data_table = list(data_table)
        # Add new table with 4 rows and 5 columns
        doc_table = doc.add_table(rows=4, cols=5)
        doc_table.style = "Table Grid"
        doc_table. = Pt(9)
        doc_table. = 'Song Style'

        # Merge cells Assign values
        doc_table.rows[0].cells[1].merge(doc_table.rows[0].cells[4])
        doc_table.rows[1].cells[1].merge(doc_table.rows[1].cells[4])
        doc_table.rows[2].cells[1].merge(doc_table.rows[2].cells[4])
        doc_table.rows[0].cells[0].text = 'Process name:'
        doc_table.rows[0].cells[1].text = data_table[0]['Template name']
        doc_table.rows[1].cells[0].text = 'Users:'
        doc_table.rows[1].cells[1].text = data_table[0]['first']
        doc_table.rows[2].cells[0].text = 'Process description:'
        doc_table.rows[2].cells[1].text = data_table[0]['Process description']

        # Setting the title
        head_cells = doc_table.rows[3].cells # There are three more lines in front of it, special treatment
        head_cells[0].text = 'Node'
        head_cells[1].text = 'Node name'
        head_cells[2].text = 'Processors'
        head_cells[3].text = 'Treatment'
        head_cells[4].text = 'Jump information'
        # Set column widths
        head_cells[0].width = Cm(1.9)
        head_cells[1].width = Cm(4.83)
        head_cells[2].width = Cm(8.25)
        head_cells[3].width = Cm(2.54)
        head_cells[4].width = Cm(5.64)
        # Center column 1 horizontally and set the row height, and center all cells vertically.
        for i in range(0, 4):
          # Center horizontally
          p = doc_table.rows[i].cells[0].paragraphs[0]
           = WD_PARAGRAPH_ALIGNMENT.CENTER
          doc_table.rows[i].height = Cm(0.6) # Row height
          # Vertical centering
          for j in range(0, 5):
            doc_table.rows[i].cells[j].vertical_alignment = WD_CELL_VERTICAL_ALIGNMENT.CENTER

        # Generate tables and populate them with content
        row_num = 0
        for data in data_table:
          row = doc_table.add_row()
          row_cells = 
          row_cells[0].text = str(row_num + 1) # Serial number, needs to be converted to a string
          row_cells[1].text = data['Node name']
          row_cells[2].text = data['Approving officer']
          row_cells[3].text = data['Approval Method']
          row_cells[4].text = ''
          # Center horizontally
          p = row_cells[0].paragraphs[0]
           = WD_PARAGRAPH_ALIGNMENT.CENTER
           = Cm(0.6) # Row height
          # Vertical centering
          for j in range(0, 5):
            row_cells[j].vertical_alignment = WD_CELL_VERTICAL_ALIGNMENT.CENTER
          row_num = row_num + 1

        doc.add_paragraph(text='', style=None) # Row line feeds
    (self.word_file)

The above is python Excel to Word example of the details, more information about python Excel to Word please pay attention to my other related articles!