How to copy Word document styles using the python-docx library

Introduction

Everyday office work often involves adjusting the formatting of Word documents and updating their content. Python and its third-party libraries make it practical to automate these tasks. This article shows how to use the python-docx library to copy the content and styles of a Word document into a new one, and how to build on that to replace the document's text automatically while keeping its formatting.

Environment setup

First, make sure the python-docx library is installed. If it is not, install it with the following command:

pip install python-docx
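
One detail that is easy to trip over: the package is installed as python-docx but imported as docx. A minimal sketch for checking that the import name resolves, using only the standard library (the helper name is_installed is made up for this illustration):

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# python-docx installs a package whose import name is "docx".
if is_installed("docx"):
    print("python-docx is available (import name: docx)")
else:
    print("python-docx not found; run: pip install python-docx")
```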

Main functions

Copying paragraph and run styles: the copy_paragraph_style function copies the formatting of a run (a block of text within a paragraph) from the old document to a newly created run.
Identifying page breaks: the is_page_break function checks whether a page break occurs between document elements, which matters for keeping the document layout consistent.
Cloning paragraphs and tables: the clone_paragraph and clone_table functions create new paragraphs or tables based on those in the old document and retain the original style settings.
Copying cell borders: so that the generated table looks the same as the original, the copy_cell_borders function copies the border style of each cell.
Copying the full document: finally, the clone_document function copies the content and styles of the entire document into a new Word document.
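
To make the page-break check more concrete: in the underlying WordprocessingML, a manual page break is a w:br element with the attribute w:type="page", normally nested inside a run. The following is a minimal sketch of that check on raw XML fragments using only the standard library. The function name has_page_break and the two sample fragments are made up for this illustration, and the sketch searches all descendants of the element, which is a choice of this example rather than something taken from the article's code.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def has_page_break(element: ET.Element) -> bool:
    """Return True if a w:br with w:type="page" occurs anywhere inside the element."""
    for br in element.iter(f"{{{W_NS}}}br"):
        if br.get(f"{{{W_NS}}}type") == "page":
            return True
    return False


# Hypothetical sample paragraphs: one with plain text, one holding a page break.
PLAIN = f'<w:p xmlns:w="{W_NS}"><w:r><w:t>Hello</w:t></w:r></w:p>'
BREAK = f'<w:p xmlns:w="{W_NS}"><w:r><w:br w:type="page"/></w:r></w:p>'

print(has_page_break(ET.fromstring(PLAIN)))  # False
print(has_page_break(ET.fromstring(BREAK)))  # True
```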

Sample code

Here is a simplified version of the core code example showing how to extract content from an old document and create a new document:

from docx import Document
# Other required imports are assumed to be included here


def clone_document(old_doc_path, new_doc_path, out_text_list):
    try:
        # Load the old document and create the new one
        old_doc = Document(old_doc_path)
        new_doc = Document()

        # Copy the main content: walk the top-level elements of the body in order
        elements = old_doc.element.body
        para_index = 0
        table_index = 0
        index = 0

        while index < len(elements):
            element = elements[index]
            if element.tag.endswith('p'):
                # Process a paragraph...
                para_index += 1
            elif element.tag.endswith('tbl'):
                # Process a table...
                table_index += 1
            index += 1

        # Save the new document
        new_doc.save(new_doc_path)
        print(f"The document has been saved to: {new_doc_path}")
    except Exception as e:
        print(f"An error occurred while copying the document: {e}")
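
The loop keeps separate para_index and table_index counters because paragraphs and tables are interleaved in the document body but are looked up from two separate lists. A minimal sketch of that bookkeeping on a hand-written body fragment, using only the standard library (the fragment and the names local_name and classify_body are made up for this illustration):

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical body: paragraph, table, paragraph, then section properties.
BODY_XML = f"""
<w:body xmlns:w="{W_NS}">
  <w:p><w:r><w:t>Intro</w:t></w:r></w:p>
  <w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>
  <w:p><w:r><w:t>Outro</w:t></w:r></w:p>
  <w:sectPr/>
</w:body>
"""


def local_name(element: ET.Element) -> str:
    """Strip the '{namespace}' prefix from an element tag."""
    return element.tag.rsplit("}", 1)[-1]


def classify_body(body: ET.Element) -> list:
    """List (kind, ordinal) pairs for top-level paragraphs and tables, in order."""
    counters = {"p": 0, "tbl": 0}
    result = []
    for child in body:  # direct children only; the paragraph inside the table cell is not visited
        kind = local_name(child)
        if kind in counters:
            result.append((kind, counters[kind]))
            counters[kind] += 1
    return result


print(classify_body(ET.fromstring(BODY_XML)))
# [('p', 0), ('tbl', 0), ('p', 1)]
```

Each pair is the index that would be used to look up the element in the paragraph list or the table list, which is why the two counters advance independently.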

Conclusion

With the approach above, the content and styles of a Word document can be copied programmatically, which gives a workable basis for automating document processing. Depending on your needs, you can extend this basic framework, for example by supporting more style attributes or by optimizing performance.
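
As one illustration of supporting more style attributes, the copying can be driven by a list of attribute paths instead of one assignment per attribute. This is only a sketch: copy_attrs is a hypothetical helper, and the objects below are plain stand-ins so that the example runs without a Word file. With real run objects the same paths should behave the same way, but that is an assumption to verify against your python-docx version.

```python
from functools import reduce
from types import SimpleNamespace


def copy_attrs(src, dst, dotted_names):
    """Copy each dotted attribute path (such as 'font.size') from src to dst."""
    for path in dotted_names:
        *parents, leaf = path.split(".")
        # Walk down to the object that owns the final attribute on both sides.
        src_owner = reduce(getattr, parents, src)
        dst_owner = reduce(getattr, parents, dst)
        setattr(dst_owner, leaf, getattr(src_owner, leaf))


# Stand-in objects (not python-docx runs) so the sketch is self-contained.
old_run = SimpleNamespace(bold=True, font=SimpleNamespace(size=12, name="Arial"))
new_run = SimpleNamespace(bold=None, font=SimpleNamespace(size=None, name=None))

copy_attrs(old_run, new_run, ["bold", "font.size", "font.name"])
print(new_run.bold, new_run.font.size, new_run.font.name)  # True 12 Arial
```

Adding support for another attribute then means adding one string to the list.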

I hope this article serves as a useful reference and helps you process Word documents more efficiently in your daily work.

Code for generating the file

from docx import Document
from docx.enum.text import WD_BREAK
from docx.oxml import OxmlElement
from docx.oxml.ns import qn
from copy_word_only_text_json import clone_document as gen_to_list


def copy_paragraph_style(run_from, run_to):
    """Copy run style"""
    run_to.bold = run_from.bold
    run_to.italic = run_from.italic
    run_to.underline = run_from.underline
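    # Font-level attributes (assumed set: size, color, name, caps, strike, shadow)
    run_to.font.size = run_from.font.size
    run_to.font.color.rgb = run_from.font.color.rgb
    run_to.font.name = run_from.font.name
    run_to.font.all_caps = run_from.font.all_caps
    run_to.font.strike = run_from.font.strike
    run_to.font.shadow = run_from.font.shadow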


def is_page_break(element):
    """Judge whether an element is a page break (after a paragraph or table)"""
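    if element.tag.endswith('p'):
        # Only direct children are inspected here; a w:br nested inside a run is not seen
        for child in element:
            if child.tag.endswith('br') and child.get(qn('w:type')) == 'page':
                return True
    elif element.tag.endswith('tbl'):
        # A page break may follow the table (judged by the next element)
        next_element = element.getnext()
        if next_element is not None and next_element.tag.endswith('p'):
            for child in next_element:
                if child.tag.endswith('br') and child.get(qn('w:type')) == 'page':
                    return True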
    return False


def clone_paragraph(old_para, new_doc, out_text_list):
    """Create a new paragraph from an old paragraph"""
    new_para = new_doc.add_paragraph()
    if old_para.style:
        new_para.style = old_para.style
    for old_run in old_para.runs:
        # Take the next replacement text, in document order, for this run
        new_run = new_para.add_run(out_text_list.pop(0))
        copy_paragraph_style(old_run, new_run)
    new_para.alignment = old_para.alignment
    return new_para


def copy_cell_borders(old_cell, new_cell):
    """Copy the border style of the cell"""
    old_tc = old_cell._tc
    new_tc = new_cell._tc

    old_borders = old_tc.xpath('.//w:tcBorders')
    if old_borders:
        old_border = old_borders[0]
        new_border = OxmlElement('w:tcBorders')

        border_types = ['top', 'left', 'bottom', 'right', 'insideH', 'insideV']
        for border_type in border_types:
            old_element = old_border.find(f'.//w:{border_type}', namespaces={
                'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
            })
            if old_element is not None:
                new_element = OxmlElement(f'w:{border_type}')
                for attr, value in old_element.attrib.items():
                    new_element.set(attr, value)
                new_border.append(new_element)

        tc_pr = new_tc.get_or_add_tcPr()
        tc_pr.append(new_border)


def clone_table(old_table, new_doc, out_text_list):
    """Create a new table from an old table"""
    new_table = new_doc.add_table(rows=len(old_table.rows), cols=len(old_table.columns))
    if old_table.style:
        new_table.style = old_table.style

    for i, old_row in enumerate(old_table.rows):
        for j, old_cell in enumerate(old_row.cells):
            new_cell = new_table.cell(i, j)
            for paragraph in new_cell.paragraphs:
                new_cell._element.remove(paragraph._element)
            for old_paragraph in old_cell.paragraphs:
                new_paragraph = new_cell.add_paragraph()
                for old_run in old_paragraph.runs:
                    # Take the next replacement text, in document order, for this run
                    new_run = new_paragraph.add_run(out_text_list.pop(0))
                    copy_paragraph_style(old_run, new_run)
                new_paragraph.alignment = old_paragraph.alignment
            copy_cell_borders(old_cell, new_cell)

    for i, col in enumerate(old_table.columns):
        if col.width is not None:
            new_table.columns[i].width = col.width

    return new_table


def clone_document(old_doc_path, new_doc_path, out_text_list):
    # global out_text_list

    try:
        old_doc = Document(old_doc_path)
        new_doc = Document()

        # # Copy section breaks, headers and footers
        # for old_section in old_doc.sections:
        #     new_section = new_doc.add_section(start_type=old_section.start_type)
        #     new_section.left_margin = old_section.left_margin
        #     new_section.right_margin = old_section.right_margin
        #     # Other section attributes...
        #
        #     # Header
        #     for para in old_section.header.paragraphs:
        #         new_para = new_section.header.add_paragraph()
        #         for run in para.runs:
        #             new_run = new_para.add_run(run.text)
        #             copy_paragraph_style(run, new_run)
        #         new_para.alignment = para.alignment
        #
        #     # Footer
        #     for para in old_section.footer.paragraphs:
        #         new_para = new_section.footer.add_paragraph()
        #         for run in para.runs:
        #             new_run = new_para.add_run(run.text)
        #             copy_paragraph_style(run, new_run)
        #         new_para.alignment = para.alignment

        # Copy the main content
        elements = old_doc.element.body
        para_index = 0
        table_index = 0
        index = 0

        while index < len(elements):
            element = elements[index]
            if element.tag.endswith('p'):
                old_para = old_doc.paragraphs[para_index]
                clone_paragraph(old_para, new_doc, out_text_list)
                para_index += 1
                index += 1
            elif element.tag.endswith('tbl'):
                old_table = old_doc.tables[table_index]
                clone_table(old_table, new_doc, out_text_list)
                table_index += 1
                index += 1
            elif element.tag.endswith('br') and element.get(qn('w:type')) == 'page':
                if index > 0:
                    new_doc.add_paragraph().add_run().add_break(WD_BREAK.PAGE)
                index += 1
            else:
                index += 1

            # Check whether the next element carries a page break
            if index < len(elements) and is_page_break(elements[index]):
                new_doc.add_paragraph().add_run().add_break(WD_BREAK.PAGE)
                # That element is skipped rather than cloned, so advance its
                # counter to keep para_index and table_index aligned
                skipped = elements[index]
                if skipped.tag.endswith('p'):
                    para_index += 1
                elif skipped.tag.endswith('tbl'):
                    table_index += 1
                index += 1
        if new_doc_path:
            new_doc.save(new_doc_path)
            print(f"The document has been saved to: {new_doc_path}")
        else:
            return out_text_list
    except Exception as e:
        print(f"An error occurred while copying the document: {e}")


# Usage example
if __name__ == "__main__":
    out = gen_to_list('.docx', '')
    if out:
        print("Document content:\n", out, """Please revise the document content according to the user's request, without changing the order or the number of items. Finally, output the content list in the given JSON:
         ```json
         {"Output":[]}
         ```
         User input:Please polish
         """)

        print("Request llm")
        print("Extract json")
    print("Fill in template")

    out = clone_document('.docx', 'only_text.docx', out)
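
clone_document consumes out_text_list with pop(0), one item per run, so an edited list that is too short raises an IndexError partway through, and the broad except only prints it. A minimal sketch of checking the edited list before filling the template, assuming the original text list is still at hand. The helper name make_text_queue and the sample data are made up for this illustration, and using a deque is a suggestion of this sketch, not part of the original code.

```python
from collections import deque


def make_text_queue(original, edited):
    """Validate the edited list against the original and return it as a deque."""
    if not isinstance(edited, list) or not all(isinstance(t, str) for t in edited):
        raise ValueError("edited texts must be a list of strings")
    if len(edited) != len(original):
        raise ValueError(f"expected {len(original)} items, got {len(edited)}")
    return deque(edited)


# Hypothetical data: one entry per run, in document order.
original = ["Hello ", "world", "!"]
edited = ["Greetings, ", "world", "!"]

queue = make_text_queue(original, edited)
first = queue.popleft()  # O(1), whereas list.pop(0) shifts every remaining item
print(repr(first), len(queue))
```

A deque also avoids the cost of repeated pop(0) calls on long lists, which is one of the simpler performance improvements available here.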

Code for generating the text list

from docx import Document
from docx.enum.text import WD_BREAK
from docx.oxml import OxmlElement
from docx.oxml.ns import qn




def copy_paragraph_style(run_from, run_to):
    """Copy run style"""
    run_to.bold = run_from.bold
    run_to.italic = run_from.italic
    run_to.underline = run_from.underline
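    # Font-level attributes (assumed set: size, color, name, caps, strike, shadow)
    run_to.font.size = run_from.font.size
    run_to.font.color.rgb = run_from.font.color.rgb
    run_to.font.name = run_from.font.name
    run_to.font.all_caps = run_from.font.all_caps
    run_to.font.strike = run_from.font.strike
    run_to.font.shadow = run_from.font.shadow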


def is_page_break(element):
    """Judge whether an element is a page break (after a paragraph or table)"""
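    if element.tag.endswith('p'):
        # Only direct children are inspected here; a w:br nested inside a run is not seen
        for child in element:
            if child.tag.endswith('br') and child.get(qn('w:type')) == 'page':
                return True
    elif element.tag.endswith('tbl'):
        # A page break may follow the table (judged by the next element)
        next_element = element.getnext()
        if next_element is not None and next_element.tag.endswith('p'):
            for child in next_element:
                if child.tag.endswith('br') and child.get(qn('w:type')) == 'page':
                    return True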
    return False


def clone_paragraph(old_para, new_doc, out_text_list):
    """Create a new paragraph from an old paragraph"""
    new_para = new_doc.add_paragraph()
    if old_para.style:
        new_para.style = old_para.style
    for old_run in old_para.runs:
        out_text_list.append(old_run.text)
        new_run = new_para.add_run(old_run.text)
        copy_paragraph_style(old_run, new_run)
    new_para.alignment = old_para.alignment
    return new_para


def copy_cell_borders(old_cell, new_cell):
    """Copy the border style of the cell"""
    old_tc = old_cell._tc
    new_tc = new_cell._tc

    old_borders = old_tc.xpath('.//w:tcBorders')
    if old_borders:
        old_border = old_borders[0]
        new_border = OxmlElement('w:tcBorders')

        border_types = ['top', 'left', 'bottom', 'right', 'insideH', 'insideV']
        for border_type in border_types:
            old_element = old_border.find(f'.//w:{border_type}', namespaces={
                'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
            })
            if old_element is not None:
                new_element = OxmlElement(f'w:{border_type}')
                for attr, value in old_element.attrib.items():
                    new_element.set(attr, value)
                new_border.append(new_element)

        tc_pr = new_tc.get_or_add_tcPr()
        tc_pr.append(new_border)


def clone_table(old_table, new_doc, out_text_list):
    """Create a new table from an old table"""
    new_table = new_doc.add_table(rows=len(old_table.rows), cols=len(old_table.columns))
    if old_table.style:
        new_table.style = old_table.style

    for i, old_row in enumerate(old_table.rows):
        for j, old_cell in enumerate(old_row.cells):
            new_cell = new_table.cell(i, j)
            for paragraph in new_cell.paragraphs:
                new_cell._element.remove(paragraph._element)
            for old_paragraph in old_cell.paragraphs:
                new_paragraph = new_cell.add_paragraph()
                for old_run in old_paragraph.runs:
                    out_text_list.append(old_run.text)
                    new_run = new_paragraph.add_run(old_run.text)
                    copy_paragraph_style(old_run, new_run)
                new_paragraph.alignment = old_paragraph.alignment
            copy_cell_borders(old_cell, new_cell)

    for i, col in enumerate(old_table.columns):
        if col.width is not None:
            new_table.columns[i].width = col.width

    return new_table


def clone_document(old_doc_path, new_doc_path):
    # global out_text_list
    out_text_list = []
    try:
        old_doc = Document(old_doc_path)
        new_doc = Document()

        # # Copy section breaks, headers and footers
        # for old_section in old_doc.sections:
        #     new_section = new_doc.add_section(start_type=old_section.start_type)
        #     new_section.left_margin = old_section.left_margin
        #     new_section.right_margin = old_section.right_margin
        #     # Other section attributes...
        #
        #     # Header
        #     for para in old_section.header.paragraphs:
        #         new_para = new_section.header.add_paragraph()
        #         for run in para.runs:
        #             new_run = new_para.add_run(run.text)
        #             copy_paragraph_style(run, new_run)
        #         new_para.alignment = para.alignment
        #
        #     # Footer
        #     for para in old_section.footer.paragraphs:
        #         new_para = new_section.footer.add_paragraph()
        #         for run in para.runs:
        #             new_run = new_para.add_run(run.text)
        #             copy_paragraph_style(run, new_run)
        #         new_para.alignment = para.alignment

        # Copy the main content
        elements = old_doc.element.body
        para_index = 0
        table_index = 0
        index = 0

        while index < len(elements):
            element = elements[index]
            if element.tag.endswith('p'):
                old_para = old_doc.paragraphs[para_index]
                clone_paragraph(old_para, new_doc, out_text_list)
                para_index += 1
                index += 1
            elif element.tag.endswith('tbl'):
                old_table = old_doc.tables[table_index]
                clone_table(old_table, new_doc, out_text_list)
                table_index += 1
                index += 1
            elif element.tag.endswith('br') and element.get(qn('w:type')) == 'page':
                if index > 0:
                    new_doc.add_paragraph().add_run().add_break(WD_BREAK.PAGE)
                index += 1
            else:
                index += 1

            # Check whether the next element carries a page break
            if index < len(elements) and is_page_break(elements[index]):
                new_doc.add_paragraph().add_run().add_break(WD_BREAK.PAGE)
                # That element is skipped rather than cloned, so advance its
                # counter to keep para_index and table_index aligned
                skipped = elements[index]
                if skipped.tag.endswith('p'):
                    para_index += 1
                elif skipped.tag.endswith('tbl'):
                    table_index += 1
                index += 1
        if new_doc_path:
            new_doc.save(new_doc_path)
            print(f"The document has been saved to: {new_doc_path}")
        else:
            return out_text_list
    except Exception as e:
        print(f"An error occurred while copying the document: {e}")


# Usage example
if __name__ == "__main__":
    out = clone_document('Nanshan Three Defense Work Special Report.docx', '')
    if out:
        print("Document content:\n", out, """Please revise the document content according to the user's request, without changing the order or the number of items. Finally, output the content list in the given JSON:
         ```json
         {"Output":[]}
         ```
         User input:Please polish
         """)

        print("Request llm")
        print("Extract json")
    print("Fill in template")
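
The print("Extract json") line is a placeholder for parsing the model's reply. A minimal sketch of that step, assuming the reply contains a JSON object of the form {"Output": [...]} as the prompt requests, optionally wrapped in a Markdown code fence. The function name extract_output_list and the sample reply are made up for this illustration.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker, built here to keep this listing readable


def extract_output_list(reply: str) -> list:
    """Return the list stored under "Output" in a reply that may fence its JSON."""
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    # Fall back to treating the whole reply as JSON when no fence is found.
    candidate = match.group(1) if match else reply.strip()
    data = json.loads(candidate)
    output = data.get("Output") if isinstance(data, dict) else None
    if not isinstance(output, list):
        raise ValueError('expected a JSON object with a list under "Output"')
    return output


# Hypothetical reply text in the format the prompt asks for.
sample_reply = (
    "Here is the polished text:\n"
    + FENCE + "json\n"
    + '{"Output": ["Greetings, ", "world", "!"]}\n'
    + FENCE
)
print(extract_output_list(sample_reply))
```

The returned list can then be passed as out_text_list to the file-generating script, provided it has the same number of items as the extracted list.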
