I was browsing Xianyu two days ago. I originally wanted to find a second-hand mechanical keyboard, but as I scrolled, I suddenly saw someone selling a "Word Batch to PDF" widget. The price is low, but the sales volume is surprisingly high. Many commenters praise it as "easy" and say "I finally don't have to convert files one by one".
To be honest, I was stunned:
I can write this feature in ten minutes with Python!
Then I searched for other gadgets: PDF to Word, Word to images, adding watermarks to Word, and so on.
Good grief, the office automation gadgets that Sister Hua used to teach everyone can actually be sold for money!
Today let's replicate the Word Batch to PDF widget, add a few features, and make a smoother version.
After reading this, you can write it yourself.
The idea is clear: Word to PDF is not that complicated
Don't assume this feature is "high-end" just because it sounds that way. What it does is essentially this:
Open a bunch of Word documents with a program and save them as PDF.
In other words, the job is plain batch processing, which is a perfect fit for Python.
Is the tool we need python-docx? No, no, no. That library does not support saving as PDF. The real star is:
- win32com (from the pywin32 package): used to drive the Word application (requires Windows with Office installed)
- Or, for a cross-platform approach, LibreOffice + subprocess. But today we will start with the simplest and most stable way: let Word do the work.
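For the curious, here is a minimal sketch of the cross-platform route mentioned above. It assumes LibreOffice is installed and that its `soffice` executable is on PATH; the function names `build_libreoffice_command` and `convert_with_libreoffice` are made up for this illustration and are not part of the tool built in this article.

```python
import shutil
import subprocess
from pathlib import Path


def build_libreoffice_command(input_path, output_dir, soffice="soffice"):
    """Build the headless LibreOffice command that converts one document to PDF."""
    return [
        soffice,
        "--headless",             # no GUI window
        "--convert-to", "pdf",    # target format
        "--outdir", str(output_dir),
        str(input_path),
    ]


def convert_with_libreoffice(input_path, output_dir, soffice="soffice"):
    """Run the conversion and return the path where the PDF is expected."""
    # Assumption: the executable is called "soffice" and can be found on PATH
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"'{soffice}' was not found on PATH")
    subprocess.run(
        build_libreoffice_command(input_path, output_dir, soffice),
        check=True,  # raise CalledProcessError if LibreOffice reports failure
    )
    return Path(output_dir) / (Path(input_path).stem + ".pdf")
```

Splitting the command construction from the actual run makes it easy to print the command and inspect it before letting it loose on a whole folder.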
Code: Word to PDF script that can run in just a few lines
OK, let's get to the point and get to the most basic version first:
import os
import win32com.client

def word_to_pdf(input_path, output_path):
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False  # No pop-up window, run in the background
    doc = word.Documents.Open(input_path)
    doc.SaveAs(output_path, FileFormat=17)  # 17 is the PDF format
    doc.Close()
    word.Quit()

# Example usage
word_to_pdf("C:/Users/your username/Desktop/test document.docx", "C:/Users/your username/Desktop/test document.pdf")
A few explanations:
- Dispatch("Word.Application") opens the Word application;
- FileFormat=17 tells Word "hey, I want this saved as a PDF";
- The final Quit() is very important; otherwise Word may keep hanging in the background and hogging resources.
- If your computer has WPS installed instead, change Dispatch("Word.Application") to Dispatch("Kwps.Application"), otherwise you will get an error.
Isn't it very simple? Even my cat understands it.
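Since a forgotten Quit() leaves Word lingering in the background, one possible way to guarantee the cleanup is a small context manager. This is only a sketch: `word_app`, `word_to_pdf_safe`, and the injectable `dispatch` parameter are names invented here (the parameter exists so the logic can be exercised without Word installed).

```python
from contextlib import contextmanager


def _default_dispatch(prog_id):
    # Imported lazily so this module still imports on machines without pywin32
    import win32com.client
    return win32com.client.Dispatch(prog_id)


@contextmanager
def word_app(prog_id="Word.Application", dispatch=_default_dispatch):
    """Start the application and guarantee Quit() runs, even after an error."""
    app = dispatch(prog_id)
    app.Visible = False
    try:
        yield app
    finally:
        app.Quit()


def word_to_pdf_safe(input_path, output_path, dispatch=_default_dispatch):
    """Same conversion as word_to_pdf, but the document and Word are always closed."""
    with word_app(dispatch=dispatch) as word:
        doc = word.Documents.Open(input_path)
        try:
            doc.SaveAs(output_path, FileFormat=17)  # 17 is the PDF format
        finally:
            doc.Close()
```

If SaveAs fails halfway, the document is still closed and Word still exits, so no stray process is left behind.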
Extension: batch conversion, so a whole folder gets done in one go!
The pain point for many people is: "There are too many documents, and converting them one by one is too much trouble."
Well then, let's make a batch version that converts everything in one pass:
def batch_convert(folder_path):
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    for file in os.listdir(folder_path):
        if file.endswith(".doc") or file.endswith(".docx"):
            doc_path = os.path.join(folder_path, file)
            pdf_path = os.path.splitext(doc_path)[0] + ".pdf"
            try:
                doc = word.Documents.Open(doc_path)
                doc.SaveAs(pdf_path, FileFormat=17)
                doc.Close()
                print(f"✅ Converted: {file}")
            except Exception as e:
                print(f"❌ Conversion failed: {file}, reason: {e}")
    word.Quit()
How to use:
batch_convert(r"C:\Users\your username\Desktop\word folder")
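One detail worth knowing when scanning a folder: while a document is open, Word keeps a hidden lock file next to it whose name starts with "~$" and which also ends in .doc or .docx. Below is a rough sketch of a file filter that skips those and ignores letter case in the extension; `list_word_files` and `WORD_SUFFIXES` are hypothetical names, not part of the script above.

```python
from pathlib import Path

# Extensions treated as Word documents (compared in lower case)
WORD_SUFFIXES = {".doc", ".docx"}


def list_word_files(folder):
    """Return the Word documents in `folder`, skipping Word's '~$' lock files."""
    files = []
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue  # ignore sub-folders
        if path.name.startswith("~$"):
            continue  # temporary lock file created while a document is open
        if path.suffix.lower() in WORD_SUFFIXES:
            files.append(path)
    return sorted(files)
```

The returned paths can be passed to Documents.Open after converting them with str().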
Common pitfalls, Sister Hua will help you avoid them
The code is simple and not hard to write. The hard part is compatibility and the details.
1. The system must be Windows, and MS Office must be installed
Under the hood, this approach calls Word's features through COM, so it cannot work unless Word is installed.
2. Documents with macros or protection may fail to convert
Some documents pop up a macro or password prompt when opened. If a setting has to be changed by hand, the program cannot get around it.
3. Keep file names short, and avoid Chinese characters and spaces in the path
When the path is too unusual, Word sometimes fails to open or convert the file, so it is recommended to put the files in a folder with a plain English name.
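Pitfall 3 can be checked before the conversion even starts. Here is a small sketch that flags risky paths; `path_warnings` is a made-up helper, and the 260-character limit is an assumption based on the classic Windows path length limit, not something Word documents as its own threshold.

```python
def path_warnings(path, max_length=260):
    """Return a list of warnings for a path that may give Word trouble."""
    warnings = []
    if len(path) >= max_length:
        # Assumed threshold: the traditional Windows MAX_PATH of 260 characters
        warnings.append(f"path is {len(path)} characters long")
    if not path.isascii():
        warnings.append("path contains non-ASCII characters")
    if " " in path:
        warnings.append("path contains spaces")
    return warnings
```

An empty list means nothing suspicious was found; a non-empty list is only a hint to move the file, not proof that the conversion will fail.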
Adding some extras
- Automatically generate timestamp folder + output log
- Automatically obtain the Word file in the directory where the script is located (no manual path required for user input)
- Determine whether Office (Word) or WPS is installed in the computer, and automatically select the correct call method
- Package and sell
Generate timestamp folder
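import os
import datetime

def gen_output_folder():
    folder = os.path.dirname(os.path.abspath(__file__))
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    output_folder = os.path.join(folder, f"pdf_{timestamp}")
    os.makedirs(output_folder, exist_ok=True)
    return output_folder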
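The extras list also mentions an output log. One possible way to do it is with the standard logging module, writing a log file into the timestamp folder. This is a sketch only: `setup_log`, the logger name "word2pdf", and the file name convert.log are all choices made for this example.

```python
import logging
import os


def setup_log(output_folder, name="word2pdf"):
    """Create a logger that writes to convert.log inside the output folder."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    handler = logging.FileHandler(
        os.path.join(output_folder, "convert.log"), encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Inside the conversion loop, log.info(...) can record each success and log.error(...) each failure, so the buyer gets a record of what happened alongside the PDFs.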
Automatically obtain Word files in the current script directory
This is too simple:
import os

def get_word_files_from_current_folder():
    folder = os.path.dirname(os.path.abspath(__file__))
    word_files = []
    for file in os.listdir(folder):
        if file.endswith(".doc") or file.endswith(".docx"):
            word_files.append(os.path.join(folder, file))
    return word_files
Methods to detect Office and WPS
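We can try calling Dispatch() for each application to determine whether these two programs exist.
import win32com.client

def detect_office_or_wps():
    try:
        # Dispatch raises an exception if Word is not registered on this machine
        word = win32com.client.Dispatch("Word.Application")
        return "office"
    except Exception:
        try:
            wps = win32com.client.Dispatch("Kwps.Application")
            return "wps"
        except Exception:
            return None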
Automatically select the engine and convert in batches
import os
import win32com.client

def convert_word_to_pdf_auto(input_path, output_path, engine):
    if engine == "office":
        app = win32com.client.Dispatch("Word.Application")
    elif engine == "wps":
        app = win32com.client.Dispatch("Kwps.Application")
    else:
        print("❌ No available Office or WPS was detected")
        return
    app.Visible = False
    try:
        doc = app.Documents.Open(input_path)
        doc.SaveAs(output_path, FileFormat=17)
        doc.Close()
        print(f"✅ Converted: {input_path}")
    except Exception as e:
        print(f"❌ Conversion failed: {input_path}, reason: {e}")
    try:
        app.Quit()
    except Exception:
        print("⚠️ Quit is not supported in the current environment, skipping exit.")
Putting it all together: convert every Word file in the script's directory with one click
def batch_convert_here():
    engine = detect_office_or_wps()
    if not engine:
        print("😭 Neither Office nor WPS is installed, so nothing can be converted")
        return
    folder = os.path.dirname(os.path.abspath(__file__))
    word_files = get_word_files_from_current_folder()
    if not word_files:
        print("🤷‍♀️ No Word files were found in the current folder")
        return
    output_folder = os.path.join(folder, "pdf output")
    os.makedirs(output_folder, exist_ok=True)
    for word_file in word_files:
        filename = os.path.splitext(os.path.basename(word_file))[0]
        pdf_path = os.path.join(output_folder, f"{filename}.pdf")
        convert_word_to_pdf_auto(word_file, pdf_path, engine)
    print("🎉 All files are converted! The PDFs are in the 'pdf output' folder")
Running method (placed at the end of the script):
if __name__ == "__main__":
    batch_convert_here()
Build an EXE for non-technical users (pyinstaller)
As the last step, package our script into an .exe and list it on Xianyu to make some money (just kidding, mostly).
The command is a single line, followed by the file name of your script:
pyinstaller -F
Complete code
import os
import win32com.client
import sys
import datetime

def get_real_path():
    """Get the base path in both the development and the packaged environment"""
    if getattr(sys, 'frozen', False):
        base_dir = os.path.dirname(sys.executable)  # directory where the EXE file is located
    else:
        base_dir = os.path.dirname(os.path.abspath(__file__))
    return base_dir

# Generate timestamp folder
def gen_output_folder(folder):
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    output_folder = os.path.join(folder, f"pdf_{timestamp}")
    os.makedirs(output_folder, exist_ok=True)
    return output_folder

# Automatically get Word files in the given directory
def get_word_files_from_current_folder(folder):
    word_files = []
    for file in os.listdir(folder):
        if file.endswith(".doc") or file.endswith(".docx"):
            word_files.append(os.path.join(folder, file))
    return word_files

# Detect Office and WPS
def detect_office_or_wps():
    try:
        word = win32com.client.Dispatch("Word.Application")
        return "office"
    except Exception:
        try:
            wps = win32com.client.Dispatch("Kwps.Application")
            return "wps"
        except Exception:
            return None

# Automatically select the engine and convert
def convert_word_to_pdf_auto(input_path, output_path, engine):
    if engine == "office":
        app = win32com.client.Dispatch("Word.Application")
    elif engine == "wps":
        app = win32com.client.Dispatch("Kwps.Application")
    else:
        print("No available Office or WPS was detected")
        return
    app.Visible = False
    try:
        doc = app.Documents.Open(input_path)
        doc.SaveAs(output_path, FileFormat=17)
        doc.Close()
        print(f"Converted: {input_path}")
    except Exception as e:
        print(f"Conversion failed: {input_path}, reason: {e}")
    try:
        app.Quit()
    except Exception:
        print("Quit is not supported in the current environment, skipping exit.")

# Main function
def batch_convert_here():
    engine = detect_office_or_wps()
    if not engine:
        print("Neither Office nor WPS is installed, so nothing can be converted")
        return
    folder = get_real_path()
    word_files = get_word_files_from_current_folder(folder)
    if not word_files:
        print("No Word files were found in the current folder")
        return
    output_folder = gen_output_folder(folder)
    for word_file in word_files:
        filename = os.path.splitext(os.path.basename(word_file))[0]
        pdf_path = os.path.join(output_folder, f"{filename}.pdf")
        convert_word_to_pdf_auto(word_file, pdf_path, engine)
    # Show the real folder name instead of the literal text 'output_folder'
    print(f"All files are converted! The PDFs are in the '{output_folder}' folder")

if __name__ == "__main__":
    try:
        batch_convert_here()
        print("Press Enter to exit...")
        input()  # Wait for the user to press Enter
    except Exception as e:
        print(e)
        print("The program hit an error, press Enter to exit...")
        input()  # Wait for the user to press Enter
You may think: "Isn't this just a few dozen lines of code? Will anyone really buy it?"
I thought so at first too. Later I realized that many buyers on Xianyu don't understand the technology at all. What they care about is:
- Can it be done with one click?
- Is it too complicated to use?
- Does it save effort?
So writing the tool + providing instructions + packaging it: together these make a "product".
Sometimes we programmers seriously underestimate our own abilities. In fact, the scripts you write can really solve problems for a lot of people.