1. Introduction
PowerPoint (PPT) is an efficient tool for presenting information and is widely used in fields such as education, business, and design. PPT documents often contain rich picture content, which not only improves the visual effect but also makes information easier to convey. Once extracted, these images can be reused in other documents, brochures, websites, or social media content.
This article describes how to use Python to automatically extract pictures from PowerPoint (PPT or PPTX) files. It covers extracting PPT background pictures (slide backgrounds and slide master backgrounds), extracting pictures from slide shapes, extracting pictures from the entire PPT document, and extracting information about each picture, such as its coordinates, width, and height.
2. Environment and tools
Before extracting images from a PPT, make sure that Python is installed on your computer. If it is not, download and install it from the official Python website.
Once Python is installed, install the Python library used in this article, which is mainly used to create, manipulate, and convert PPT presentations. The installation steps are as follows:
- Open the terminal
- Enter the following command and press Enter:
pip install
3. Extracting PPT background pictures with Python
PowerPoint slides usually contain beautiful background images that may exist in a single slide or in a slide master (template). Extracting these background images is very useful for designers, educators, or users who need to reuse materials.
3.1 Extract slide background pictures
To extract the background image from a PPT slide, follow these steps:
- Initialize an instance of the Presentation class and load the PPT or PPTX file.
- Iterate over the slides in the file through the presentation's slide collection.
- Check the background fill type of each slide to determine whether it is a picture fill.
- If it is a picture fill, extract the background image and save it as an image file.
Implementation code:
    from import *
    import os

    def extract_background_images_from_slides(ppt_path, output_folder):
        """Extract background images from slides"""
        presentation = Presentation()
        (ppt_path)
        (output_folder, exist_ok=True)
        for i, slide in enumerate():
            bgd =
            if == :
                image_data =
                output_path = (output_folder, f"Slideshow background_{i}.png")
                image_data.(output_path)
        ()

    # Usage example
    extract_background_images_from_slides("Test.pptx", "picture")
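The picture-fill check in the steps above can also be approximated with the Python standard library alone. This is a different technique from the library-based code in this article: it relies on a .pptx file being a ZIP package of XML parts (Office Open XML), so it does not apply to legacy binary .ppt files. The function names are hypothetical, and the part-name pattern ppt/slides/slideN.xml is an assumption based on how PowerPoint usually names slide parts. A minimal sketch:

```python
import re
import xml.etree.ElementTree as ET
import zipfile

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
# Attribute that links a picture (a:blip) to an embedded image relationship.
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"


def background_picture_rel_id(slide_xml):
    """Return the relationship id of a picture-filled background, or None."""
    root = ET.fromstring(slide_xml)
    # A picture background is stored as p:cSld/p:bg/p:bgPr/a:blipFill/a:blip.
    blip = root.find("p:cSld/p:bg/p:bgPr/a:blipFill/a:blip", NS)
    return None if blip is None else blip.get(R_EMBED)


def slides_with_picture_background(pptx):
    """Map slide part name -> relationship id for slides with a picture background.

    `pptx` may be a file path or a binary file-like object.
    """
    result = {}
    with zipfile.ZipFile(pptx) as archive:
        for name in archive.namelist():
            # Assumption: slide parts follow the usual ppt/slides/slideN.xml naming.
            if re.fullmatch(r"ppt/slides/slide\d+\.xml", name):
                rel_id = background_picture_rel_id(archive.read(name))
                if rel_id is not None:
                    result[name] = rel_id
    return result
```

The returned relationship id still has to be resolved to an image file through the slide's .rels part before the picture can be saved.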
3.2 Extract slide master background pictures
The steps for extracting background images from the slide master are similar to those above, except that the collection being traversed is the presentation's slide master collection. The specific steps are as follows:
- Initialize an instance of the Presentation class and load the PPT or PPTX file.
- Iterate over the slide masters in the file through the presentation's slide master collection.
- Check the background fill type of each slide master to determine whether it is a picture fill.
- If it is a picture fill, extract the background image and save it as an image file.
Implementation code:
    from import *
    import os

    def extract_background_images_from_slide_masters(ppt_path, output_folder):
        """Extract background images from slide master"""
        presentation = Presentation()
        (ppt_path)
        (output_folder, exist_ok=True)
        for i, slide_master in enumerate():
            bgd = slide_master.SlideBackground
            if == :
                image_data =
                output_path = (output_folder, f"Slide master background_{i}.png")
                image_data.(output_path)
        ()

    # Usage example
    extract_background_images_from_slide_masters("Test.pptx", "picture")
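For comparison, the slide master steps can be sketched with only the standard library by reading the .pptx package directly. This is not the API of the library used in this article. It assumes a .pptx (ZIP-based) file, the usual ppt/slideMasters/slideMasterN.xml part naming, and a hypothetical function name and output naming scheme. Unlike the code above, it saves each image in its original format instead of converting it to PNG.

```python
import os
import posixpath
import re
import xml.etree.ElementTree as ET
import zipfile

P = "http://schemas.openxmlformats.org/presentationml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
# Location of a picture background inside a slide master part.
BG_BLIP = f"{{{P}}}cSld/{{{P}}}bg/{{{P}}}bgPr/{{{A}}}blipFill/{{{A}}}blip"


def extract_master_backgrounds(pptx, output_folder):
    """Save the picture background of every slide master; return the saved paths."""
    os.makedirs(output_folder, exist_ok=True)
    saved = []
    with zipfile.ZipFile(pptx) as archive:
        names = set(archive.namelist())
        for part in sorted(names):
            # Assumption: master parts follow the usual slideMasterN.xml naming.
            if not re.fullmatch(r"ppt/slideMasters/slideMaster\d+\.xml", part):
                continue
            blip = ET.fromstring(archive.read(part)).find(BG_BLIP)
            if blip is None:
                continue  # background is not a picture fill
            folder, filename = posixpath.split(part)
            rels_name = f"{folder}/_rels/{filename}.rels"
            if rels_name not in names:
                continue
            rels = ET.fromstring(archive.read(rels_name))
            for rel in rels.iter(f"{{{PKG_REL}}}Relationship"):
                if rel.get("Id") != blip.get(R_EMBED):
                    continue
                # Targets are usually relative to the part, e.g. ../media/image1.jpeg.
                media = posixpath.normpath(posixpath.join(folder, rel.get("Target")))
                media = media.lstrip("/")
                stem = os.path.splitext(filename)[0]
                out = os.path.join(output_folder, f"{stem}_{posixpath.basename(media)}")
                with open(out, "wb") as fh:
                    fh.write(archive.read(media))
                saved.append(out)
    return saved
```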
4. Extracting pictures from PPT slide shapes with Python
Pictures in a PPT slide may also exist as shape objects. The extraction steps are as follows:
- Initialize an instance of the Presentation class and load the PPT or PPTX file.
- Iterate over the slides in the file through the presentation's slide collection.
- Iterate over all shapes in each slide through the slide's shape collection.
- Determine whether the shape is a PictureShape or SlidePicture object.
- If it is a PictureShape or SlidePicture object, extract the image and save it as a picture file.
Implementation code:
    from import *
    import os

    def extract_images_from_shapes(ppt_path, output_folder):
        """Extract pictures from slide shapes"""
        presentation = Presentation()
        (ppt_path)
        (output_folder, exist_ok=True)
        img_count = 0
        for slide_index, slide in enumerate():
            for shape_index, shape in enumerate():
                if isinstance(shape, PictureShape):
                    image_data =
                elif isinstance(shape, SlidePicture):
                    image_data =
                else:
                    continue
                img_count += 1
                output_path = (output_folder, f"picture_{img_count}.png")
                image_data.(output_path)
        ()
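To see which image file each picture on a slide refers to, the shape traversal can be approximated by scanning the slide XML inside a .pptx package. The sketch below uses only the standard library and is not the API of the library used in this article. It treats every a:blip element under the slide's shape tree as a picture, which roughly covers both picture elements and picture-filled shapes; how that maps onto PictureShape and SlidePicture is an assumption. The function names and the slideN.xml naming pattern are likewise assumptions.

```python
import posixpath
import re
import xml.etree.ElementTree as ET
import zipfile

P = "http://schemas.openxmlformats.org/presentationml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def part_relationships(archive, part):
    """Map relationship id -> archive path for one part, e.g. a slide."""
    folder, filename = posixpath.split(part)
    rels_name = f"{folder}/_rels/{filename}.rels"
    if rels_name not in archive.namelist():
        return {}
    targets = {}
    for rel in ET.fromstring(archive.read(rels_name)).iter(f"{{{PKG_REL}}}Relationship"):
        if rel.get("TargetMode") == "External":
            continue  # linked (not embedded) resources have no archive entry
        path = posixpath.normpath(posixpath.join(folder, rel.get("Target")))
        targets[rel.get("Id")] = path.lstrip("/")
    return targets


def list_shape_pictures(pptx):
    """Return (slide part, media path) pairs for pictures in each slide's shape tree."""
    found = []
    with zipfile.ZipFile(pptx) as archive:
        slides = [n for n in archive.namelist()
                  if re.fullmatch(r"ppt/slides/slide\d+\.xml", n)]
        # Sorted by part name for determinism; this is not the presentation order.
        for part in sorted(slides):
            rels = part_relationships(archive, part)
            root = ET.fromstring(archive.read(part))
            shape_tree = root.find(f"{{{P}}}cSld/{{{P}}}spTree")
            if shape_tree is None:
                continue
            # Searching only the shape tree skips the slide background fill.
            for blip in shape_tree.iter(f"{{{A}}}blip"):
                media = rels.get(blip.get(R_EMBED))
                if media is not None:
                    found.append((part, media))
    return found
```

Each returned media path can then be read from the archive with zipfile.ZipFile.read and written to disk.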
5. Extracting picture information from PPT with Python (coordinates, width, height, etc.)
When performing PPT document analysis or automation, you may need to obtain specific information about the image, such as:
- Coordinates (relative to the upper left corner of the slide)
- Dimensions (the width and height of the picture in points)
This information can be extracted through the following steps:
- Initialize an instance of the Presentation class and load the PPT or PPTX file.
- Iterate over the slides in the file through the presentation's slide collection.
- Iterate over all shapes in each slide through the slide's shape collection.
- Determine whether the shape is a PictureShape or SlidePicture object.
- If it is a PictureShape or SlidePicture object, get the picture's X/Y coordinates, width, and height, along with the slide it belongs to.
Implementation code:
    from import *

    def extract_image_metadata(ppt_path):
        """Get information about the picture in PPT (slide, coordinate position, width and height, etc.)"""
        presentation = Presentation()
        (ppt_path)
        for slide_index, slide in enumerate():
            for shape_index, shape in enumerate():
                if isinstance(shape, PictureShape) or isinstance(shape, SlidePicture):
                    x =
                    y =
                    width =
                    height =
                    print(f"Slide {slide_index + 1}, shape {shape_index + 1}: X={x}, Y={y}, width={width}, height={height}")
        ()

    # Usage example
    extract_image_metadata("Test.pptx")
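In the .pptx file format itself, a picture's position and size are stored in English Metric Units (EMU), where 1 point equals 12,700 EMU and 1 inch equals 914,400 EMU. The sketch below shows how those values could be read from one slide's XML and converted to points using only the standard library. It is not the API of the library used in this article; the function name is hypothetical, and only p:pic elements that carry their own a:xfrm geometry are reported.

```python
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
EMU_PER_POINT = 12700  # 914400 EMU per inch / 72 points per inch


def picture_geometry(slide_xml):
    """Return name, position, and size (in points) of each p:pic in a slide."""
    root = ET.fromstring(slide_xml)
    results = []
    for pic in root.iter("{%s}pic" % NS["p"]):
        off = pic.find("p:spPr/a:xfrm/a:off", NS)  # top-left corner (x, y)
        ext = pic.find("p:spPr/a:xfrm/a:ext", NS)  # extent (cx, cy)
        if off is None or ext is None:
            continue  # no explicit geometry, e.g. inherited from a layout placeholder
        props = pic.find("p:nvPicPr/p:cNvPr", NS)
        results.append({
            "name": None if props is None else props.get("name"),
            "x": int(off.get("x")) / EMU_PER_POINT,
            "y": int(off.get("y")) / EMU_PER_POINT,
            "width": int(ext.get("cx")) / EMU_PER_POINT,
            "height": int(ext.get("cy")) / EMU_PER_POINT,
        })
    return results
```

The slide XML can be obtained with, for example, zipfile.ZipFile("Test.pptx").read("ppt/slides/slide1.xml"), assuming the package contains a part with that name.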
6. Extracting pictures from the entire PPT document with Python
If you want to extract every picture in a PPT document, you can traverse the presentation's image collection. The specific steps are as follows:
- Initialize an instance of the Presentation class and load the PPT or PPTX file.
- Iterate over the pictures in the PPT document through the presentation's image collection.
- Extract each image and save it as an image file.
Implementation code:
    from import *
    import os

    def extract_images_from_presentation(ppt_path, output_folder):
        """Extract pictures from the entire PPT document"""
        presentation = Presentation()
        (ppt_path)
        (output_folder, exist_ok=True)
        for i, image in enumerate():
            output_path = (output_folder, f"picture_{i}.png")
            (output_path)
        ()

    # Usage example
    extract_images_from_presentation("Test.pptx", "picture")
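For .pptx files specifically, a library-free alternative exists because the file is a ZIP package whose embedded media normally live under ppt/media/. The sketch below copies those entries out with the standard zipfile module. It is a different technique from the library-based code above: it does not work for legacy .ppt files, it keeps each image in its original format, and the function name and extension list are assumptions chosen for illustration.

```python
import os
import zipfile

# Assumption: extensions treated as images; ppt/media/ can also hold audio and video.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp",
                    ".tif", ".tiff", ".emf", ".wmf", ".svg"}


def extract_media_images(pptx, output_folder):
    """Copy every image under ppt/media/ to output_folder; return the saved paths."""
    os.makedirs(output_folder, exist_ok=True)
    saved = []
    with zipfile.ZipFile(pptx) as archive:
        for name in archive.namelist():
            extension = os.path.splitext(name)[1].lower()
            if not name.startswith("ppt/media/") or extension not in IMAGE_EXTENSIONS:
                continue
            # Use only the base name so archive paths cannot escape output_folder.
            target = os.path.join(output_folder, os.path.basename(name))
            with open(target, "wb") as fh:
                fh.write(archive.read(name))
            saved.append(target)
    return saved
```

A call such as extract_media_images("Test.pptx", "picture") would mirror the usage example above.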
That covers how to extract pictures and picture information from PPT documents using Python.