SoFunction
Updated on 2025-04-23

Use Python to extract pictures and image information from PPT documents (such as coordinates, width and height, etc.)

1. Introduction

PPT is an efficient presentation tool that is widely used in fields such as education, business, and design. PPT documents often contain rich picture content, which not only improves the visual effect but also makes information easier to convey. Once extracted from a PPT, these images can be reused in other documents, brochures, websites, or social media content.

This article describes how to use Python to automatically extract pictures from PowerPoint (PPT or PPTX) files. It covers extracting PPT background pictures (slide backgrounds and slide master backgrounds), extracting pictures from slide shapes, extracting pictures from the entire PPT document, and retrieving information about each picture, such as its coordinates, width, and height.

2. Environment and tools

Before extracting images from a PPT, make sure that Python is installed on your computer. If it is not, go to the official Python website to download and install it.

After Python is installed, you also need to install the Python library used in this article, which is mainly used to generate, manipulate, and convert PPT presentations. The installation steps are as follows:

  • Open the terminal
  • Enter the following command and press Enter:
pip install 

3. Python extracts PPT background pictures

PowerPoint slides usually contain beautiful background images that may exist in a single slide or in a slide master (template). Extracting these background images is very useful for designers, educators, or users who need to reuse materials.

3.1 Extract slide background pictures

To extract background images from PPT slides, follow these steps:

  • Create an instance of the Presentation class and use it to load the PPT or PPTX file.
  • Iterate over the slides in the file through the presentation's slide collection.
    • Check whether the background fill type of each slide is a picture fill.
      • If it is, extract the background image and save it as an image file.

Implementation code:

from  import *
import os
 
def extract_background_images_from_slides(ppt_path, output_folder):
    """Extract background images from slides"""
    presentation = Presentation()
    (ppt_path)
    os.makedirs(output_folder, exist_ok=True)
 
    for i, slide in enumerate():
        bgd = slide.SlideBackground
        if  == :
            image_data = 
            output_path = os.path.join(output_folder, f"Slide background_{i}.png")
            image_data.(output_path)
 
    ()
 
# Usage example
extract_background_images_from_slides("Test.pptx", "picture")
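To illustrate what "picture fill" means at the file level, here is a minimal sketch that uses only the Python standard library instead of the library used in this article. It assumes the Office Open XML layout of a PPTX slide part, in which the background is stored in a `p:bg` element and a picture fill appears as an `a:blipFill` whose `a:blip` carries an `r:embed` relationship ID. The function name `background_picture_rel_id` and the sample XML are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by PresentationML slide parts.
NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"


def background_picture_rel_id(slide_xml):
    """Return the relationship ID of a picture-filled background, or None."""
    root = ET.fromstring(slide_xml)
    # A picture background is stored as p:cSld/p:bg/p:bgPr/a:blipFill/a:blip.
    blip = root.find("p:cSld/p:bg/p:bgPr/a:blipFill/a:blip", NS)
    if blip is None:
        return None  # no explicit background, or not a picture fill
    return blip.get(R_EMBED)


# Hypothetical, heavily trimmed slide XML used only for this demonstration.
SAMPLE_SLIDE = (
    '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
    'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" '
    'xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">'
    '<p:cSld><p:bg><p:bgPr><a:blipFill><a:blip r:embed="rId2"/></a:blipFill>'
    '</p:bgPr></p:bg></p:cSld></p:sld>'
)

print(background_picture_rel_id(SAMPLE_SLIDE))  # rId2
```

The same check should apply to a slide master part, since its root element also contains a `p:cSld` child. The returned ID still has to be resolved through the part's relationships file to find the actual image.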

3.2 Extract slide master background pictures

The steps for extracting background images from the slide master are similar to the steps above, except that you iterate over the slide master collection instead of the slide collection. The specific steps are as follows:

  • Create an instance of the Presentation class and use it to load the PPT or PPTX file.
  • Iterate over the slide masters in the file through the presentation's slide master collection.
    • Check whether the background fill type of each slide master is a picture fill.
      • If it is, extract the background image and save it as an image file.

Implementation code:

from  import *
import os
 
def extract_background_images_from_slide_masters(ppt_path, output_folder):
    """Extract background images from slide master"""
    presentation = Presentation()
    (ppt_path)
    os.makedirs(output_folder, exist_ok=True)
 
    for i, slide_master in enumerate():
        bgd = slide_master.SlideBackground
        if  == :
            image_data = 
            output_path = os.path.join(output_folder, f"Slide master background_{i}.png")
            image_data.(output_path)
 
    ()
 
# Usage example
extract_background_images_from_slide_masters("Test.pptx", "picture")

4. Python extracts pictures from PPT slide shapes

Pictures in a PPT slide may also exist as shape objects. The extraction steps are as follows:

  • Create an instance of the Presentation class and use it to load the PPT or PPTX file.
  • Iterate over the slides in the file through the presentation's slide collection.
  • Iterate over all shapes in each slide through the slide's shape collection.
  • Determine whether each shape is a PictureShape or SlidePicture object.
    • If it is, extract the image and save it as a picture file.

Implementation code:

from  import *
import os
 
def extract_images_from_shapes(ppt_path, output_folder):
    """Extract pictures from slide shapes"""
    presentation = Presentation()
    (ppt_path)
    os.makedirs(output_folder, exist_ok=True)
 
    img_count = 0
 
    for slide_index, slide in enumerate():
        for shape_index, shape in enumerate():
            if isinstance(shape, PictureShape):
                image_data = 
            elif isinstance(shape, SlidePicture):
                image_data = 
            else:
                continue
 
            img_count += 1
            output_path = os.path.join(output_folder, f"picture_{img_count}.png")
            image_data.(output_path)
 
    ()
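As a rough illustration of how a picture shape points at its image data inside a PPTX file, the sketch below uses only the standard library and is not the approach of the library used in this article. It assumes the Office Open XML layout: a `p:pic` element references its image through an `r:embed` ID, and the slide's relationships file maps that ID to a target path relative to the slide part. The function name `picture_targets`, its `slide_part` default, and the sample XML are hypothetical. Shapes that merely use a picture as their fill are not covered here.

```python
import posixpath
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def picture_targets(slide_xml, rels_xml, slide_part="ppt/slides/slide1.xml"):
    """Return archive paths of the images referenced by p:pic shapes on one slide."""
    # Map each relationship ID to its (relative) target path.
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(rels_xml).iter(REL_NS + "Relationship")
    }
    base = posixpath.dirname(slide_part)
    paths = []
    for blip in ET.fromstring(slide_xml).iterfind(".//p:pic/p:blipFill/a:blip", NS):
        target = targets.get(blip.get(R_EMBED))
        if target is not None:
            # Targets are relative to the slide part, e.g. "../media/image1.png".
            paths.append(posixpath.normpath(posixpath.join(base, target)))
    return paths


# Hypothetical, heavily trimmed parts used only for this demonstration.
SLIDE = (
    '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
    'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" '
    'xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">'
    '<p:cSld><p:spTree><p:pic><p:blipFill><a:blip r:embed="rId2"/></p:blipFill>'
    '</p:pic></p:spTree></p:cSld></p:sld>'
)
RELS = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId2" Target="../media/image1.png"/></Relationships>'
)

print(picture_targets(SLIDE, RELS))  # ['ppt/media/image1.png']
```

Pictures that are linked rather than embedded carry no `r:embed` attribute, so this sketch skips them.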

5. Python extracts image information in PPT (such as coordinates, width and height, etc.)

When performing PPT document analysis or automation, you may need to obtain specific information about the image, such as:

  • Coordinates (relative to the upper left corner of the slide)
  • Dimensions (the width and height of the picture in points)

This information can be extracted through the following steps:

  • Create an instance of the Presentation class and use it to load the PPT or PPTX file.
  • Iterate over the slides in the file through the presentation's slide collection.
  • Iterate over all shapes in each slide through the slide's shape collection.
  • Determine whether each shape is a PictureShape or SlidePicture object.
    • If it is, get the picture's X/Y coordinates, width, and height, along with the slide it is on.

Implementation code:

from  import *
 
def extract_image_metadata(ppt_path):
    """Get information about the picture in PPT (slide, coordinate position, width and height, etc.)"""
    presentation = Presentation()
    (ppt_path)
 
    for slide_index, slide in enumerate():
        for shape_index, shape in enumerate():
            if isinstance(shape, PictureShape) or isinstance(shape, SlidePicture):
                x = 
                y = 
                width = 
                height = 
                print(f"Slide {slide_index + 1}, shape {shape_index + 1}: X={x}, Y={y}, width={width}, height={height}")
 
    ()
 
# Usage example
extract_image_metadata("Test.pptx")
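To make the unit concrete: a point is 1/72 inch, and PPTX files store geometry in English Metric Units (EMU), with 914400 EMU per inch, so one point equals 12700 EMU. The following is a minimal standard-library sketch, not the method of the library used in this article, that reads the offset and extent of each `p:pic` element from slide XML and converts them to points. The function name `picture_frames` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
EMU_PER_POINT = 12700  # 914400 EMU per inch / 72 points per inch


def picture_frames(slide_xml):
    """Return (x, y, width, height) in points for each p:pic on a slide."""
    frames = []
    for pic in ET.fromstring(slide_xml).iterfind(".//p:pic", NS):
        off = pic.find("p:spPr/a:xfrm/a:off", NS)  # top-left corner
        ext = pic.find("p:spPr/a:xfrm/a:ext", NS)  # width and height
        if off is None or ext is None:
            continue  # no explicit geometry on this shape; skipped in this sketch
        frames.append((
            int(off.get("x")) / EMU_PER_POINT,
            int(off.get("y")) / EMU_PER_POINT,
            int(ext.get("cx")) / EMU_PER_POINT,
            int(ext.get("cy")) / EMU_PER_POINT,
        ))
    return frames


# Hypothetical, heavily trimmed slide XML used only for this demonstration.
SAMPLE_SLIDE = (
    '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
    'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">'
    '<p:cSld><p:spTree><p:pic><p:spPr><a:xfrm>'
    '<a:off x="914400" y="457200"/><a:ext cx="1828800" cy="1371600"/>'
    '</a:xfrm></p:spPr></p:pic></p:spTree></p:cSld></p:sld>'
)

print(picture_frames(SAMPLE_SLIDE))  # [(72.0, 36.0, 144.0, 108.0)]
```

Pictures nested inside a group shape are positioned in the group's own coordinate space, so their values would need a further transformation that this sketch does not attempt.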

6. Python extracts pictures from the entire PPT document

If you want to extract pictures from the entire PPT document, you can iterate over the presentation's image collection. The specific steps are as follows:

  • Create an instance of the Presentation class and use it to load the PPT or PPTX file.
  • Iterate over the pictures in the PPT document through the presentation's image collection.
    • Extract each image and save it as an image file.

Implementation code:

from  import *
import os
 
def extract_images_from_presentation(ppt_path, output_folder):
    """Extract pictures from the entire PPT document"""
    presentation = Presentation()
    (ppt_path)
    os.makedirs(output_folder, exist_ok=True)
 
    for i, image in enumerate():
        output_path = os.path.join(output_folder, f"picture_{i}.png")
        (output_path)
 
    ()
 
# Usage example
extract_images_from_presentation("Test.pptx", "picture")
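For comparison, a PPTX file is a ZIP archive, and PowerPoint conventionally stores embedded media under `ppt/media/`. The sketch below is a standard-library alternative, not the library approach shown above, that copies those files out unchanged. It assumes a `.pptx` file (it does not work for the legacy binary `.ppt` format), and the function name `extract_media_from_pptx` is hypothetical.

```python
import os
import zipfile


def extract_media_from_pptx(pptx_path, output_folder):
    """Copy every file under ppt/media/ out of a .pptx archive."""
    os.makedirs(output_folder, exist_ok=True)
    saved = []
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            # Skip everything outside the media folder, and directory entries.
            if not name.startswith("ppt/media/") or name.endswith("/"):
                continue
            target = os.path.join(output_folder, os.path.basename(name))
            with archive.open(name) as src, open(target, "wb") as dst:
                dst.write(src.read())
            saved.append(target)
    return saved
```

A call such as `extract_media_from_pptx("Test.pptx", "picture")` returns the list of saved paths. Files keep their original names and formats, and the media folder may also contain audio or video, so filter by extension if only images are wanted.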

The above is all about extracting pictures and picture information from PPT using Python.
