object has no attribute 'ExtractImages'

Fri Jul 12, 2024 11:22 am

I installed Spire.PDF 10.7.1 for Python on Colab with:

Code: !pip install Spire.PDF

Next I tried to run the sample code under "Extract All the Images from a PDF Document" in this tutorial:

Python-Extract-Text-and-Images-from-PDF-Documents.html

I get the following error on the inner loop:

AttributeError: 'PdfPageBase' object has no attribute 'ExtractImages'

Any suggestions?

Mon Jul 15, 2024 3:44 am

Hello,

Thanks for your inquiry.
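Sorry, the image-extraction code in that tutorial is outdated: it calls ExtractImages on the page object, which the current release no longer provides. We will update the tutorial as soon as possible. In the meantime, please test with the latest code below, which uses PdfImageHelper instead. If you have any other questions, please feel free to write back.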

Code:

from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load the PDF document
doc.LoadFromFile("in.pdf")

# Create a PdfImageHelper object
image_helper = PdfImageHelper()

image_count = 1

# Iterate through the pages in the document
for i in range(doc.Pages.Count):
    # Get the image information from the current page
    images_info = image_helper.GetImagesInfo(doc.Pages[i])

    # Get the images and save them as image files
    for j in range(len(images_info)):
        image_info = images_info[j]
        output_file = f"image{image_count}.png"
        image_info.Image.Save(output_file)
        image_count += 1

doc.Close()
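If the same notebook may run against both older and newer Spire.PDF builds, one possible approach is a small wrapper that checks which API the page object exposes. This is only a sketch: get_page_images is a hypothetical helper name, not part of Spire.PDF, and it assumes that the older page.ExtractImages() call returns an iterable of image objects, as the outdated tutorial suggests.

```python
def get_page_images(page, image_helper=None):
    """Return the image objects found on a page as a list.

    Hypothetical helper, not part of Spire.PDF.
    Assumption: older builds expose page.ExtractImages(), which
    returns an iterable of images; newer builds expect a
    PdfImageHelper whose GetImagesInfo(page) returns items that
    carry the image in an .Image attribute.
    """
    # Older API: the page object extracts its own images.
    if hasattr(page, "ExtractImages"):
        return list(page.ExtractImages())

    # Newer API: a helper object is required.
    if image_helper is None:
        raise AttributeError(
            "page has no ExtractImages; pass a PdfImageHelper instance"
        )

    # Unwrap the image from each image-info entry.
    return [info.Image for info in image_helper.GetImagesInfo(page)]
```

Inside the page loop it could be called as get_page_images(doc.Pages[i], image_helper), and each returned image saved with the same Save call that version of the library expects.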

Sincerely,
William
E-iceblue support team