Spire.PDF is a professional PDF library for creating, writing, editing, handling, and reading PDF files without any external dependencies. Get free, professional technical support for Spire.PDF for .NET, Java, Android, C++, and Python.

Fri Jul 12, 2024 11:22 am

I installed Spire.PDF 10.7.1 for Python on Colab with:
Code:
!pip install Spire.PDF


Next, I tried to run the sample code under "Extract All the Images from a PDF Document" in this tutorial:

Python-Extract-Text-and-Images-from-PDF-Documents.html

I get the following error on the inner loop:

AttributeError: 'PdfPageBase' object has no attribute 'ExtractImages'
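
A quick way to narrow down an error like this is to print the installed package version and probe the object for the method before calling it. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `has_method` are hypothetical, and it assumes the distribution name matches the `Spire.PDF` used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_method(obj, name):
    """Return True if obj exposes a callable attribute with the given name."""
    return callable(getattr(obj, name, None))


# Assumption: the distribution is registered under the name used with pip.
print(installed_version("Spire.PDF"))
```

With a loaded document, `has_method(doc.Pages[0], "ExtractImages")` would then show whether the page object in the installed build still offers the method the tutorial calls.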


Any suggestions?

stanvandeburgt
 
Posts: 1
Joined: Fri Jul 12, 2024 11:12 am

Mon Jul 15, 2024 3:44 am

Hello,

Thanks for your inquiry.
Sorry, the image-extraction method shown on the website is outdated, and we will update the tutorial as soon as possible. Please test with the latest code below. If you have any other questions, feel free to write back.
Code:
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()
# Load the PDF document
doc.LoadFromFile("in.pdf")
# Create a PdfImageHelper object
image_helper = PdfImageHelper()
image_count = 1
# Iterate through the pages in the document
for i in range(doc.Pages.Count):
    # Get the image information from the current page
    images_info = image_helper.GetImagesInfo(doc.Pages[i])
    # Get the images and save them as image files
    for j in range(len(images_info)):
        image_info = images_info[j]
        output_file = f"image{image_count}.png"
        image_info.Image.Save(output_file)
        image_count += 1
doc.Close()
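
For scripts that may run against either an older or a newer build, the two approaches in this thread could be wrapped in one helper. This is only a sketch, not an official API: `get_page_images` is a hypothetical name, and it assumes that older builds expose `page.ExtractImages()` returning an iterable of images (as the outdated tutorial used it), while newer builds require a `PdfImageHelper` as in the code above.

```python
def get_page_images(page, helper=None):
    """Return a list of image objects found on `page`.

    Assumption: older builds offer page.ExtractImages(); newer builds need
    a PdfImageHelper whose GetImagesInfo(page) results carry an .Image.
    """
    legacy = getattr(page, "ExtractImages", None)
    if callable(legacy):
        # Legacy path: the page object extracts its own images.
        return list(legacy())
    if helper is None:
        raise ValueError("page has no ExtractImages; pass a PdfImageHelper()")
    # Current path: index into the helper's result, as in the reply above.
    infos = helper.GetImagesInfo(page)
    return [infos[j].Image for j in range(len(infos))]


# Usage with Spire.PDF installed (not executed here):
#   helper = PdfImageHelper()
#   for i in range(doc.Pages.Count):
#       for image in get_page_images(doc.Pages[i], helper):
#           image.Save(f"image{image_count}.png")
#           image_count += 1
```

Probing with `getattr` rather than catching `AttributeError` keeps an unrelated attribute error inside the library from being mistaken for a missing method.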

Sincerely,
William
E-iceblue support team

William.Zhang
 
Posts: 419
Joined: Mon Dec 27, 2021 2:23 am
