Python: Convert PDF to PDF/A and Vice Versa

PDF/A is an ISO-standardized subset of PDF (ISO 19005) designed for long-term archiving and preservation of electronic documents. It aims to keep the content, structure, and visual appearance of a document reproducible over time, for example by requiring fonts to be embedded and by prohibiting encryption. Converting PDF files to PDF/A helps keep them accessible regardless of software, operating system, or future technological changes. Conversely, converting PDF/A files to standard PDF makes them easier to edit, share, and collaborate on across different applications, devices, and platforms. This article explains how to convert PDF to PDF/A and vice versa in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.PDF

If you are unsure how to install it, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
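
After running the pip command, it can be useful to confirm that both packages are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and it assumes the distributions are registered under the names "Spire.PDF" and "plum-dispatch" as in the text above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are taken from the install instructions (assumption).
    for name in ("Spire.PDF", "plum-dispatch"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", the pip command most likely ran against a different Python environment than the one executing the script.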

Convert PDF to PDF/A in Python

The PdfStandardsConverter class provided by Spire.PDF for Python supports converting PDF to various PDF/A formats, including PDF/A-1a, 2a, 3a, 1b, 2b and 3b. Moreover, it also supports converting PDF to PDF/X-1a:2001. The detailed steps are as follows.

  • Specify the input file path and output folder.
  • Create a PdfStandardsConverter object and pass the input file path to the constructor of the class as a parameter.
  • Convert the input file to a PDF/A-1a file using the PdfStandardsConverter.ToPdfA1A() method.
  • Convert the input file to a PDF/A-1b file using the PdfStandardsConverter.ToPdfA1B() method.
  • Convert the input file to a PDF/A-2a file using the PdfStandardsConverter.ToPdfA2A() method.
  • Convert the input file to a PDF/A-2b file using the PdfStandardsConverter.ToPdfA2B() method.
  • Convert the input file to a PDF/A-3a file using the PdfStandardsConverter.ToPdfA3A() method.
  • Convert the input file to a PDF/A-3b file using the PdfStandardsConverter.ToPdfA3B() method.
  • Convert the input file to a PDF/X-1a:2001 file using the PdfStandardsConverter.ToPdfX1A2001() method.

Python:
from spire.pdf.common import *
from spire.pdf import *

# Specify the input file path and output folder
inputFile = "Sample.pdf"
outputFolder = "Output/"

# Create an object of the PdfStandardsConverter class
converter = PdfStandardsConverter(inputFile)

# Convert the input file to PdfA1A
converter.ToPdfA1A(outputFolder + "ToPdfA1A.pdf")

# Convert the input file to PdfA1B
converter.ToPdfA1B(outputFolder + "ToPdfA1B.pdf")

# Convert the input file to PdfA2A
converter.ToPdfA2A(outputFolder + "ToPdfA2A.pdf")

# Convert the input file to PdfA2B
converter.ToPdfA2B(outputFolder + "ToPdfA2B.pdf")

# Convert the input file to PdfA3A
converter.ToPdfA3A(outputFolder + "ToPdfA3A.pdf")

# Convert the input file to PdfA3B
converter.ToPdfA3B(outputFolder + "ToPdfA3B.pdf")

# Convert the input file to PDF/X-1a:2001
converter.ToPdfX1A2001(outputFolder + "ToPdfX1a.pdf")


Convert PDF/A to PDF in Python

To convert a PDF/A file back to a standard PDF format, you need to create a new standard PDF file, and then draw the page content of the PDF/A file to the newly created PDF file. The detailed steps are as follows.

  • Create a PdfDocument object.
  • Load a PDF/A file using the PdfDocument.LoadFromFile() method.
  • Create a PdfNewDocument object and set its compression level to none.
  • Loop through the pages in the original PDF/A file.
  • Add pages to the newly created PDF using the PdfDocumentBase.Pages.Add() method.
  • Draw the page content of the original PDF/A file onto the corresponding pages of the newly created PDF using the PdfPageBase.CreateTemplate().Draw() method.
  • Create a Stream object and then save the new PDF to the stream using the PdfNewDocument.Save() method.

Python:
from spire.pdf.common import *
from spire.pdf import *

# Specify the input and output file paths
inputFile = "Output/ToPdfA1A.pdf"
outputFile = "PdfAToPdf.pdf"

# Create an object of the PdfDocument class
doc = PdfDocument()
# Load a PDF file
doc.LoadFromFile(inputFile)

# Create a new standard PDF file
newDoc = PdfNewDocument()
newDoc.CompressionLevel = PdfCompressionLevel.none

# Add pages to the newly created PDF and draw the page content of the loaded PDF onto the corresponding pages of the newly created PDF
for i in range(doc.Pages.Count):
    page = doc.Pages.get_Item(i)
    size = page.Size
    p = newDoc.Pages.Add(size, PdfMargins(0.0))
    page.CreateTemplate().Draw(p, 0.0, 0.0)

# Save the new PDF to a file through a stream
fileStream = Stream(outputFile)
newDoc.Save(fileStream)

# Release the stream and both documents
fileStream.Close()
newDoc.Close(True)
doc.Close()
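
To see whether a file declares PDF/A conformance, before or after either conversion, one rough option is to look for the PDF/A identification entries (pdfaid:part and pdfaid:conformance) in its XMP metadata. The sketch below is a heuristic, not a validator: it only reports what the file claims, it assumes the XMP packet is stored uncompressed and in a single-byte encoding, and the function names are hypothetical. The source does not say whether the redrawn file produced above still carries this marker, so treat the result as informational.

```python
import re

# XMP may store the PDF/A identification as attributes (pdfaid:part="1")
# or as elements (<pdfaid:part>1</pdfaid:part>); \x22 and \x27 are the
# double and single quote characters.
_PART = re.compile(rb"pdfaid:part\s*(?:=\s*[\x22\x27]|>)\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance\s*(?:=\s*[\x22\x27]|>)\s*([A-Za-z])")


def pdfa_claim(data):
    """Return e.g. 'PDF/A-1a' if the bytes declare PDF/A conformance, else None."""
    part = _PART.search(data)
    if part is None:
        return None
    conf = _CONF.search(data)
    level = conf.group(1).decode("ascii").lower() if conf else ""
    return "PDF/A-" + part.group(1).decode("ascii") + level


def pdfa_claim_of_file(path):
    """Read a file from disk and report its declared PDF/A level, if any."""
    with open(path, "rb") as handle:
        return pdfa_claim(handle.read())
```

For example, pdfa_claim_of_file("Output/ToPdfA1A.pdf") and pdfa_claim_of_file("PdfAToPdf.pdf") can be compared after running both listings. For a real conformance check, a dedicated PDF/A validator is the better tool.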


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to lift the function limitations, please request a 30-day trial license.