Python Merge PDF Files with Simple Code

2024-01-05 08:56:05

Merging PDF is the integration of multiple PDF files into a single PDF file. It allows users to combine the contents of multiple related PDF files into a single PDF file to better categorize, manage, and share files. For example, before sharing a document, similar documents can be merged into one file to simplify the sharing process. This post will show you how to use Python to merge PDF files with simple code.

Python Library for Merging PDF Files

Spire.PDF for Python is a powerful Python library for creating and manipulating PDF files. With it, you are also able to use Python to merge PDF files effortlessly.  Before that, we need to install Spire.PDF for Python and plum-dispatch v1.7.4, which can be easily installed in VS Code using the following pip commands.

pip install Spire.PDF

This article covers more details of the installation: How to Install Spire.PDF for Python in VS Code

Merge PDF Files in Python

This method supports directly merging multiple PDF files into a single file.

Steps

  • Import the required library modules.
  • Create a list containing the paths of PDF files to be merged.
  • Use the Document.MergeFiles(inputFiles: List[str]) method to merge these PDFs into a single PDF.
  • Call the PdfDocumentBase.Save(filename: str, FileFormat.PDF) method to save the merged file in PDF format to the specified output path and release resources.

Sample Code

  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a list of the PDF file paths
inputFile1 = "C:/Users/Administrator/Desktop/PDFs/Sample-1.pdf"
inputFile2 = "C:/Users/Administrator/Desktop/PDFs/Sample-2.pdf"
inputFile3 = "C:/Users/Administrator/Desktop/PDFs/Sample-3.pdf"
files = [inputFile1, inputFile2, inputFile3]

# Merge the PDF documents
pdf = PdfDocument.MergeFiles(files)

# Save the result document
pdf.Save("C:/Users/Administrator/Desktop/MergePDF.pdf", FileFormat.PDF)
pdf.Close()

Python Merge PDF Files with Simple Code

Merge PDF Files by Cloning Pages in Python

Unlike the above method, this method merges multiple PDF files by copying document pages and inserting them into a new file.

Steps

  • Import the required library modules.
  • Create a list containing the paths of PDF files to be merged.
  • Loop through each file in the list and load it as a PdfDocument object; then add them to a new list.
  • Create a new PdfDocument object as the destination file.
  • Iterate through the PdfDocument objects in the list and append their pages to the new PdfDocument object.
  • Finally, call the PdfDocument.SaveToFile() method to save the new PdfDocument object to the specified output path.

Sample Code

  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create a list of the PDF file paths
file1 = "C:/Users/Administrator/Desktop/PDFs/Sample-1.pdf"
file2 = "C:/Users/Administrator/Desktop/PDFs/Sample-2.pdf"
file3 = "C:/Users/Administrator/Desktop/PDFs/Sample-3.pdf"
files = [file1, file2, file3]

# Load each PDF file as an PdfDocument object and add them to a list
pdfs = []
for file in files:
    pdfs.append(PdfDocument(file))

# Create an object of PdfDocument class
newPdf = PdfDocument()

# Insert the pages of the loaded PDF documents into the new PDF document
for pdf in pdfs:
    newPdf.AppendPage(pdf)

# Save the new PDF document    
newPdf.SaveToFile("C:/Users/Administrator/Desktop/ClonePage.pdf")

Python Merge PDF Files with Simple Code

Merge Selected Pages of PDF Files in Python

This method is similar to merging PDFs by cloning pages, and you can specify the desired pages when merging.

Steps

  • Import the required library modules.
  • Create a list containing the paths of PDF files to be merged.
  • Loop through each file in the list and load it as a PdfDocument object; then add them to a new list.
  • Create a new PdfDocument object as the destination file.
  • Insert the selected pages from the loaded files into the new PdfDocument object using PdfDocument.InsertPage(PdfDocument, pageIndex: int) method or PdfDocument.InsertPageRange(PdfDocument, startIndex: int, endIndex: int) method.
  • Finally, call the PdfDocument.SaveToFile() method to save the new PdfDocument object to the specified output path.

Sample Code

  • Python
from spire.pdf import *
from spire.pdf.common import *

# Create a list of the PDF file paths
file1 = "C:/Users/Administrator/Desktop/PDFs/Sample-1.pdf"
file2 = "C:/Users/Administrator/Desktop/PDFs/Sample-2.pdf"
file3 = "C:/Users/Administrator/Desktop/PDFs/Sample-3.pdf"
files = [file1, file2, file3]

# Load each PDF file as an PdfDocument object and add them to a list
pdfs = []
for file in files:
    pdfs.append(PdfDocument(file))

# Create an object of PdfDocument class
newPdf = PdfDocument()

# Insert the selected pages from the loaded PDF documents into the new document
newPdf.InsertPage(pdfs[0], 0)
newPdf.InsertPage(pdfs[1], 1)
newPdf.InsertPageRange(pdfs[2], 0, 1)

# Save the new PDF document
newPdf.SaveToFile("C:/Users/Administrator/Desktop/SelectedPages.pdf")

Python Merge PDF Files with Simple Code

Get a Free License for the Library to Merge PDF in Python

You can get a free 30-day temporary license of Spire.PDF for Python to merge PDF files in Python without evaluation limitations.

Conclusion

In this article, you have learned how to merge PDF files in Python. Spire.PDF for Python provides two different ways to merge multiple PDF files, including merging files directly and copying pages. Also, you can merge selected pages of multiple PDF files based on the second method. In a word, this library simplifies the process and allows developers to focus on building powerful applications that involve PDF manipulation tasks.

See Also