Python: Convert HTML to PDF

HTML is the standard markup language for creating web pages, while PDF is a widely accepted format for sharing and preserving documents with consistent formatting across different platforms. By converting HTML to PDF, you can create printable documents, share web content offline, or generate reports with ease. In this article, you will learn how to convert a HTML file or a HTML string to PDF in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Convert an HTML File to PDF in Python

The Document.LoadFromFile method supports loading not only Doc or Docx files, but also HTML files. You can load an HTML file using this method and save it as a PDF file using the Document.SaveToFile() method. The following are the steps to convert an HTML file to PDF in Python.

  • Create a Document object.
  • Load an HTML file using Document.LoadFromFile() method.
  • Convert it to PDF using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load an HTML file 
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.html", FileFormat.Html, XHTMLValidationType.none)

# Save the HTML file to a pdf file
document.SaveToFile("output/ToPdf.pdf", FileFormat.PDF)
document.Close()

Python: Convert HTML to PDF

Convert an HTML String to PDF in Python

To render uncomplicated HTML strings (usually text and its formatting) on Word pages, you can use the Paragraph.AppendHTML() method. The Word document can be then saved as a PDF file using the Document.SaveToFile() method. The following are the steps to convert an HTML string to PDF in Python.

  • Create a Document object.
  • Add a section using Document.AddSection() method.
  • Add a paragraph using Section.AddParagraph() method.
  • Specify the HTML string.
  • Add the HTML string to the paragraph using Paragraph.AppendHTML() method.
  • Save the document as a PDF file using Document.SaveToFile() method.
  • Python
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Add a section to the document
sec = document.AddSection()

# Add a paragraph to the section
paragraph = sec.AddParagraph()

# Specify the HTML string
htmlString = """
<html>
<head>
    <title>HTML to Word Example</title>
    <style>
        body {
            font-family: Arial, sans-serif;
        }
        h1 {
            color: #FF5733;
            font-size: 24px;
            margin-bottom: 20px;
        }
        p {
            color: #333333;
            font-size: 16px;
            margin-bottom: 10px;
        }
        ul {
            list-style-type: disc;
            margin-left: 20px;
            margin-bottom: 15px;
        }
        li {
            font-size: 14px;
            margin-bottom: 5px;
        }
        table {
            border-collapse: collapse;
            width: 100%;
            margin-bottom: 20px;
        }
        th, td {
            border: 1px solid #CCCCCC;
            padding: 8px;
            text-align: left;
        }
        th {
            background-color: #F2F2F2;
            font-weight: bold;
        }
        td {
            color: #0000FF;
        }
    </style>
</head>
<body>
    <h1>This is a Heading</h1>
    <p>This is a paragraph.</p>
    <p>Here's an unordered list:</p>
    <ul>
        <li>Item 1</li>
        <li>Item 2</li>
        <li>Item 3</li>
    </ul>
    <p>And here's a table:</p>
    <table>
        <tr>
            <th>Name</th>
            <th>Age</th>
            <th>Gender</th>
        </tr>
        <tr>
            <td>John Smith</td>
            <td>35</td>
            <td>Male</td>
        </tr>
        <tr>
            <td>Jenny Garcia</td>
            <td>27</td>
            <td>Female</td>
        </tr>
    </table>
</body>
</html>
"""

# Append the HTML string to the paragraph
paragraph.AppendHTML(htmlString)

# Save the document as a pdf file
document.SaveToFile("output/HtmlStringToPdf.pdf", FileFormat.PDF)
document.Close()

Python: Convert HTML to PDF

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.