HTML is the standard markup language for creating web pages, while PDF is a widely accepted format for sharing and preserving documents with consistent formatting across different platforms. By converting HTML to PDF, you can create printable documents, share web content offline, or generate reports with ease. In this article, you will learn how to convert a HTML file or a HTML string to PDF in Python using Spire.Doc for Python.
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.Doc
If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows
Convert an HTML File to PDF in Python
The Document.LoadFromFile method supports loading not only Doc or Docx files, but also HTML files. You can load an HTML file using this method and save it as a PDF file using the Document.SaveToFile() method. The following are the steps to convert an HTML file to PDF in Python.
- Create a Document object.
- Load an HTML file using Document.LoadFromFile() method.
- Convert it to PDF using Document.SaveToFile() method.
- Python
from spire.doc import * from spire.doc.common import * # Create a Document object document = Document() # Load an HTML file document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.html", FileFormat.Html, XHTMLValidationType.none) # Save the HTML file to a pdf file document.SaveToFile("output/ToPdf.pdf", FileFormat.PDF) document.Close()
Convert an HTML String to PDF in Python
To render uncomplicated HTML strings (usually text and its formatting) on Word pages, you can use the Paragraph.AppendHTML() method. The Word document can be then saved as a PDF file using the Document.SaveToFile() method. The following are the steps to convert an HTML string to PDF in Python.
- Create a Document object.
- Add a section using Document.AddSection() method.
- Add a paragraph using Section.AddParagraph() method.
- Specify the HTML string.
- Add the HTML string to the paragraph using Paragraph.AppendHTML() method.
- Save the document as a PDF file using Document.SaveToFile() method.
- Python
from spire.doc import * from spire.doc.common import * # Create a Document object document = Document() # Add a section to the document sec = document.AddSection() # Add a paragraph to the section paragraph = sec.AddParagraph() # Specify the HTML string htmlString = """ <html> <head> <title>HTML to Word Example</title> <style> body { font-family: Arial, sans-serif; } h1 { color: #FF5733; font-size: 24px; margin-bottom: 20px; } p { color: #333333; font-size: 16px; margin-bottom: 10px; } ul { list-style-type: disc; margin-left: 20px; margin-bottom: 15px; } li { font-size: 14px; margin-bottom: 5px; } table { border-collapse: collapse; width: 100%; margin-bottom: 20px; } th, td { border: 1px solid #CCCCCC; padding: 8px; text-align: left; } th { background-color: #F2F2F2; font-weight: bold; } td { color: #0000FF; } </style> </head> <body> <h1>This is a Heading</h1> <p>This is a paragraph.</p> <p>Here's an unordered list:</p> <ul> <li>Item 1</li> <li>Item 2</li> <li>Item 3</li> </ul> <p>And here's a table:</p> <table> <tr> <th>Name</th> <th>Age</th> <th>Gender</th> </tr> <tr> <td>John Smith</td> <td>35</td> <td>Male</td> </tr> <tr> <td>Jenny Garcia</td> <td>27</td> <td>Female</td> </tr> </table> </body> </html> """ # Append the HTML string to the paragraph paragraph.AppendHTML(htmlString) # Save the document as a pdf file document.SaveToFile("output/HtmlStringToPdf.pdf", FileFormat.PDF) document.Close()
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.