C#: Convert HTML to PDF, XPS and XML

HTML is the standard format for web pages and online content. However, you will often need to convert HTML documents into other file formats, such as PDF, XPS, and XML: to generate a printable version of a web page, to share HTML content in a fixed-layout format, or to extract data from HTML for further processing. In this article, we will demonstrate how to convert HTML to PDF, XPS, and XML in C# using Spire.Doc for .NET.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Convert HTML to PDF in C#

Converting HTML to PDF offers several advantages, including enhanced portability, consistent formatting, and easy sharing. PDF files retain the original layout, styling, and visual elements of the HTML content, ensuring that the document appears the same across different devices and platforms.

You can use the Document.SaveToFile(string filename, FileFormat.PDF) method to convert an HTML file to PDF format. The detailed steps are as follows.

  • Create an instance of the Document class.
  • Load an HTML file using the Document.LoadFromFile() method.
  • Save the HTML file to PDF format using the Document.SaveToFile(string filename, FileFormat.PDF) method.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;

namespace ConvertHtmlToPdf
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);

            // Convert the HTML file to PDF format
            doc.SaveToFile("HtmlToPDF.pdf", FileFormat.PDF);
            doc.Close();
        }
    }
}
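
The same steps can be repeated for every HTML file in a folder. The following is a minimal sketch, not part of the Spire.Doc API: the folder names HtmlFiles and PdfFiles and the GetPdfPath helper are assumptions made for illustration, and only the Document calls shown in the example above are used.

```csharp
using System;
using System.IO;
using Spire.Doc;
using Spire.Doc.Documents;

namespace BatchConvertHtmlToPdf
{
    internal static class Program
    {
        // Hypothetical helper: builds "<outputDir>/<file name>.pdf" for an input HTML path
        internal static string GetPdfPath(string htmlPath, string outputDir)
        {
            string name = Path.GetFileNameWithoutExtension(htmlPath) + ".pdf";
            return Path.Combine(outputDir, name);
        }

        static void Main(string[] args)
        {
            // Assumed folder names; adjust them to your project layout
            string inputDir = "HtmlFiles";
            string outputDir = "PdfFiles";

            if (!Directory.Exists(inputDir))
            {
                Console.WriteLine("Input folder not found: " + inputDir);
                return;
            }
            Directory.CreateDirectory(outputDir);

            foreach (string htmlPath in Directory.GetFiles(inputDir, "*.html"))
            {
                // Use a fresh Document for each file and close it even if the conversion fails
                Document doc = new Document();
                try
                {
                    doc.LoadFromFile(htmlPath, FileFormat.Html, XHTMLValidationType.None);
                    doc.SaveToFile(GetPdfPath(htmlPath, outputDir), FileFormat.PDF);
                }
                finally
                {
                    doc.Close();
                }
            }
        }
    }
}
```

Creating one Document per file keeps the conversions independent, so a failure in one file does not affect the state used for the next.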

Convert HTML String to PDF in C#

In addition to converting HTML files to PDF, you can also convert HTML strings to PDF. Spire.Doc for .NET provides the Paragraph.AppendHTML() method to add an HTML string to a Word document. Once the HTML string has been added, you can convert the resulting document to PDF using the Document.SaveToFile(string filename, FileFormat.PDF) method. The detailed steps are as follows.

  • Create an instance of the Document class.
  • Add a paragraph to the document using the Document.AddSection().AddParagraph() method.
  • Append an HTML string to the paragraph using the Paragraph.AppendHTML() method.
  • Save the document to PDF format using the Document.SaveToFile(string filename, FileFormat.PDF) method.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;

namespace ConvertHtmlStringToPdf
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Add a paragraph to the document
            Paragraph para = doc.AddSection().AddParagraph();
            // Specify the HTML string
            string htmlString = @"<h1>This is a Heading</h1>
                                  <p>This is a paragraph.</p>
                                  <ul>
                                    <li>Item 1</li>
                                    <li>Item 2</li>
                                    <li>Item 3</li>
                                  </ul>";

            // Append the HTML string to the paragraph
            para.AppendHTML(htmlString);

            // Convert the document to PDF format
            doc.SaveToFile("HtmlStringToPDF.pdf", FileFormat.PDF);
            doc.Close();
        }
    }
}
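
In practice the HTML string is often assembled from application data rather than written as a literal. The sketch below shows one way to build such a string with the .NET base class library only; the HtmlSnippet class and its BuildList method are hypothetical names introduced for this illustration. Text values are passed through WebUtility.HtmlEncode so that characters such as & and < do not break the markup.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

namespace BuildHtmlString
{
    public static class HtmlSnippet
    {
        // Hypothetical helper: returns an <h1> heading followed by a <ul> list
        public static string BuildList(string heading, IEnumerable<string> items)
        {
            var sb = new StringBuilder();
            sb.Append("<h1>").Append(WebUtility.HtmlEncode(heading)).Append("</h1>");
            sb.Append("<ul>");
            foreach (string item in items)
            {
                // Encode each value so that special characters stay literal text
                sb.Append("<li>").Append(WebUtility.HtmlEncode(item)).Append("</li>");
            }
            sb.Append("</ul>");
            return sb.ToString();
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            string html = HtmlSnippet.BuildList("Price List", new[] { "Tea & Coffee", "Juice" });
            // Prints: <h1>Price List</h1><ul><li>Tea &amp; Coffee</li><li>Juice</li></ul>
            Console.WriteLine(html);
        }
    }
}
```

The returned string can then be passed to para.AppendHTML() in place of the literal htmlString used in the example above.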

Convert HTML to XPS in C#

XPS, or XML Paper Specification, is a fixed-layout alternative to PDF that offers similar advantages. Converting HTML to XPS preserves the document's layout, fonts, and images. XPS files are optimized for printing and can be opened with Windows' built-in XPS Viewer or other XPS-compatible viewers.

By using the Document.SaveToFile(string filename, FileFormat.XPS) method, you can convert HTML files to XPS format with ease. The detailed steps are as follows.

  • Create an instance of the Document class.
  • Load an HTML file using the Document.LoadFromFile() method.
  • Save the HTML file to XPS format using the Document.SaveToFile(string filename, FileFormat.XPS) method.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;

namespace ConvertHtmlToXps
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);

            // Convert the HTML file to XPS format
            doc.SaveToFile("HtmlToXPS.xps", FileFormat.XPS);
            doc.Close();
        }
    }
}
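
If you need both a PDF and an XPS copy of the same page, the HTML file may only need to be loaded once. The sketch below combines the two examples above; it assumes that a loaded Document can be saved more than once, which this article does not state explicitly, so verify the output in your own environment.

```csharp
using Spire.Doc;
using Spire.Doc.Documents;

namespace ConvertHtmlToPdfAndXps
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class and load the HTML file once
            Document doc = new Document();
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);

            // Assumption: the same loaded document can be saved to several formats in turn
            doc.SaveToFile("HtmlToPDF.pdf", FileFormat.PDF);
            doc.SaveToFile("HtmlToXPS.xps", FileFormat.XPS);
            doc.Close();
        }
    }
}
```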

Convert HTML to XML in C#

Converting HTML to XML unlocks the potential for data extraction, manipulation, and integration with other systems. XML is a flexible and extensible markup language that allows for structured representation of data. By converting HTML to XML, you can extract specific elements, organize data hierarchically, and perform data analysis or integration tasks using XML processing tools and techniques.

To convert HTML files to XML format, you can use the Document.SaveToFile(string filename, FileFormat.Xml) method. The detailed steps are as follows.

  • Create an instance of the Document class.
  • Load an HTML file using the Document.LoadFromFile() method.
  • Save the HTML file to XML format using the Document.SaveToFile(string filename, FileFormat.Xml) method.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;

namespace ConvertHtmlToXml
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of the Document class
            Document doc = new Document();
            // Load an HTML file
            doc.LoadFromFile("Sample.html", FileFormat.Html, XHTMLValidationType.None);

            // Convert the HTML file to XML format
            doc.SaveToFile("HtmlToXML.xml", FileFormat.Xml);
            doc.Close();
        }
    }
}
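
Once the XML file exists, it can be processed with standard .NET XML tools. The sketch below uses LINQ to XML to count how often each element name occurs, which is a quick way to get an overview of the file before writing more specific queries. The XmlInspector class and its CountElements method are hypothetical names, and because this article does not describe the schema of the generated XML, the code makes no assumption about particular element names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace InspectXmlOutput
{
    public static class XmlInspector
    {
        // Hypothetical helper: counts how often each element local name occurs
        public static Dictionary<string, int> CountElements(XDocument xml)
        {
            return xml.Descendants()
                      .GroupBy(e => e.Name.LocalName)
                      .ToDictionary(g => g.Key, g => g.Count());
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            // "HtmlToXML.xml" is the file produced by the conversion example above
            XDocument xml = XDocument.Load("HtmlToXML.xml");

            // List element names, most frequent first
            foreach (var pair in XmlInspector.CountElements(xml).OrderByDescending(p => p.Value))
            {
                Console.WriteLine($"{pair.Key}: {pair.Value}");
            }
        }
    }
}
```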

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to lift the feature limitations, please request a 30-day trial license.