Convert Word from/to HTML with Embedded Image
Converting Word documents to HTML is a common task for programmers and developers. With Spire.Doc for .NET, a professional Word component, developers can convert Word to HTML in C# with only two lines of key code, without installing MS Word. Spire.Doc also supports converting HTML to a Word document easily and quickly.
This article again focuses on converting Word to and from HTML, but it is mainly about support for embedded images in the Word document and the HTML. Starting from Spire.Doc V. 4.9.32, the ImageEmbedded option is supported.
Please download Spire.Doc (version 4.9.32 or above) together with the .NET Framework and follow the simple steps below:
Convert Word to HTML in C#:
Step 1: Create the Word document.
Document doc = new Document();
Step 2: Set the value of the ImageEmbedded property.
doc.HtmlExportOptions.ImageEmbedded = true;
Step 3: Save the Word document to HTML.
doc.SaveToFile("result.html", FileFormat.Html);
Spire.Doc can also load the resulting HTML page and convert it into a Word document in only three lines of code, as shown below.
doc.SaveToFile("htmltoword.docx", FileFormat.Docx);
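The steps above do not show how a file is loaded, so the following is only a sketch of how the full round trip might look. It assumes that the two-argument LoadFromFile(path, FileFormat) overload also accepts FileFormat.Html, and the input file name sample.docx is a placeholder.

```csharp
using Spire.Doc;

class WordHtmlRoundTrip
{
    static void Main()
    {
        // Word -> HTML with the images embedded in the page.
        // "sample.docx" is a placeholder input file.
        Document doc = new Document();
        doc.LoadFromFile("sample.docx", FileFormat.Docx);
        doc.HtmlExportOptions.ImageEmbedded = true;
        doc.SaveToFile("result.html", FileFormat.Html);
        doc.Close();

        // HTML -> Word.
        // Assumption: this LoadFromFile overload accepts FileFormat.Html.
        Document html = new Document();
        html.LoadFromFile("result.html", FileFormat.Html);
        html.SaveToFile("htmltoword.docx", FileFormat.Docx);
        html.Close();
    }
}
```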
Besides converting Word to and from HTML, Spire.Doc also supports converting Word to PDF, Word to image and Word to XPS in C#.
Convert Multipage Image to PDF in C#
Spire.PDF can convert images to PDF quickly and effectively. This .NET PDF library converts not only images in commonly used formats such as jpg, bmp and png, but also gif, tif and ico images to PDF. Just download it here.
To convert a multipage image to a PDF file with Spire.PDF, copy the following code into your application and call the method ConvertImagetoPDF.
Step 1: Method to split multipage image
Spire.Pdf has a method called DrawImage for converting an image to PDF, but it cannot handle a multipage image directly. So before conversion, the multipage image needs to be split into several one-page images.
Guid guid = image.FrameDimensionsList[0];
FrameDimension dimension = new FrameDimension(guid);
int pageCount = image.GetFrameCount(dimension);
This step gets the total number of frames (pages) in the multipage image.
image.SelectActiveFrame(dimension, i);
This step selects one of the frames within the image object.
image.Save(buffer, format);
This saves the selected frame to the buffer.
Step 2: Convert image to PDF
After the multipage image is split, Spire.Pdf can draw the resulting images directly to PDF using the DrawImage method.
PdfImage pdfImg = PdfImage.FromImage(img[i]);
Load the image as a PdfImage.
page.Canvas.DrawImage(pdfImg, x, 0, width, height);
Draw the PdfImage to the PDF page. The only thing left to do is to specify where the image goes on the page. width and height give the size of the area the image is drawn on; sometimes the original image needs to be scaled up or down until it fits the PDF page. x and 0 are the coordinates at which the image is placed.
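The scaling mentioned above can be computed instead of hard-coding a factor. The following is a minimal sketch in plain C# with no Spire.PDF calls; the ImageFit class and its Fit method are hypothetical names introduced for illustration. It shrinks the image only when it is larger than the available area, keeps the aspect ratio, and centers the result horizontally.

```csharp
using System;

public static class ImageFit
{
    // Returns the drawing size and the horizontal offset for an image of
    // imageWidth x imageHeight placed on an area of areaWidth x areaHeight.
    public static (float X, float Width, float Height) Fit(
        float imageWidth, float imageHeight, float areaWidth, float areaHeight)
    {
        if (imageWidth <= 0 || imageHeight <= 0)
            throw new ArgumentException("Image dimensions must be positive.");

        // Use the smaller ratio so both sides fit; never scale above 1.
        float scale = Math.Min(1f, Math.Min(areaWidth / imageWidth, areaHeight / imageHeight));

        float width = imageWidth * scale;
        float height = imageHeight * scale;
        float x = (areaWidth - width) / 2f; // center horizontally

        return (x, width, height);
    }
}
```

In the full demo, a call such as ImageFit.Fit(pdfImg.Width, pdfImg.Height, page.Canvas.ClientSize.Width, page.Canvas.ClientSize.Height) could replace the fixed 0.3f factor, with the returned X, Width and Height passed to DrawImage.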
Screenshot of the original TIF file:
The target PDF file:
Full demo:
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

namespace ConvertMultipageImagetoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            ConvertImagetoPDF(@"..\..\Chapter1.tif");
        }

        public static void ConvertImagetoPDF(String ImageFilename)
        {
            using (PdfDocument pdfDoc = new PdfDocument())
            {
                Image image = Image.FromFile(ImageFilename);
                // Split the multipage image into one image per frame
                Image[] img = SplitImages(image, ImageFormat.Png);
                for (int i = 0; i < img.Length; i++)
                {
                    PdfImage pdfImg = PdfImage.FromImage(img[i]);
                    PdfPageBase page = pdfDoc.Pages.Add();
                    // Scale the image down and center it horizontally
                    float width = pdfImg.Width * 0.3f;
                    float height = pdfImg.Height * 0.3f;
                    float x = (page.Canvas.ClientSize.Width - width) / 2;
                    page.Canvas.DrawImage(pdfImg, x, 0, width, height);
                }
                string PdfFilename = "result.pdf";
                pdfDoc.SaveToFile(PdfFilename);
                System.Diagnostics.Process.Start(PdfFilename);
            }
        }

        public static Image[] SplitImages(Image image, ImageFormat format)
        {
            Guid guid = image.FrameDimensionsList[0];
            FrameDimension dimension = new FrameDimension(guid);
            int pageCount = image.GetFrameCount(dimension);
            Image[] frames = new Image[pageCount];
            for (int i = 0; i < pageCount; i++)
            {
                // GDI+ needs the stream to stay open for the lifetime of the
                // image created from it, so the buffer is not disposed here.
                MemoryStream buffer = new MemoryStream();
                image.SelectActiveFrame(dimension, i);
                image.Save(buffer, format);
                frames[i] = Image.FromStream(buffer);
            }
            return frames;
        }
    }
}
How to Convert Word to Emf in C#
This article introduces an easy way to convert Word to Emf using Spire.Doc, a powerful and independent Word .NET component, without Microsoft Word installed on the machine. Spire.Doc also supports converting Word and HTML to frequently used image formats such as Jpeg, Png, Gif, Bmp and Tiff. Just click here to have a try.
Emf is the file extension for Enhanced MetaFile, which the Windows operating system uses as a graphics language for printer drivers. In 1993, the 32-bit version of Win32/GDI introduced the Enhanced Metafile (Emf), a newer version of the format with additional commands. Microsoft also recommends using the enhanced-format (Emf) functions instead of the rarely used Windows-format (WMF) functions.
Spire.Doc offers a simple way to convert Word to Emf with the following five lines of code.
using Spire.Doc;
using System.Drawing.Imaging;

namespace DOCEMF
{
    class Program
    {
        static void Main(string[] args)
        {
            // create an instance of Spire.Doc.Document
            Document doc = new Document();
            // load the file based on a specified file name
            doc.LoadFromFile(@"../../Original Word.docx", FileFormat.Docx);
            // convert the first page of the document to an image
            System.Drawing.Image image = doc.SaveToImages(0, Spire.Doc.Documents.ImageType.Metafile);
            // save the image to an Emf file
            image.Save(@"../../Convert Word to Image.emf", ImageFormat.Emf);
            // close the document
            doc.Close();
        }
    }
}
Check the resulting screenshot below:
Convert HTML to PDF with New Plugin
Converting HTML to PDF is an important feature of our C# PDF component, and we keep working to improve Spire.PDF and make it more convenient for developers to use. Besides the existing method of converting HTML to PDF offered by Spire.PDF, there is now a new plugin for HTML to PDF conversion. This section focuses on that plugin. With it, Spire.PDF can convert HTML pages with rich elements, such as HTTPS, CSS3, HTML5 and JavaScript.
You need to download Spire.PDF and install it on your system, then add Spire.PDF.dll from the downloaded Bin folder as a reference, through the path '..\Spire.PDF\Bin\NET4.0\Spire.PDF.dll'. The new plugin is available directly from the download files: windows-x86.zip, windows-x64.zip, macosx_x64.zip, linux_x64.tar.gz.
On Windows, unzip the converter plugin package and copy the 'plugins' folder into the same folder as Spire.Pdf.dll. Before you use the QT plugin for converting HTML to PDF, please ensure that Microsoft Visual C++ 2015 Redistributable is installed on your computer.
On Mac and Linux, copy the archive onto the system and unzip the converter plugin package there so that the plugins can be used.
Sample projects are available for download: HtmlToPdf.zip for C# and HtmlToPdfVB.zip for VB.NET.
Calling the plugin is very simple. Please check the C# code below, which converts HTML to PDF.
using System.Drawing;
using Spire.Pdf.Graphics;
using Spire.Pdf.HtmlConverter.Qt;

namespace SPIREPDF_HTMLtoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            HtmlConverter.Convert("http://www.wikipedia.org/",
                "HTMLtoPDF.pdf",
                // enable javascript
                true,
                // load timeout
                100 * 1000,
                // page size
                new SizeF(612, 792),
                // page margins
                new PdfMargins(0, 0));
            System.Diagnostics.Process.Start("HTMLtoPDF.pdf");
        }
    }
}

Imports System.Drawing
Imports Spire.Pdf.Graphics
Imports Spire.Pdf.HtmlConverter.Qt

Module Module1
    Sub Main()
        HtmlConverter.Convert("http://www.wikipedia.org/", "HTMLtoPDF.pdf", True, 100 * 1000, New SizeF(612, 792), New PdfMargins(0, 0))
        System.Diagnostics.Process.Start("HTMLtoPDF.pdf")
    End Sub
End Module
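The page size in the samples above, 612 x 792, matches US Letter (8.5 x 11 inches) if the values are read as PDF points at 72 points per inch, which is an assumption here. Under that assumption, other page sizes can be derived with a little arithmetic. The PageSizes class below is a hypothetical helper and not part of Spire.PDF.

```csharp
using System;

public static class PageSizes
{
    private const float PointsPerInch = 72f;
    private const float MillimetersPerInch = 25.4f;

    // Converts a page size given in inches to points.
    public static (float Width, float Height) FromInches(float widthIn, float heightIn)
        => (widthIn * PointsPerInch, heightIn * PointsPerInch);

    // Converts a page size given in millimeters to points.
    public static (float Width, float Height) FromMillimeters(float widthMm, float heightMm)
        => (widthMm / MillimetersPerInch * PointsPerInch,
            heightMm / MillimetersPerInch * PointsPerInch);
}
```

For A4 (210 x 297 mm) this gives roughly 595 x 842 points, which could then be passed to the converter as new SizeF(a4.Width, a4.Height).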
Please check the screenshot of the result below:
The following sample uses the new plugin to convert an HTML string to PDF.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Spire.Pdf;
using System.IO;
using Spire.Pdf.HtmlConverter;
using System.Drawing;

namespace HTMLToPDFwithPlugins
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = @"<strong>This is a test for converting HTML string to PDF </strong> <ul><li>Spire.PDF supports to convert HTML in URL into PDF</li> <li>Spire.PDF supports to convert HTML string into PDF</li> <li>With the new plugin</li></ul>";
            string outputFile = "ToPDF.pdf";

            Spire.Pdf.HtmlConverter.Qt.HtmlConverter.Convert(input,
                outputFile,
                // enable javascript
                true,
                // load timeout
                10 * 1000,
                // page size
                new SizeF(612, 792),
                // page margins
                new Spire.Pdf.Graphics.PdfMargins(0),
                // load from content type
                LoadHtmlType.SourceCode
                );
            System.Diagnostics.Process.Start(outputFile);
        }
    }
}
Screenshot of the result:
How to Export Data into XML in C#?
This article introduces how to export data into Office OpenXML in only two steps with a .NET component. Spire.DataExport is a pure .NET component suited for exporting data into MS Excel, MS Word, HTML, Office OpenXML, PDF, MS Access, DBF, RTF, SQL Script, SYLK, DIF, CSV and MS Clipboard format. It offers high performance when exporting data from Command, ListView and DataTable components, which helps you save time and money.
Please download Spire.DataExport for .NET, add Spire.DataExport.dll as a reference and set the target framework to .NET 4. Many developers also download another C# Excel component, Spire.XLS for .NET, together with it.
Step 1: Function to fill data in datatable
In this step, the data that Spire.DataExport will export is loaded into the datatable. After setting up the data source and SQL command, we can even preview and modify the data through DataGridView before exporting.
private void Form1_Load(object sender, EventArgs e)
{
    oleDbConnection1.ConnectionString = txtConnectString.Text;
    oleDbCommand1.CommandText = txtCommandText.Text;
    using (OleDbDataAdapter da = new OleDbDataAdapter())
    {
        da.SelectCommand = oleDbCommand1;
        da.SelectCommand.Connection = oleDbConnection1;
        DataTable dt = new DataTable();
        da.Fill(dt);
        dataGridView1.DataSource = dt;
    }
}
Check the Screenshot below:
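To try the export without a database connection, a DataTable can also be filled in code. The following is a minimal sketch that uses only System.Data; the SampleData class, the table name, the column names and the rows are made-up sample data.

```csharp
using System.Data;

public static class SampleData
{
    // Builds a small in-memory table that can stand in for the query result.
    public static DataTable Create()
    {
        DataTable table = new DataTable("Parts");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Price", typeof(decimal));

        table.Rows.Add(1, "Bolt", 0.25m);
        table.Rows.Add(2, "Nut", 0.10m);
        table.Rows.Add(3, "Washer", 0.05m);

        return table;
    }
}
```

Assigning SampleData.Create() to dataGridView1.DataSource in Form1_Load, in place of the database query, would let the rest of the walkthrough run unchanged.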
Step 2: Export Data to Office OpenXML
The code below shows how to export data from the datatable to Office OpenXML. Spire.DataExport creates a new Office OpenXML file and exports the data shown in the DataGridView into it. You can also rename the generated file in this step.
private void btnExportToXml_Click(object sender, EventArgs e)
{
    Spire.DataExport.XML.XMLExport xmlExport = new Spire.DataExport.XML.XMLExport();
    xmlExport.DataSource = Spire.DataExport.Common.ExportSource.DataTable;
    xmlExport.DataTable = this.dataGridView1.DataSource as DataTable;
    xmlExport.ActionAfterExport = Spire.DataExport.Common.ActionType.OpenView;
    xmlExport.FileName = @"..\..\ToXml.xml";
    xmlExport.SaveToFile();
}
Check the Screenshot below:
How to Export Data into MS Access in C#?
This article gives a clear introduction to exporting data to MS Access in C# via a .NET data export component. Spire.DataExport for .NET is designed to help developers perform data export tasks. With Spire.DataExport, the whole export process is quick and needs only two simple steps.
Please download Spire.DataExport for .NET and install it on your system, then add Spire.DataExport.dll from the downloaded Bin folder as a reference, through the path: “…\Spire.DataExport\Bin\NET4.0\Spire.DataExport.dll”.
Step 1: Load Data Information
In this step, the data that Spire.DataExport will export is loaded from the database. Through DataGridView, we can even preview and modify the data.
private void Form1_Load(object sender, EventArgs e)
{
    oleDbConnection1.ConnectionString = txtConnectString.Text;
    oleDbCommand1.CommandText = txtCommandText.Text;
    using (OleDbDataAdapter da = new OleDbDataAdapter())
    {
        da.SelectCommand = oleDbCommand1;
        da.SelectCommand.Connection = oleDbConnection1;
        DataTable dt = new DataTable();
        da.Fill(dt);
        dataGridView1.DataSource = dt;
    }
}
Please check the screenshot:
Step 2: Export Data to MS Access
Here we set the export format to Access. Spire.DataExport creates a new Access file and exports the data shown in the DataGridView into it. You can name the file as you like.
private void btnExportToAccess_Click(object sender, EventArgs e)
{
    Spire.DataExport.Access.AccessExport accessExport = new Spire.DataExport.Access.AccessExport();
    accessExport.DataSource = Spire.DataExport.Common.ExportSource.DataTable;
    accessExport.DataTable = this.dataGridView1.DataSource as DataTable;
    accessExport.DatabaseName = @"..\..\ToMdb.mdb";
    accessExport.TableName = "ExportFromDatatable";
    accessExport.SaveToFile();
}
Here is the result:
How to Export Data from Datatable to Word Document in C#?
Exporting a datatable or dataset to a Word file is a frequent requirement at work, and this article aims to help you meet it. Spire.DataExport can easily load data from the datatable and create a new Word file to store the data. In addition, Spire.DataExport (or Spire.Office) can export data into MS Excel, HTML, XML, PDF, MS Access, DBF, SQL Script, SYLK, DIF, CSV and MS Clipboard format as well; it can be downloaded here. The following code shows how Spire.DataExport works.
Step 1: Function to fill data in datatable
In this step, the data that Spire.DataExport will export is loaded into the datatable. After setting up the data source and SQL command, you can preview and edit the data in the DataGridView component before exporting.
private void Form1_Load(object sender, EventArgs e)
{
    oleDbConnection1.ConnectionString = txtConnectString.Text;
    oleDbCommand1.CommandText = txtCommandText.Text;
    using (OleDbDataAdapter da = new OleDbDataAdapter())
    {
        da.SelectCommand = oleDbCommand1;
        da.SelectCommand.Connection = oleDbConnection1;
        DataTable dt = new DataTable();
        da.Fill(dt);
        dataGridView1.DataSource = dt;
    }
}
Result screenshot
Step 2: Export Data to Word document
The code below shows how to export data from the datatable to a Word file. Spire.DataExport creates a new MS Word file to store the exported data. You can also rename the generated Word file in this step.
private void btnExportToWord_Click(object sender, EventArgs e)
{
    Spire.DataExport.RTF.RTFExport rtfExport = new Spire.DataExport.RTF.RTFExport();
    rtfExport.DataSource = Spire.DataExport.Common.ExportSource.DataTable;
    rtfExport.DataTable = this.dataGridView1.DataSource as DataTable;
    rtfExport.ActionAfterExport = Spire.DataExport.Common.ActionType.OpenView;
    RTFStyle rtfStyle = new RTFStyle();
    rtfStyle.FontColor = Color.Blue;
    rtfStyle.BackgroundColor = Color.LightGreen;
    rtfExport.RTFOptions.DataStyle = rtfStyle;
    rtfExport.FileName = @"..\..\ToWord.doc";
    rtfExport.SaveToFile();
}
Result screenshot
C#: Extract Images from PDF Documents
Extracting images from PDFs is a common task for many users, whether it's for repurposing visuals in a presentation, archiving important graphics, or facilitating easier analysis. By mastering image extraction using C#, developers can enhance resource management and streamline their workflow.
In this article, you will learn how to extract images from individual PDF pages as well as from entire documents using C# and Spire.PDF for .NET.
Install Spire.PDF for .NET
To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.PDF
Extract Images from a Specific PDF Page
The PdfImageHelper class in Spire.PDF for .NET is designed to help users manage images within PDF documents. It allows for various operations, such as deleting, replacing, and retrieving images.
To obtain information about the images on a specific PDF page, developers can utilize the PdfImageHelper.GetImagesInfo(PdfPageBase page) method. Once they have the image information, they can save the images to files using the PdfImageInfo.Image.Save() method.
The steps to extract images from a specific PDF page are as follows:
- Create a PdfDocument object.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specific page using PdfDocument.Pages[index] property.
- Create a PdfImageHelper object.
- Get the image information collection from the page using PdfImageHelper.GetImagesInfo() method.
- Iterate through the image information collection and save each instance as a PNG file using PdfImageInfo.Image.Save() method.
using Spire.Pdf;
using Spire.Pdf.Utilities;
using System.Drawing;

namespace ExtractImagesFromSpecificPage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            // Load a PDF document
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

            // Get a specific page
            PdfPageBase page = doc.Pages[0];

            // Create a PdfImageHelper object
            PdfImageHelper imageHelper = new PdfImageHelper();

            // Get all image information from the page
            PdfImageInfo[] imageInfos = imageHelper.GetImagesInfo(page);

            // Iterate through the image information
            for (int i = 0; i < imageInfos.Length; i++)
            {
                // Get a specific image information
                PdfImageInfo imageInfo = imageInfos[i];

                // Get the image
                Image image = imageInfo.Image;

                // Save the image to a png file
                image.Save("C:\\Users\\Administrator\\Desktop\\Extracted\\Image-" + i + ".png");
            }

            // Dispose resources
            doc.Dispose();
        }
    }
}
Extract All Images from an Entire PDF Document
Now that you know how to extract images from a specific page, you can iterate through the pages in a PDF document and extract images from each page. This allows you to collect all the images contained in the document.
The steps to extract all images throughout an entire PDF document are as follows:
- Create a PdfDocument object.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Create a PdfImageHelper object.
- Iterate through the pages in the document.
- Get a specific page using PdfDocument.Pages[index] property.
- Get the image information collection from the page using PdfImageHelper.GetImagesInfo() method.
- Iterate through the image information collection and save each instance as a PNG file using PdfImageInfo.Image.Save() method.
using Spire.Pdf;
using Spire.Pdf.Utilities;
using System.Drawing;

namespace ExtractAllImages
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            // Load a PDF document
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

            // Create a PdfImageHelper object
            PdfImageHelper imageHelper = new PdfImageHelper();

            // Declare an int variable
            int m = 0;

            // Iterate through the pages
            for (int i = 0; i < doc.Pages.Count; i++)
            {
                // Get a specific page
                PdfPageBase page = doc.Pages[i];

                // Get all image information from the page
                PdfImageInfo[] imageInfos = imageHelper.GetImagesInfo(page);

                // Iterate through the image information
                for (int j = 0; j < imageInfos.Length; j++)
                {
                    // Get a specific image information
                    PdfImageInfo imageInfo = imageInfos[j];

                    // Get the image
                    Image image = imageInfo.Image;

                    // Save the image to a png file
                    image.Save("C:\\Users\\Administrator\\Desktop\\Extracted\\Image-" + m + ".png");
                    m++;
                }
            }

            // Dispose resources
            doc.Dispose();
        }
    }
}
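Image.Save does not create missing folders, so the Extracted folder used above has to exist before the loop runs. A minimal sketch of a helper that prepares the folder and builds the sequential file names follows; OutputPaths and its method names are hypothetical, and the code uses only System.IO.

```csharp
using System.IO;

public static class OutputPaths
{
    // Creates the folder if it does not exist yet and returns its full path.
    public static string EnsureFolder(string folder)
    {
        return Directory.CreateDirectory(folder).FullName;
    }

    // Builds a file path such as "<folder>/Image-3.png" for the running index.
    public static string ImagePath(string folder, int index)
    {
        return Path.Combine(folder, "Image-" + index + ".png");
    }
}
```

Calling OutputPaths.EnsureFolder once before the page loop and saving with image.Save(OutputPaths.ImagePath(folder, m)) would then replace the concatenated path in the inner loop.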
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
How to convert PDF pages to Tiff image in WPF?
Background
PDF is now widely used to represent documents in a manner independent of application software, hardware and operating systems. It encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics and so on. Because the format is so powerful, it is difficult for developers to parse. More specifically, extracting content from a PDF document and converting it to different image formats is a tough task for some developers. This article will help you solve this problem in 5 easy steps by using the PDF document viewer component Spire.PDFViewer for WPF. First, download Spire.PDFViewer for WPF.
Target
Convert a specified or arbitrary page of a PDF file, including its images, to TIFF programmatically.
Step 1: Create a WPF application in Visual Studio and reference the Spire.PdfViewer.WPF DLLs.
Set .NET 4 as the target framework.
Step 2: Create an instance of Spire.PdfViewer.Wpf.PdfDocumentViewer.
PdfDocumentViewer pdfViewer = new PdfDocumentViewer();
Step 3: Call the LoadFromFile method of the PdfDocumentViewer object to load a PDF file.
pdfViewer.LoadFromFile("sample.pdf");
Step 4: Create an array that holds all pages of this PDF file.
int[] pageNumbers=new int[pageCount]; for (int i=0;i
Step 5: Save the pages in TIFF image format.
pdfViewer.SaveAsImage("sample.tiff", pageNumbers);
The following code snippet shows the complete code for converting PDF pages to a TIFF image:
private void Button_Click(object sender, RoutedEventArgs e)
{
    // Create an instance of Spire.PdfViewer.Wpf.PdfDocumentViewer
    PdfDocumentViewer pdfViewer = new PdfDocumentViewer();
    // Load a PDF file
    pdfViewer.LoadFromFile("sample.pdf");
    int pageCount = pdfViewer.PageCount;
    // Create an array and save all pages of this PDF file
    int[] pageNumbers = new int[pageCount];
    for (int i=0;i
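A minimal sketch of the loop that fills the page array is shown below. It assumes that SaveAsImage expects zero-based page indexes, which should be checked against the Spire.PDFViewer documentation; PageIndexes is a hypothetical helper name and the code contains no Spire calls.

```csharp
using System;

public static class PageIndexes
{
    // Returns { 0, 1, ..., pageCount - 1 }.
    public static int[] All(int pageCount)
    {
        if (pageCount < 0)
            throw new ArgumentOutOfRangeException(nameof(pageCount));

        int[] pageNumbers = new int[pageCount];
        for (int i = 0; i < pageCount; i++)
        {
            pageNumbers[i] = i; // assumption: zero-based page indexes
        }
        return pageNumbers;
    }
}
```

With this helper, steps 4 and 5 could be written as pdfViewer.SaveAsImage("sample.tiff", PageIndexes.All(pdfViewer.PageCount));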
Screenshot
Spire.PDFViewer for WPF is a powerful WPF PDF viewer control that enables developers to display PDF documents in their WPF applications without Adobe Reader. It can load and view PDF documents such as PDF/A-1B and PDF/X1A, and even encrypted documents, from a stream, file or byte array, with support for printing, zooming, etc.
C#/VB.NET: Convert PDF to XPS or XPS to PDF
XPS is a format similar to PDF, but it uses XML to describe the layout, appearance and printing information of a file. The XPS format was developed by Microsoft and is natively supported by Windows operating systems. If you want to work with your PDF files on a Windows computer without installing other software, you can convert them to XPS format. Likewise, if you need to share an XPS file with a Mac user or use it on various devices, converting it to PDF is recommended. This article demonstrates how to programmatically convert PDF to XPS or XPS to PDF using Spire.PDF for .NET.
Install Spire.PDF for .NET
To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.PDF
Convert PDF to XPS in C# and VB.NET
Spire.PDF for .NET supports converting PDF to various file formats, and to achieve the PDF to XPS conversion, you just need three lines of core code. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a sample PDF document using PdfDocument.LoadFromFile() method.
- Convert the PDF document to an XPS file using PdfDocument.SaveToFile(string filename, FileFormat.XPS) method.
using Spire.Pdf;

namespace ConvertPdfToXps
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();

            //Load sample PDF document
            pdf.LoadFromFile("sample.pdf");

            //Save it to XPS format
            pdf.SaveToFile("ToXPS.xps", FileFormat.XPS);
            pdf.Close();
        }
    }
}
Convert XPS to PDF in C# and VB.NET
Conversion from XPS to PDF can also be achieved with Spire.PDF for .NET. During conversion, you can keep high-quality images in the generated PDF file by using the PdfDocument.ConvertOptions.SetXpsToPdfOptions() method. The following are the detailed steps.
- Create a PdfDocument instance.
- Load an XPS file using PdfDocument.LoadFromFile(string filename, FileFormat.XPS) method or PdfDocument.LoadFromXPS() method.
- Before saving, set the XPS to PDF convert options to keep high-quality images using PdfDocument.ConvertOptions.SetXpsToPdfOptions() method.
- Save the XPS file to a PDF file using PdfDocument.SaveToFile(string filename, FileFormat.PDF) method.
using Spire.Pdf;

namespace ConvertXPStoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();

            //Load a sample XPS file
            pdf.LoadFromFile("Sample.xps", FileFormat.XPS);
            //pdf.LoadFromXPS("Sample.xps");

            //Keep high quality images when converting XPS to PDF
            pdf.ConvertOptions.SetXpsToPdfOptions(true);

            //Save the XPS file to PDF
            pdf.SaveToFile("XPStoPDF.pdf", FileFormat.PDF);
        }
    }
}
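When both directions are needed in one tool, the direction can be chosen from the file extension. The sketch below covers only that decision in plain C#; ConversionPlan and its TargetPath method are hypothetical names, and the actual conversion would still use the PdfDocument calls shown above.

```csharp
using System;
using System.IO;

public static class ConversionPlan
{
    // Returns the output path for a PDF <-> XPS conversion, for example
    // "report.pdf" -> "report.xps" and "report.xps" -> "report.pdf".
    public static string TargetPath(string inputPath)
    {
        string ext = Path.GetExtension(inputPath);

        if (ext.Equals(".pdf", StringComparison.OrdinalIgnoreCase))
            return Path.ChangeExtension(inputPath, ".xps");
        if (ext.Equals(".xps", StringComparison.OrdinalIgnoreCase))
            return Path.ChangeExtension(inputPath, ".pdf");

        throw new NotSupportedException("Expected a .pdf or .xps file: " + inputPath);
    }
}
```

An input ending in .xps would then be loaded with FileFormat.XPS and saved with FileFormat.PDF, and an input ending in .pdf the other way round.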