Monday, 30 September 2024 01:13

C#: Convert PDF to Markdown

Converting PDF documents into more flexible and editable formats, such as Markdown, has become a common task for developers and content creators. Markdown files are easier to edit and to keep under version control, and they are portable across platforms and applications, which makes them well suited to modern web publishing workflows. With Spire.PDF for .NET, developers can automate the conversion and carry the formatting and structure of the original PDFs over into the resulting Markdown files.

This article will demonstrate how to use Spire.PDF for .NET to convert PDF documents to Markdown format with C# code.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Convert PDF Documents to Markdown Files

With the Spire.PDF for .NET library, developers can easily load any PDF file using the PdfDocument.LoadFromFile(string filename) method and then save the document in the desired format by calling the PdfDocument.SaveToFile(string filename, FileFormat fileFormat) method. To convert a PDF to Markdown format, simply specify the FileFormat.Markdown enumeration as a parameter when invoking the method.

The detailed steps for converting PDF documents to Markdown files are as follows:

  • Create an instance of PdfDocument class.
  • Load a PDF document using PdfDocument.LoadFromFile(string filename) method.
  • Convert the document to a Markdown file using PdfDocument.SaveToFile(string filename, FileFormat.Markdown) method.
[C#]
using Spire.Pdf;

namespace PDFToMarkdown
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of PdfDocument class
            PdfDocument pdf = new PdfDocument();

            // Load a PDF document
            pdf.LoadFromFile("Sample.pdf");

            // Convert the document to Markdown file
            pdf.SaveToFile("output/PDFToMarkdown.md", FileFormat.Markdown);

            // Release resources
            pdf.Close();
        }
    }
}
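
The same three calls extend to a whole folder of PDFs. The following is a minimal sketch rather than part of the original sample: the folder names "input" and "output" and the helper MarkdownPathFor are hypothetical, and the output folder is created up front on the assumption that SaveToFile may not create missing directories.

```csharp
using System.IO;
using Spire.Pdf;

namespace PDFToMarkdownBatch
{
    class Program
    {
        // Hypothetical helper: Report.pdf in any folder becomes Report.md in the output folder
        static string MarkdownPathFor(string pdfPath, string outputFolder)
        {
            string name = Path.GetFileNameWithoutExtension(pdfPath) + ".md";
            return Path.Combine(outputFolder, name);
        }

        static void Main(string[] args)
        {
            // Assumed folder names; adjust them to your project
            string inputFolder = "input";
            string outputFolder = "output";

            // Make sure the target folder exists before saving
            Directory.CreateDirectory(outputFolder);

            foreach (string pdfPath in Directory.GetFiles(inputFolder, "*.pdf"))
            {
                PdfDocument pdf = new PdfDocument();
                try
                {
                    // Load one PDF and save it as Markdown
                    pdf.LoadFromFile(pdfPath);
                    pdf.SaveToFile(MarkdownPathFor(pdfPath, outputFolder), FileFormat.Markdown);
                }
                finally
                {
                    // Release resources even if the conversion throws
                    pdf.Close();
                }
            }
        }
    }
}
```

Creating a new PdfDocument per file keeps each conversion independent, so one failing file does not leave state behind for the next.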

Convert PDF to Markdown by Streams

In addition to working with files directly, Spire.PDF for .NET also supports loading a PDF document from a stream using PdfDocument.LoadFromStream() method and converting it to a Markdown file stream using PdfDocument.SaveToStream() method. Working with streams avoids intermediate files on disk, which suits real-time data transfer and simplifies data exchange with other systems.

The detailed steps for converting PDF documents to Markdown files by streams are as follows:

  • Create a Stream object for the PDF document by downloading it from the web or reading it from a file.
  • Load the PDF document from the stream using PdfDocument.LoadFromStream(Stream stream) method.
  • Create another Stream object to store the converted Markdown file.
  • Convert the PDF document to a Markdown file stream using PdfDocument.SaveToStream(Stream stream, FileFormat.Markdown) method.
[C#]
using Spire.Pdf;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

namespace PDFToMarkdownByStream
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Create an instance of PdfDocument class
            PdfDocument pdf = new PdfDocument();

            // Download a PDF document from a url as bytes
            using (HttpClient client = new HttpClient())
            {
                byte[] pdfBytes = await client.GetByteArrayAsync("http://example.com/Sample.pdf");

                // Create a MemoryStream using the bytes
                using (MemoryStream inputStream = new MemoryStream(pdfBytes))
                {
                    // Load the PDF document from the stream
                    pdf.LoadFromStream(inputStream);

                    // Create another MemoryStream object to store the Markdown file
                    using (MemoryStream outputStream = new MemoryStream())
                    {
                        // Convert the PDF document to a Markdown file stream
                        pdf.SaveToStream(outputStream, FileFormat.Markdown);
                        outputStream.Position = 0; // Reset the position of the stream for subsequent reads

                        // Upload the result stream or write it to a file
                        await client.PostAsync("http://example.com/upload", new StreamContent(outputStream));
                        File.WriteAllBytes("output.md", outputStream.ToArray());
                    }
                }
            }

            // Release resources
            pdf.Close();
        }
    }
}
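
When the PDF is read from a file rather than downloaded, file streams can take the place of the memory streams. This is a minimal sketch under that assumption; the paths "Sample.pdf" and "Sample.md" are placeholders, and only the LoadFromStream and SaveToStream calls shown above are used.

```csharp
using System.IO;
using Spire.Pdf;

namespace PDFToMarkdownByFileStream
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of PdfDocument class
            PdfDocument pdf = new PdfDocument();

            // Open the source PDF as a read-only stream (placeholder path)
            using (FileStream inputStream = File.OpenRead("Sample.pdf"))
            // Create (or overwrite) the Markdown file as a writable stream
            using (FileStream outputStream = File.Create("Sample.md"))
            {
                // Load from the input stream and write Markdown to the output stream
                pdf.LoadFromStream(inputStream);
                pdf.SaveToStream(outputStream, FileFormat.Markdown);
            }

            // Release resources
            pdf.Close();
        }
    }
}
```

The conversion runs while both streams are still open, and the using statements close them afterwards even if an exception is thrown.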

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Conversion

If you have multiple images that you want to combine into one file for easier distribution or storage, converting them into a single PDF document is a great solution. This process not only saves space but also ensures that all your images are kept together in one file, making it convenient to share or transfer. In this article, you will learn how to combine several images into a single PDF document in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Combine Multiple Images into a Single PDF in C# and VB.NET

In order to convert all the images in a folder to a PDF, we iterate through each image, add a new page to the PDF with the same size as the image, and then draw the image onto the new page. The following are the detailed steps.

  • Create a PdfDocument object.
  • Set the page margins to zero using PdfDocument.PageSettings.SetMargins() method.
  • Get the folder where the images are stored.
  • Iterate through each image file in the folder, and get the width and height of each image.
  • Add a new page that has the same width and height as the image to the PDF document using PdfDocument.Pages.Add() method.
  • Draw the image on the page using PdfPageBase.Canvas.DrawImage() method.
  • Save the document using PdfDocument.SaveToFile() method.
[C#]
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;
using System.IO;

namespace ConvertMultipleImagesIntoPdf
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Set the page margins to 0
            doc.PageSettings.SetMargins(0);

            //Get the folder where the images are stored
            DirectoryInfo folder = new DirectoryInfo(@"C:\Users\Administrator\Desktop\Images");

            //Iterate through the files in the folder
            foreach (FileInfo file in folder.GetFiles())
            {
                //Load a particular image 
                Image image = Image.FromFile(file.FullName);

                //Get the image width and height
                float width = image.PhysicalDimension.Width;
                float height = image.PhysicalDimension.Height;

                //Add a page that has the same size as the image
                PdfPageBase page = doc.Pages.Add(new SizeF(width, height));

                //Create a PdfImage object based on the image
                PdfImage pdfImage = PdfImage.FromImage(image);

                //Draw image at (0, 0) of the page
                page.Canvas.DrawImage(pdfImage, 0, 0, pdfImage.Width, pdfImage.Height);
            }
      
            //Save to file
            doc.SaveToFile("CombineImagesToPdf.pdf");
            doc.Dispose();
        }
    }
}
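
The sample above uses the pixel dimensions of each image directly as the page size. If the page should instead match the physical size of the image, pixel lengths can be converted to points first. The following is a minimal sketch, assuming page sizes are measured in points (1/72 inch), the usual PDF unit; the PageSize class and PixelsToPoints helper are hypothetical names, not part of Spire.PDF.

```csharp
using System;

namespace ImageSizeToPoints
{
    static class PageSize
    {
        // Convert a pixel length to PDF points (1 point = 1/72 inch)
        public static float PixelsToPoints(float pixels, float dpi)
        {
            if (dpi <= 0) throw new ArgumentOutOfRangeException(nameof(dpi));
            return pixels * 72f / dpi;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // A 1920 x 1080 pixel image with a resolution of 96 DPI
            float width = PageSize.PixelsToPoints(1920f, 96f);
            float height = PageSize.PixelsToPoints(1080f, 96f);
            Console.WriteLine(width + " x " + height);  // 1440 x 810
        }
    }
}
```

In the image loop, the Image.HorizontalResolution and Image.VerticalResolution properties of System.Drawing could supply the DPI values; the converted width and height would then be used both for the new page and for the DrawImage call.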

C#/VB.NET: Convert Multiple Images into a Single PDF

Published in Conversion
Wednesday, 11 January 2023 02:09

C#/VB.NET: Convert PDF to PowerPoint

PDF files are great for presenting on different types of devices and sharing across platforms, but editing them is admittedly a bit challenging. When you receive a PDF file and need to prepare a presentation based on its content, converting the PDF file to a PowerPoint document gives a better presentation effect and also ensures the content can be further edited. This article will demonstrate how to programmatically convert PDF to PowerPoint presentation using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Convert PDF to PowerPoint Presentation in C# and VB.NET

From Version 8.11.10, Spire.PDF for .NET supports converting PDF to PPTX using PdfDocument.SaveToFile() method. With this method, each page of your PDF file will be converted to a single slide in PowerPoint. Below are the steps to convert a PDF file to an editable PowerPoint document.

  • Create a PdfDocument instance.
  • Load a sample PDF document using PdfDocument.LoadFromFile() method.
  • Save the document as a PowerPoint document using PdfDocument.SaveToFile(string filename, FileFormat.PPTX) method.
[C#]
using Spire.Pdf;

namespace PDFtoPowerPoint
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();

            //Load a sample PDF document
            pdf.LoadFromFile(@"C:\Users\Administrator\Desktop\Sample.pdf");

            //Convert the PDF to PPTX document
            pdf.SaveToFile("ConvertPDFtoPowerPoint.pptx", FileFormat.PPTX);
        }
    }
}

Published in Conversion
Friday, 10 December 2021 03:38

C#/VB.NET: Convert PDF to Linearized

PDF linearization, also known as "Fast Web View", is a way of optimizing PDF files. Ordinarily, users can view a multipage PDF file online only after their web browsers have downloaded all pages from the server. However, if the PDF file is linearized, the browsers can display the first page very quickly even if the full download has not been completed. This article will demonstrate how to convert a PDF to a linearized PDF in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Convert PDF to Linearized

The following are the steps to convert a PDF file to linearized:

  • Load a PDF file using PdfToLinearizedPdfConverter class.
  • Convert the file to linearized using PdfToLinearizedPdfConverter.ToLinearizedPdf() method.
[C#]
using Spire.Pdf.Conversion;

namespace ConvertPdfToLinearized
{
    class Program
    {
        static void Main(string[] args)
        {
            //Load a PDF file
            PdfToLinearizedPdfConverter converter = new PdfToLinearizedPdfConverter("Sample.pdf");
            //Convert the file to a linearized PDF
            converter.ToLinearizedPdf("Linearized.pdf");
        }
    }
}
[VB.NET]
Imports Spire.Pdf.Conversion

Namespace ConvertPdfToLinearized
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Load a PDF file
            Dim converter As PdfToLinearizedPdfConverter = New PdfToLinearizedPdfConverter("Sample.pdf")
            'Convert the file to a linearized PDF
            converter.ToLinearizedPdf("Linearized.pdf")
        End Sub
    End Class
End Namespace

Open the result file in Adobe Acrobat and take a look at the document properties. The value of “Fast Web View” is Yes, which means the file is linearized.
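
The result can also be checked roughly in code. The PDF specification requires the linearization parameter dictionary, which carries the /Linearized key, to sit within the first 1024 bytes of a linearized file. The following is a minimal heuristic sketch built on that rule, not a full validation and not a Spire.PDF feature; the PdfInspector class and LooksLinearized method are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

namespace CheckLinearized
{
    static class PdfInspector
    {
        // Return true if the /Linearized key appears in the first 1024 bytes of the stream
        public static bool LooksLinearized(Stream stream)
        {
            byte[] buffer = new byte[1024];
            int total = 0;
            int read;
            // Stream.Read may return fewer bytes than requested, so loop until full or end of stream
            while (total < buffer.Length &&
                   (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
            {
                total += read;
            }
            // ASCII decoding is enough to find the ASCII key name
            string head = Encoding.ASCII.GetString(buffer, 0, total);
            return head.Contains("/Linearized");
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // "Linearized.pdf" is the output file of the conversion sample
            using (FileStream fs = File.OpenRead("Linearized.pdf"))
            {
                Console.WriteLine(PdfInspector.LooksLinearized(fs) ? "Linearized" : "Not linearized");
            }
        }
    }
}
```

A positive result only shows that the key is present where the specification expects it; Adobe Acrobat's document properties remain the more reliable check.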

Published in Conversion

Converting a PDF with color images to grayscale can help you reduce the file size and print the PDF more cheaply, without consuming color ink. In this article, you will learn how to achieve the conversion programmatically in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF 

Convert PDF to Grayscale

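The following are the steps to convert a color PDF to grayscale:

  • Create a PdfGrayConverter instance and load a PDF file.
  • Convert the PDF to grayscale using PdfGrayConverter.ToGrayPdf() method.
  • Release resources using PdfGrayConverter.Dispose() method.
[C#]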
using Spire.Pdf.Conversion;
 
namespace ConvertPdfToGrayscale
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfGrayConverter instance and load a PDF file
            PdfGrayConverter converter = new PdfGrayConverter(@"Sample.pdf");
            //Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf");
            converter.Dispose();
        }
    }
}
[VB.NET]
Imports Spire.Pdf.Conversion

Namespace ConvertPdfToGrayscale
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a PdfGrayConverter instance and load a PDF file
            Dim converter As PdfGrayConverter = New PdfGrayConverter("Sample.pdf")
            'Convert the PDF to grayscale
            converter.ToGrayPdf("Grayscale.pdf")
            converter.Dispose()
        End Sub
    End Class
End Namespace

The input PDF:

C#/VB.NET: Convert PDF to Grayscale (Black and White)

The output PDF:

C#/VB.NET: Convert PDF to Grayscale (Black and White)

Published in Conversion
Thursday, 09 June 2022 03:58

C#/VB.NET: Convert PDF to Excel

PDF is a versatile file format, but it is difficult to edit. If you want to modify and calculate PDF data, converting PDF to Excel would be an ideal solution. In this article, you will learn how to convert PDF to Excel in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Convert PDF to Excel in C# and VB.NET

The following are the steps to convert a PDF document to Excel:

  • Initialize an instance of PdfDocument class.
  • Load the PDF document using PdfDocument.LoadFromFile(filePath) method.
  • Save the document to Excel using PdfDocument.SaveToFile(filePath, FileFormat.XLSX) method.
[C#]
using Spire.Pdf;
using Spire.Pdf.Conversion;

namespace ConvertPdfToExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Initialize an instance of PdfDocument class
            PdfDocument pdf = new PdfDocument();
            //Load the PDF document
            pdf.LoadFromFile("Sample.pdf");

            //Save the PDF document to XLSX
            pdf.SaveToFile("PdfToExcel.xlsx", FileFormat.XLSX);
        }
    }
}

Convert a Multi-Page PDF to One Excel Worksheet in C# and VB.NET

The following are the steps to convert a multi-page PDF to one Excel worksheet:

  • Initialize an instance of PdfDocument class.
  • Load the PDF document using PdfDocument.LoadFromFile(filePath) method.
  • Initialize an instance of XlsxLineLayoutOptions class, passing false as the first constructor parameter (convertToMultipleSheet).
  • Set PDF to XLSX convert options using PdfDocument.ConvertOptions.SetPdfToXlsxOptions(XlsxLineLayoutOptions) method.
  • Save the document to Excel using PdfDocument.SaveToFile(filePath, FileFormat.XLSX) method.
[C#]
using Spire.Pdf;
using Spire.Pdf.Conversion;

namespace ConvertPdfToExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Initialize an instance of PdfDocument class
            PdfDocument pdf = new PdfDocument();
            //Load the PDF document
            pdf.LoadFromFile("Sample1.pdf");

            //Initialize an instance of XlsxLineLayoutOptions class, in the class constructor, setting the first parameter - convertToMultipleSheet as false.
            //The four parameters represent: convertToMultipleSheet, showRotatedText, splitCell, wrapText
            XlsxLineLayoutOptions options = new XlsxLineLayoutOptions(false, true, true, true);
            //Set PDF to XLSX convert options
            pdf.ConvertOptions.SetPdfToXlsxOptions(options);

            //Save the PDF document to XLSX
            pdf.SaveToFile("PdfToOneExcelSheet.xlsx", FileFormat.XLSX);
        }
    }
}

Published in Conversion
Friday, 16 June 2023 08:10

C#/VB.NET: Convert SVG to PDF

SVG is a file format for vector graphics, used to create images that can be scaled without loss of quality. However, PDF is more suitable for sharing and printing due to its support for high-quality printing, encryption, digital signatures, and other features. Converting SVG to PDF ensures good image display on different devices and environments, and better protects intellectual property. In this tutorial, we will show you how to convert SVG to PDF and how to add an SVG image to PDF in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Convert SVG to PDF in C# and VB.NET

Spire.PDF for .NET provides the PdfDocument.SaveToFile(String, FileFormat) method, which allows users to save an SVG file as a PDF. The detailed steps are as follows.

  • Create a PdfDocument object.
  • Load a sample SVG file using PdfDocument.LoadFromSvg() method.
  • Convert the SVG file to PDF using PdfDocument.SaveToFile(String, FileFormat) method.
[C#]
using Spire.Pdf;

namespace SVGtoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a sample SVG file
            doc.LoadFromSvg("Sample.svg");

            //Save result document
            doc.SaveToFile("Result.pdf", FileFormat.PDF);
            doc.Dispose();
        }
    }
}

Add SVG image to PDF in C# and VB.NET

In addition to converting SVG to PDF directly, Spire.PDF for .NET also supports adding SVG image files to specified locations in a PDF. The steps are as follows:

  • Create a PdfDocument object and load an SVG file using PdfDocument.LoadFromSvg() method.
  • Create a template based on the content of the SVG file using PdfDocument.Pages[].CreateTemplate() method.
  • Get the width and height of the template on the page.
  • Create another PdfDocument object and load a PDF file using PdfDocument.LoadFromFile() method.
  • Draw the template with a custom size at a specified location using PdfDocument.Pages[].Canvas.DrawTemplate() method.
  • Save to PDF file using PdfDocument.SaveToFile(String, FileFormat) method.
[C#]
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;

namespace AddSVGImagetoPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc1 = new PdfDocument();

            //Load an SVG file
            doc1.LoadFromSvg("C:\\Users\\Administrator\\Desktop\\sample.svg");

            //Create a template based on the content of the SVG file
            PdfTemplate template = doc1.Pages[0].CreateTemplate();

            //Get the width and height of the template 
            float width = template.Width;
            float height = template.Height;

            //Create another PdfDocument object
            PdfDocument doc2 = new PdfDocument();

            //Load a PDF file
            doc2.LoadFromFile("C:\\Users\\Administrator\\Desktop\\sample.pdf");

            //Draw the template with a custom size at a specified location 
            doc2.Pages[0].Canvas.DrawTemplate(template, new PointF(0, 0), new SizeF(width * 0.8f, height * 0.8f));

            //Save to PDF file
            doc2.SaveToFile("AddSvgToPdf.pdf", FileFormat.PDF);
            doc2.Dispose();
        }
    }
}

Published in Conversion

PostScript was developed by Adobe Systems in the 1980s as a way of turning digital graphics or text files into a fixed format ready for printing. Although the PostScript (PS) file format is no longer as popular as it once was, it is still supported by most printers. In this article, you will learn how to programmatically convert a PDF file to a PostScript (PS) file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF 

Convert PDF to PostScript in C# and VB.NET

Converting PDF to PS can improve the quality of the printed output. With Spire.PDF for .NET, you can complete the conversion with only three lines of code. The following are the detailed steps.

  • Create a PdfDocument instance.
  • Load a sample PDF file using PdfDocument.LoadFromFile() method.
  • Save the PDF file as a PS file using PdfDocument.SaveToFile(string filename, FileFormat.POSTSCRIPT) method.
[C#]
using Spire.Pdf;

namespace PDFtoPS
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument instance
            PdfDocument document = new PdfDocument();

            //Load a sample PDF file
            document.LoadFromFile("Test.pdf");

            //Save the PDF file as a PS file
            document.SaveToFile("toPostScript.ps", FileFormat.POSTSCRIPT);
        }
    }
}
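
Because the load and save pattern is the same for every target format, the format can be chosen from the output file name. The following is a minimal sketch under that assumption: the FormatFor helper is hypothetical, it covers only FileFormat members that appear in the samples on this page, and it adds a try/finally so that Close() runs even if the conversion fails.

```csharp
using System;
using System.IO;
using Spire.Pdf;

namespace PdfConvertByExtension
{
    class Program
    {
        // Hypothetical helper: choose a FileFormat from the output file extension
        static FileFormat FormatFor(string outputPath)
        {
            switch (Path.GetExtension(outputPath).ToLowerInvariant())
            {
                case ".ps": return FileFormat.POSTSCRIPT;
                case ".pcl": return FileFormat.PCL;
                case ".pptx": return FileFormat.PPTX;
                case ".xlsx": return FileFormat.XLSX;
                case ".md": return FileFormat.Markdown;
                default:
                    throw new NotSupportedException("Unsupported output type: " + outputPath);
            }
        }

        static void Main(string[] args)
        {
            // Same file names as the sample above
            string input = "Test.pdf";
            string output = "toPostScript.ps";

            PdfDocument document = new PdfDocument();
            try
            {
                // Load the PDF and save it in the format implied by the extension
                document.LoadFromFile(input);
                document.SaveToFile(output, FormatFor(output));
            }
            finally
            {
                // Release resources
                document.Close();
            }
        }
    }
}
```

Throwing for unknown extensions, rather than falling back to a default format, keeps a mistyped file name from silently producing the wrong kind of file.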

C#/VB.NET: Convert PDF to PostScript (PS)

Published in Conversion
Wednesday, 06 March 2019 07:45

Convert PDF to PCL in C#/VB.NET

A PCL file is a Printer Command Language document. Printer Command Language is a page description language developed by HP as a printer protocol, and it is now widely supported by printers. Starting from version 5.2.3, Spire.PDF supports converting PDF files to PCL format. There are six major levels of PCL; the PCL here refers to PCL 6 (PCL 6 Enhanced or PCL XL).

Example code

[C#]
using Spire.Pdf;

namespace PDFtoPCL
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();
            //Load the PDF file
            pdf.LoadFromFile("Input.pdf");
            //Save to PCL format
            pdf.SaveToFile("ToPCL.pcl", FileFormat.PCL);
        }
    }
}
[VB.NET]
Imports Spire.Pdf

Namespace PDFtoPCL
    Class Program
        Private Shared Sub Main(args As String())
            'Create a PdfDocument instance
            Dim pdf As PdfDocument = New PdfDocument()
            'Load the PDF file
            pdf.LoadFromFile("Input.pdf")
            'Save to PCL format
            pdf.SaveToFile("ToPCL.pcl", FileFormat.PCL)
        End Sub
    End Class
End Namespace
Published in Conversion

This article demonstrates how to convert PDF pages to HTML, Word, SVG, XPS and PDF and save them to stream by calling the PdfDocument.SaveToStream() method offered by Spire.PDF. Starting from version 4.3, Spire.PDF also supports converting a defined range of PDF pages and saving them to stream.

Save the PDF to stream

Step 1: Create a new PdfDocument instance and load the sample document from file.

PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Sample.pdf");

Step 2: Save the document to stream.

MemoryStream ms = new MemoryStream();
pdf.SaveToStream(ms);

Save the PDF to stream and define the file format as HTML, Word, SVG, XPS or PDF

Step 1: Create a new PdfDocument instance and load the sample document from file.

PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Sample.pdf");

Step 2: Save the document to stream and use FileFormat format to define the format.

MemoryStream ms = new MemoryStream();
pdf.SaveToStream(ms, FileFormat.HTML);

Convert a defined range of PDF pages to HTML, Word, SVG or XPS and save them to stream

Step 1: Create a new PdfDocument instance and load the sample document from file.

PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Sample.pdf"); 

Step 2: Save only a range of PDF pages to stream using pdf.SaveToStream(int startIndex, int endIndex, FileFormat format) method. FileFormat.PDF is not supported by this overload.

pdf.SaveToStream(1, 2, FileFormat.SVG);

Full code for saving PDF to stream:

using Spire.Pdf;
using System.IO;


namespace SavePDFToStream
{
    class Program
    {
        static void Main(string[] args)
        {
            PdfDocument pdf = new PdfDocument();

            pdf.LoadFromFile("Sample.pdf");

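            // Save the document to a stream as PDF
            MemoryStream pdfStream = new MemoryStream();
            pdf.SaveToStream(pdfStream);

            // Save the document to a separate stream as HTML,
            // so the two outputs do not end up appended to each other
            MemoryStream htmlStream = new MemoryStream();
            pdf.SaveToStream(htmlStream, FileFormat.HTML);

            // Convert a range of pages to SVG
            pdf.SaveToStream(1, 2, FileFormat.SVG);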
        }
    }
}
Published in Conversion