C#/VB.NET: Extract Attachments from PDF

2023-01-19 08:14:00 Written by  support iceblue
Rate this item
(0 votes)

PDF attachments allow users to see more details on a particular point by visiting attachments inside the PDF. Basically, there are two types of attachments in PDF: document level attachment and annotation attachment. Below are the differences between them.

  • Document Level Attachment (represented by PdfAttachment class): A file attached to a PDF at the document level won't appear on a page, but only appear in the PDF reader's "Attachments" panel.
  • Annotation Attachment (represented by PdfAttachmentAnnotation class): A file that is attached to a specific position of a page. Annotation attachments are shown as a paper clip icon on the page; reviewers can double-click the icon to open the file.

In this article, you will learn how to extract these two kinds of attachments from a PDF document in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for.NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Extract Attachments from PDF in C# and VB.NET

The document level attachments of a PDF document can be obtained through PdfDocument.Attachments property. The following steps illustrate how to extract all document level attachments from a PDF document and save them to a local folder.

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get the attachment collection from the document through PdfDocument.Attachments property.
  • Get the data of a specific attachment through PdfAttachment.Data property.
  • Write the data to a file and save to a specified folder.
  • C#
  • VB.NET
using Spire.Pdf;
using Spire.Pdf.Attachments;
using System.Net.Mail;

namespace ExtractAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file that contains attachments
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Attachments.pdf");

            //Get the attachment collection of the PDF document
            PdfAttachmentCollection attachments = doc.Attachments;

            //Specific output folder path
            string outputFolder = "C:\\Users\\Administrator\\Desktop\\output\\";

            //Loop through the collection
            for (int i = 0; i < attachments.Count; i++)
            {   
	          //Write attachment to a file
                File.WriteAllBytes(outputFolder + attachments[i].FileName, attachments[i].Data);
            }
        }
    }
}

C#/VB.NET: Extract Attachments from PDF

Extract Annotation Attachments from PDF in C# and VB.NET

Annotation attachment is a page-based element. To get annotations from a specific page, use PdfPageBase.AnnotationsWidget property. After that, you’ll need to determine if a specific annotation is an annotation attachment. The follows are the steps to extract annotation attachments from a PDF document and save them to a local folder.

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specific page from the document through PdfDocument.Pages[] property.
  • Get the annotation collection from the page through PdfPageBase.AnnotationsWidget property.
  • Determine if a specific annotation is an instance of PdfAttachmentAnnotationWidget. If yes, write the annotation attachment to a file and save it to a specified folder.
  • C#
  • VB.NET
using Spire.Pdf;
using Spire.Pdf.Annotations;

namespace ExtractAnnotationAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file that contains attachments
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\AnnotationAttachments.pdf");

            //Specific output folder path
            string outputFolder = "C:\\Users\\Administrator\\Desktop\\Output\\";

            //Loop through the pages
            for (int i = 0; i < doc.Pages.Count; i++)
            {
                //Get the annotation collection
                PdfAnnotationCollection collection = doc.Pages[i].AnnotationsWidget;

                //Loop through the annotations
                for (int j = 0; j < collection.Count; j++)
                {
                    //Determine if an annotation is an instance of PdfAttachmentAnnotationWidget
                    if (collection[j] is PdfAttachmentAnnotationWidget)
                    {
                        //Write annotation attachment to a file
                        PdfAttachmentAnnotationWidget attachmentAnnotation = (PdfAttachmentAnnotationWidget)collection[j];
                        String fileName = Path.GetFileName(attachmentAnnotation.FileName);
                        File.WriteAllBytes(outputFolder + fileName, attachmentAnnotation.Data);
                    }
                }
            }
        }
    }
}

C#/VB.NET: Extract Attachments from PDF

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Additional Info

  • tutorial_title:
Last modified on Tuesday, 20 June 2023 02:06