Thursday, 19 January 2023 08:14

C#/VB.NET: Extract Attachments from PDF

PDF attachments let readers consult supporting files without leaving the document. A PDF can hold two types of attachments: document level attachments and annotation attachments. They differ as follows.

  • Document Level Attachment (represented by PdfAttachment class): A file attached to a PDF at the document level does not appear on a page; it appears only in the PDF reader's "Attachments" panel.
  • Annotation Attachment (represented by PdfAttachmentAnnotation class): A file attached to a specific position on a page. Annotation attachments are shown as a paper clip icon on the page; reviewers can double-click the icon to open the file.

In this article, you will learn how to extract these two kinds of attachments from a PDF document in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Extract Attachments from PDF in C# and VB.NET

The document level attachments of a PDF document can be obtained through the PdfDocument.Attachments property. The following steps illustrate how to extract all document level attachments from a PDF document and save them to a local folder.

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get the attachment collection from the document through PdfDocument.Attachments property.
  • Get the data of a specific attachment through PdfAttachment.Data property.
  • Write the data to a file and save to a specified folder.
using Spire.Pdf;
using Spire.Pdf.Attachments;
using System.IO;

namespace ExtractAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file that contains attachments
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Attachments.pdf");

            //Get the attachment collection of the PDF document
            PdfAttachmentCollection attachments = doc.Attachments;

            //Specify the output folder path
            string outputFolder = "C:\\Users\\Administrator\\Desktop\\output\\";

            //Loop through the collection
            for (int i = 0; i < attachments.Count; i++)
            {
                //Write attachment to a file
                File.WriteAllBytes(outputFolder + attachments[i].FileName, attachments[i].Data);
            }
        }
    }
}
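
A name stored inside a PDF is not guaranteed to be a plain file name, and the output folder may not exist when the program runs. The following is a minimal sketch, using only System.IO, of how the output path could be built more defensively before calling File.WriteAllBytes(). The helper BuildOutputPath, the class OutputPaths, and the demo folder are hypothetical names introduced for illustration; they are not part of Spire.PDF.

```csharp
using System;
using System.IO;

namespace AttachmentPathSketch
{
    static class OutputPaths
    {
        //Hypothetical helper (not part of Spire.PDF): maps a name read from a PDF
        //to a path directly inside outputFolder
        public static string BuildOutputPath(string outputFolder, string attachmentName)
        {
            //Treat both separator styles alike and keep only the last segment
            string normalized = (attachmentName ?? string.Empty).Replace('\\', '/');
            string fileName = normalized.Substring(normalized.LastIndexOf('/') + 1);

            //Replace characters the current platform does not allow in file names
            foreach (char c in Path.GetInvalidFileNameChars())
            {
                fileName = fileName.Replace(c, '_');
            }

            //Fall back to a placeholder name when nothing usable is left
            if (string.IsNullOrWhiteSpace(fileName) || fileName == "." || fileName == "..")
            {
                fileName = "attachment.bin";
            }

            //Create the output folder if it does not exist yet
            Directory.CreateDirectory(outputFolder);
            return Path.Combine(outputFolder, fileName);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            //Assumed demo folder under the system temp directory
            string outputFolder = Path.Combine(Path.GetTempPath(), "pdf-attachments");

            //A name with directory parts is reduced to its last segment
            Console.WriteLine(OutputPaths.BuildOutputPath(outputFolder, "..\\reports/Data.xlsx"));
        }
    }
}
```

In the extraction loop, the result of such a helper could be passed to File.WriteAllBytes() in place of the concatenated outputFolder + FileName string.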

Extract Annotation Attachments from PDF in C# and VB.NET

An annotation attachment is a page-based element. To get the annotations on a specific page, use the PdfPageBase.AnnotationsWidget property, then determine whether each annotation is an annotation attachment. The following steps show how to extract annotation attachments from a PDF document and save them to a local folder.

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.LoadFromFile() method.
  • Get a specific page from the document through PdfDocument.Pages[] property.
  • Get the annotation collection from the page through PdfPageBase.AnnotationsWidget property.
  • Determine if a specific annotation is an instance of PdfAttachmentAnnotationWidget. If yes, write the annotation attachment to a file and save it to a specified folder.
using Spire.Pdf;
using Spire.Pdf.Annotations;
using System;
using System.IO;

namespace ExtractAnnotationAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file that contains attachments
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\AnnotationAttachments.pdf");

            //Specify the output folder path
            string outputFolder = "C:\\Users\\Administrator\\Desktop\\Output\\";

            //Loop through the pages
            for (int i = 0; i < doc.Pages.Count; i++)
            {
                //Get the annotation collection
                PdfAnnotationCollection collection = doc.Pages[i].AnnotationsWidget;

                //Loop through the annotations
                for (int j = 0; j < collection.Count; j++)
                {
                    //Determine if an annotation is an instance of PdfAttachmentAnnotationWidget
                    if (collection[j] is PdfAttachmentAnnotationWidget)
                    {
                        //Write annotation attachment to a file
                        PdfAttachmentAnnotationWidget attachmentAnnotation = (PdfAttachmentAnnotationWidget)collection[j];
                        String fileName = Path.GetFileName(attachmentAnnotation.FileName);
                        File.WriteAllBytes(outputFolder + fileName, attachmentAnnotation.Data);
                    }
                }
            }
        }
    }
}
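
File.WriteAllBytes() overwrites an existing file, so two annotations that carry files with the same name would leave only one file in the output folder. The following is a minimal sketch, using only System.IO, of one way to pick a name that is still free. The helper GetUniquePath, the class UniqueNames, the " (n)" suffix scheme, and the demo folder are assumptions made for illustration; they are not part of Spire.PDF.

```csharp
using System;
using System.IO;

namespace UniqueNameSketch
{
    static class UniqueNames
    {
        //Hypothetical helper: returns a path in folder that no existing file uses
        public static string GetUniquePath(string folder, string fileName)
        {
            string stem = Path.GetFileNameWithoutExtension(fileName);
            string extension = Path.GetExtension(fileName);
            string candidate = Path.Combine(folder, fileName);

            //Append " (1)", " (2)", ... until the name is free
            for (int counter = 1; File.Exists(candidate); counter++)
            {
                candidate = Path.Combine(folder, stem + " (" + counter + ")" + extension);
            }
            return candidate;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            //Assumed demo folder under the system temp directory
            string folder = Path.Combine(Path.GetTempPath(), "pdf-annotation-attachments");
            Directory.CreateDirectory(folder);

            //Simulate two annotations that carry files with the same name
            for (int i = 0; i < 2; i++)
            {
                string path = UniqueNames.GetUniquePath(folder, "Report.pptx");
                File.WriteAllBytes(path, new byte[] { 1, 2, 3 });
                Console.WriteLine(path);
            }
        }
    }
}
```

The check and the write are separate steps, so this sketch suits a single-process tool rather than several programs writing to the same folder at once.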

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Monday, 26 December 2022 05:58

C#/VB.NET: Add or Remove Attachments in PDF

Besides text and graphics, PDF files can contain entire files inside them as attachments. This makes the exchange of sets of documents much easier and more reliable. Spire.PDF allows you to attach files in two ways:

  • Document Level Attachment: A file attached to a PDF at the document level does not appear on a page; it can be viewed only in the "Attachments" panel of a PDF reader.
  • Annotation Attachment: A file will be added to a specific position of a page. Annotation attachments are shown as a paper clip icon on the page; reviewers can double-click the icon to open the file.

This article demonstrates how to add or remove these two types of attachments in a PDF document in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Add an Attachment to PDF in C# and VB.NET

An attachment can be added to the "Attachments" panel with the PdfDocument.Attachments.Add() method. The following are the detailed steps.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Create a PdfAttachment object based on an external file.
  • Add the attachment to PDF using PdfDocument.Attachments.Add() method.
  • Save the document to another PDF file using PdfDocument.SaveToFile() method.
using Spire.Pdf;
using Spire.Pdf.Attachments;

namespace AttachFilesToPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a sample PDF file
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.pdf");

            //Create a PdfAttachment object based on an external file
            PdfAttachment attachment = new PdfAttachment("C:\\Users\\Administrator\\Desktop\\Data.xlsx");

            //Add the attachment to PDF
            doc.Attachments.Add(attachment);

            //Save to file
            doc.SaveToFile("Attachment.pdf");
        }
    }
}

Add an Annotation Attachment to PDF in C# and VB.NET

An annotation attachment can be found in the "Attachments" panel as well as on a specific page. Below are the steps to add an annotation attachment to PDF using Spire.PDF for .NET.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Get a specific page to add annotation through PdfDocument.Pages[] property.
  • Create a PdfAttachmentAnnotation object based on an external file.
  • Add the annotation attachment to the page using PdfPageBase.AnnotationsWidget.Add() method.
  • Save the document to another PDF file using PdfDocument.SaveToFile() method.
using Spire.Pdf;
using Spire.Pdf.Annotations;
using Spire.Pdf.Graphics;
using System;
using System.Drawing;
using System.IO;

namespace AnnotationAttachment
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a sample PDF file
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.pdf");

            //Get a specific page
            PdfPageBase page = doc.Pages[0];

            //Draw a label on PDF
            String label = "Here is the report:";
            PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Arial", 13f, FontStyle.Bold), true); 
            float x = 35;
            float y = doc.Pages[0].ActualSize.Height - 220;
            page.Canvas.DrawString(label, font, PdfBrushes.Red, x, y);

            //Create a PdfAttachmentAnnotation object based on an external file
            String filePath = "C:\\Users\\Administrator\\Desktop\\Report.pptx";
            byte[] data = File.ReadAllBytes(filePath);
            SizeF size = font.MeasureString(label);
            RectangleF bounds = new RectangleF((float)(x + size.Width + 5), (float)y, 10, 15);
            PdfAttachmentAnnotation annotation = new PdfAttachmentAnnotation(bounds, "Report.pptx", data);
            annotation.Color = Color.Purple;
            annotation.Flags = PdfAnnotationFlags.Default;
            annotation.Icon = PdfAttachmentIcon.Graph;
            annotation.Text = "Click here to open the file";

            //Add the attachment annotation to PDF
            page.AnnotationsWidget.Add(annotation);

            //Save to file
            doc.SaveToFile("Annotation.pdf");
        }
    }
}

Remove Attachments from PDF in C# and VB.NET

The attachments of a PDF document can be accessed through the PdfDocument.Attachments property, and can be removed with the RemoveAt() method or the Clear() method of the PdfAttachmentCollection object. The detailed steps are as follows.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Get the attachment collection from the document through PdfDocument.Attachments property.
  • Remove a specific attachment using PdfAttachmentCollection.RemoveAt() method. To remove all attachments at once, use PdfAttachmentCollection.Clear() method.
  • Save the document to another PDF file using PdfDocument.SaveToFile() method.
using Spire.Pdf;
using Spire.Pdf.Attachments;

namespace RemoveAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Attachment.pdf");

            //Get attachment collection
            PdfAttachmentCollection attachments = doc.Attachments;

            //Remove a specific attachment
            attachments.RemoveAt(0);

            //Remove all attachments
            //attachments.Clear();

            //Save to file
            doc.SaveToFile("DeleteAttachments.pdf");
        }
    }
}

Remove Annotation Attachments from PDF in C# and VB.NET

An annotation is a page-based element. To get all annotations in a document, loop through the pages and get the annotations from each page, then determine whether each annotation is an annotation attachment. Finally, remove the annotation attachment from the annotation collection using the Remove() method. The following are the detailed steps.

  • Create a PdfDocument object.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Loop through the pages in the document, and get the annotation collection from a specific page through PdfDocument.Pages[].AnnotationsWidget property.
  • Determine if an annotation is an instance of PdfAttachmentAnnotationWidget. If yes, remove the annotation attachment using PdfAnnotationCollection.Remove() method.
  • Save the document to another PDF file using PdfDocument.SaveToFile() method.
using Spire.Pdf;
using Spire.Pdf.Annotations;

namespace RemoveAnnotationAttachments
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load a PDF file
            doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Annotation.pdf");

            //Loop through the pages
            for (int i = 0; i < doc.Pages.Count; i++)
            {
                //Get the annotation collection
                PdfAnnotationCollection annotationCollection = doc.Pages[i].AnnotationsWidget;

                //Loop through the annotations from last to first, so that removing
                //an item does not shift the items that are still to be checked
                for (int j = annotationCollection.Count - 1; j >= 0; j--)
                {
                    //Determine if an annotation is an instance of PdfAttachmentAnnotationWidget
                    if (annotationCollection[j] is PdfAttachmentAnnotationWidget)
                    {
                        //Remove the annotation attachment
                        annotationCollection.Remove((PdfAnnotation)annotationCollection[j]);
                    }
                }
            }

            //Save to file
            doc.SaveToFile("DeleteAnnotationAttachments.pdf");
        }
    }
}
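
When an item is removed from an index-based collection inside a loop, the items after it move down by one position, so a loop that counts upward can step over the item that follows each removal. The sketch below shows the effect with a plain List<string> standing in for the annotation collection. The list, the "attachment" prefix test, and the class and method names are assumptions made for illustration; none of them are Spire.PDF types.

```csharp
using System;
using System.Collections.Generic;

namespace RemoveWhileIteratingSketch
{
    static class Demo
    {
        //Walks from the last index to the first, so a removal never shifts
        //an item that has not been checked yet
        public static List<string> RemoveBackward(List<string> items)
        {
            for (int j = items.Count - 1; j >= 0; j--)
            {
                if (items[j].StartsWith("attachment", StringComparison.Ordinal))
                {
                    items.RemoveAt(j);
                }
            }
            return items;
        }

        //Walks upward for comparison: after a removal the next item slides
        //into index j and is skipped when j is incremented
        public static List<string> RemoveForward(List<string> items)
        {
            for (int j = 0; j < items.Count; j++)
            {
                if (items[j].StartsWith("attachment", StringComparison.Ordinal))
                {
                    items.RemoveAt(j);
                }
            }
            return items;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var backward = Demo.RemoveBackward(new List<string> { "attachment-a", "attachment-b", "link" });
            Console.WriteLine(string.Join(", ", backward)); //prints "link"

            var forward = Demo.RemoveForward(new List<string> { "attachment-a", "attachment-b", "link" });
            Console.WriteLine(string.Join(", ", forward)); //prints "attachment-b, link"
        }
    }
}
```

The same consideration applies to any loop that removes annotation attachments from a page by index: counting down from Count - 1 to 0 visits every annotation exactly once.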

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
