Python: Import and Export PDF Form Data
PDF forms are widely used to collect information. Spire.PDF for Python can exchange form data in FDF, XFDF, and XML formats. Importing form data lets you update or pre-fill a PDF form with existing information, which saves time and reduces input errors. Exporting form data lets you pass the collected information to other applications without retyping it. This article shows how to import and export PDF form data in Python using Spire.PDF for Python.
- Import PDF Form Data from FDF, XFDF or XML Files in Python
- Export PDF Form Data to FDF, XFDF or XML Files in Python
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.PDF
If you are unsure how to install it, refer to this tutorial: How to Install Spire.PDF for Python on Windows
Import PDF Form Data from FDF, XFDF or XML Files in Python
Spire.PDF for Python offers the PdfFormWidget.ImportData() method for importing PDF form data from FDF, XFDF, or XML files. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get the form of the PDF document using PdfDocument.Form property.
- Import form data from an FDF, XFDF or XML file using PdfFormWidget.ImportData() method.
- Save the resulting document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()

# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Import PDF form data from an XML file
formWidget.ImportData("Data.xml", DataFormat.Xml)

# Import PDF form data from an FDF file
# formWidget.ImportData("Data.fdf", DataFormat.Fdf)

# Import PDF form data from an XFDF file
# formWidget.ImportData("Data.xfdf", DataFormat.XFdf)

# Save the resulting document
pdf.SaveToFile("Output.pdf")

# Close the PdfDocument object
pdf.Close()
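If you do not yet have a data file to import, one way to produce it is to write an XFDF document with the Python standard library. The sketch below is an illustration only: the helper name build_xfdf and the sample field names and values are assumptions, and it has not been verified that Spire.PDF accepts this exact output, so treat it as a starting point.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by XFDF documents
XFDF_URI = "http://ns.adobe.com/xfdf/"


def build_xfdf(values):
    """Return an XFDF document (UTF-8 bytes) with one <field> per dict entry.

    build_xfdf is a hypothetical helper, not part of Spire.PDF.
    """
    # Serialize the XFDF namespace as the default namespace (no prefix)
    ET.register_namespace("", XFDF_URI)
    root = ET.Element("{%s}xfdf" % XFDF_URI)
    fields = ET.SubElement(root, "{%s}fields" % XFDF_URI)
    for name, value in values.items():
        field = ET.SubElement(fields, "{%s}field" % XFDF_URI, {"name": name})
        ET.SubElement(field, "{%s}value" % XFDF_URI).text = str(value)
    return ET.tostring(root, encoding="utf-8")


# Sample data; the field names are assumptions about your form
xfdf_bytes = build_xfdf({"name": "Jackson Green", "country": "canada"})
print(xfdf_bytes.decode("utf-8"))
```

Writing xfdf_bytes to a file such as Data.xfdf (opened in "wb" mode) gives you an input for the ImportData() call with DataFormat.XFdf shown above, provided the field names match those in the PDF form.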
Export PDF Form Data to FDF, XFDF or XML Files in Python
Spire.PDF for Python also enables developers to export PDF form data to FDF, XFDF, or XML files by using the PdfFormWidget.ExportData() method. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get the form of the PDF document using PdfDocument.Form property.
- Export form data to an FDF, XFDF or XML file using PdfFormWidget.ExportData() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()

# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Export PDF form data to an XML file
formWidget.ExportData("Data.xml", DataFormat.Xml, "Form")

# Export PDF form data to an FDF file
# formWidget.ExportData("Data.fdf", DataFormat.Fdf, "Form")

# Export PDF form data to an XFDF file
# formWidget.ExportData("Data.xfdf", DataFormat.XFdf, "Form")

# Close the PdfDocument object
pdf.Close()
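Once the data has been exported, another program can read it back without Spire.PDF. The following sketch parses an XFDF document into a dictionary using only the standard library. It assumes the exported file follows the usual XFDF layout with flat (non-nested) field names. The helper name read_xfdf_fields and the sample document are illustrative, not output captured from the library.

```python
import xml.etree.ElementTree as ET

# Tag prefix for elements in the XFDF namespace
XFDF_TAG_PREFIX = "{http://ns.adobe.com/xfdf/}"


def read_xfdf_fields(xml_text):
    """Map each field name to its value text (None when no <value> exists).

    read_xfdf_fields is a hypothetical helper, not part of Spire.PDF.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for field in root.iter(XFDF_TAG_PREFIX + "field"):
        value = field.find(XFDF_TAG_PREFIX + "value")
        result[field.get("name")] = value.text if value is not None else None
    return result


# Hand-written sample standing in for an exported Data.xfdf file
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/">
  <fields>
    <field name="name"><value>Jackson Green</value></field>
    <field name="degree"><value>bachelor</value></field>
  </fields>
</xfdf>"""

print(read_xfdf_fields(SAMPLE_XFDF))
```

Hierarchical forms nest field elements inside one another. This flat reader would report the parent entries with a value of None, so it would need extending for such files.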
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to lift the feature limitations, you can request a 30-day trial license.
Python: Remove Forms from a PDF Document
Interactive forms in PDFs are valuable tools that allow users to fill in information, complete surveys, or sign documents electronically. However, these forms can also add layers of complexity to a PDF, impacting both file size and the overall user experience. When forms are no longer needed, or when a document needs to be simplified for distribution or archiving, removing these interactive elements can be beneficial. In this article, we will demonstrate how to remove forms from a PDF document in Python using Spire.PDF for Python.
Remove a Specific Form from a PDF Document in Python
Spire.PDF for Python allows you to remove specific form fields from a PDF file by using either the indexes or the names of the form fields. The detailed steps are as follows.
- Create an instance of the PdfDocument class.
- Load a PDF document containing form fields using the PdfDocument.LoadFromFile() method.
- Get the form of the document using the PdfDocument.Form property.
- Get the form field collection using the PdfFormWidget.FieldsWidget property.
- Remove a specific form field by its index using the PdfFormFieldWidgetCollection.RemoveAt(index) method. Or retrieve a form field by its name using the PdfFormFieldWidgetCollection[name] property, and then remove it using the PdfFormFieldWidgetCollection.Remove(field) method.
- Save the resulting document using the PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create an instance of the PdfDocument class
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = doc.Form
formWidget = PdfFormWidget(pdfForm)

# Get the form field collection
field_collection = formWidget.FieldsWidget

# Remove a specific form field by its index
field_collection.RemoveAt(0)

# Or remove a specific form field by its name
# text_box = field_collection["Name"]
# field_collection.Remove(text_box)

# Save the resulting document
doc.SaveToFile("remove_specific_form.pdf")
doc.Close()
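When several fields have to go, it can be convenient to remove them by a set of names in one pass. The sketch below uses only the members that appear in this article (Count, get_Item(), Name and RemoveAt()). The FakeField and FakeFieldCollection classes are hypothetical stand-ins so that the example runs without a PDF file, and remove_fields_named is an illustrative helper, not a library function.

```python
class FakeField:
    """Hypothetical stand-in for a form field; only Name is modelled."""

    def __init__(self, name):
        self.Name = name


class FakeFieldCollection:
    """Hypothetical stand-in exposing Count, get_Item(index) and RemoveAt(index)."""

    def __init__(self, names):
        self._fields = [FakeField(n) for n in names]

    @property
    def Count(self):
        return len(self._fields)

    def get_Item(self, index):
        return self._fields[index]

    def RemoveAt(self, index):
        del self._fields[index]


def remove_fields_named(collection, names):
    """Remove every field whose Name is in `names`; return the number removed."""
    removed = 0
    # Walk backward so a removal never shifts an index we still have to visit
    for i in range(collection.Count - 1, -1, -1):
        if collection.get_Item(i).Name in names:
            collection.RemoveAt(i)
            removed += 1
    return removed


fields = FakeFieldCollection(["name", "male", "female", "country"])
removed = remove_fields_named(fields, {"male", "female"})
remaining = [fields.get_Item(i).Name for i in range(fields.Count)]
print(removed, remaining)
```

In a real script you would presumably pass formWidget.FieldsWidget in place of the stand-in collection. That usage has not been tested here.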
Remove All Forms from a PDF Document in Python
To remove all form fields from a PDF document, you need to iterate through the form field collection, and then remove each form field from the collection using the PdfFormFieldWidgetCollection.RemoveAt(index) method. The detailed steps are as follows.
- Create an instance of the PdfDocument class.
- Load a PDF document containing form fields using the PdfDocument.LoadFromFile() method.
- Get the form of the document using the PdfDocument.Form property.
- Get the form field collection using the PdfFormWidget.FieldsWidget property.
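- Iterate through all form fields in the collection, from the last index to the first so that removals do not shift the indexes of the fields still to be processed.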
- Remove each form field from the collection using the PdfFormFieldWidgetCollection.RemoveAt(index) method.
- Save the resulting document using the PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create an instance of the PdfDocument class
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = doc.Form
formWidget = PdfFormWidget(pdfForm)

# Get the form field collection
field_collection = formWidget.FieldsWidget

# Check if there are any form fields in the collection
if field_collection.Count > 0:
    # Iterate through all form fields in the collection, last to first
    for i in range(field_collection.Count - 1, -1, -1):
        # Remove the current form field from the collection
        field_collection.RemoveAt(i)

# Save the resulting document
doc.SaveToFile("remove_all_forms.pdf")
doc.Close()
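The loop above counts down from the last index rather than up from zero. The reason is easiest to see on a plain Python list, which shifts its elements after a removal in the same way as an index-based collection. The two helper functions below are illustrative only and do not use Spire.PDF.

```python
def remove_forward(items):
    """Try to delete every element while counting up; some elements survive."""
    items = list(items)
    for i in range(len(items)):
        # Guard needed because the list shrinks while the index keeps growing
        if i < len(items):
            del items[i]
    return items


def remove_backward(items):
    """Delete every element while counting down; nothing is skipped."""
    items = list(items)
    for i in range(len(items) - 1, -1, -1):
        del items[i]
    return items


field_names = ["name", "male", "female", "country"]
print(remove_forward(field_names))   # ['male', 'country']
print(remove_backward(field_names))  # []
```

Counting up skips every second element because each deletion moves the next element into the slot that was just visited. Counting down avoids this, which is why the article's loop runs from Count - 1 to 0.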
Python: Flatten Forms in PDF
Flattening forms in PDF means transforming the interactive form fields (such as text boxes, checkboxes, and drop-down menus) into static content. Once a form is flattened, it cannot be edited or filled out anymore. When you need to maintain a permanent and unalterable record of a completed form, flattening is essential. This ensures that the data entered into the form fields cannot be modified or tampered with, providing a reliable reference for future use. In this article, we will demonstrate how to flatten forms in PDF in Python using Spire.PDF for Python.
Flatten All Forms in a PDF in Python
Spire.PDF for Python provides the PdfDocument.Form.IsFlatten property, which enables you to flatten all forms in a PDF file. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Flatten all forms in the PDF file by setting the PdfDocument.Form.IsFlatten property to True.
- Save the result file using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Specify the input and output PDF file paths
input_file = "Form.pdf"
output_file = "FlattenAll.pdf"

# Create an object of the PdfDocument class
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile(input_file)

# Flatten all forms in the PDF file
doc.Form.IsFlatten = True

# Save the result file
doc.SaveToFile(output_file)
doc.Close()
Flatten a Specific Form in a PDF in Python
To flatten a specific form in a PDF file, you can use the PdfField.Flatten property. The detailed steps are as follows.
- Create an object of the PdfDocument class.
- Load a PDF file using the PdfDocument.LoadFromFile() method.
- Get the forms of the PDF file using PdfDocument.Form property.
- Get a specific form by its index or name using PdfFormWidget.FieldsWidget.get_Item() method.
- Flatten the form by setting the PdfField.Flatten property to True.
- Save the result file using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Specify the input and output PDF file paths
input_file = "Form.pdf"
output_file = "FlattenSpecific.pdf"

# Create an object of the PdfDocument class
doc = PdfDocument()

# Load a PDF file
doc.LoadFromFile(input_file)

# Get the forms of the PDF file
loadedForm = doc.Form

# Get a specific form by its index or name
formWidget = PdfFormWidget(loadedForm)
form = formWidget.FieldsWidget.get_Item(2)
# form = formWidget.FieldsWidget.get_Item("Address")

# Flatten the specific form
form.Flatten = True

# Save the result file
doc.SaveToFile(output_file)
doc.Close()
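To flatten several fields while leaving the rest editable, the same Flatten property can be set in a loop. The sketch below is an illustration under assumptions: iter_fields and flatten_named are hypothetical helpers, and SimpleNamespace objects stand in for real form fields so that the code runs without the library. Only the Count, get_Item(), Name and Flatten members come from the article.

```python
from types import SimpleNamespace


def iter_fields(collection):
    """Yield each field of a collection that exposes Count and get_Item(index)."""
    for i in range(collection.Count):
        yield collection.get_Item(i)


def flatten_named(fields, names):
    """Set Flatten = True on fields whose Name is in `names`; return those names."""
    flattened = []
    for field in fields:
        if field.Name in names:
            field.Flatten = True
            flattened.append(field.Name)
    return flattened


# Stand-in fields; a real script would use iter_fields(formWidget.FieldsWidget)
sample_fields = [
    SimpleNamespace(Name="Name", Flatten=False),
    SimpleNamespace(Name="Address", Flatten=False),
    SimpleNamespace(Name="Comments", Flatten=False),
]
done = flatten_named(sample_fields, {"Name", "Address"})
print(done, [f.Flatten for f in sample_fields])
```

Fields not listed in the name set keep their original state, so they remain interactive after the document is saved.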
Python: Extract Form Field Values from PDF
PDF forms are commonly used to collect user information, and extracting form values programmatically allows for automated processing of submitted data, ensuring accurate data collection and analysis. After extraction, you can generate reports based on form field values or migrate them to other systems or databases. In this article, you will learn how to extract form field values from PDF with Python using Spire.PDF for Python.
Extract Form Field Values from PDF with Python
Spire.PDF for Python supports various types of PDF form fields, including:
- Text box field (represented by the PdfTextBoxFieldWidget class)
- Check box field (represented by the PdfCheckBoxWidgetFieldWidget class)
- Radio button field (represented by the PdfRadioButtonListFieldWidget class)
- List box field (represented by the PdfListBoxWidgetFieldWidget class)
- Combo box field (represented by the PdfComboBoxWidgetFieldWidget class)
Before extracting data from the PDF forms, it is necessary to determine the specific type of each form field first, and then you can use the properties of the corresponding form field class to extract their values accurately. The following are the detailed steps.
- Initialize an instance of the PdfDocument class.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Get the form in the PDF document using PdfDocument.Form property.
- Create a list to store the extracted form field values.
- Iterate through all fields in the PDF form.
- Determine the types of the form fields, then get the names and values of the form fields using the corresponding properties.
- Write the results to a text file.
- Python
from spire.pdf.common import *
from spire.pdf import *

inputFile = "Forms.pdf"
outputFile = "GetFormFieldValues.txt"

# Create a PdfDocument instance
pdf = PdfDocument()

# Load a PDF document
pdf.LoadFromFile(inputFile)

# Get PDF forms
pdfform = pdf.Form
formWidget = PdfFormWidget(pdfform)
sb = []

# Iterate through all fields in the form
if formWidget.FieldsWidget.Count > 0:
    for i in range(formWidget.FieldsWidget.Count):
        field = formWidget.FieldsWidget.get_Item(i)

        # Get the name and value of the textbox field
        if isinstance(field, PdfTextBoxFieldWidget):
            textBoxField = field
            name = textBoxField.Name
            value = textBoxField.Text
            sb.append("Textbox Name: " + name + "\r")
            sb.append("Textbox Value: " + value + "\r\n")

        # Get the name of the listbox field
        if isinstance(field, PdfListBoxWidgetFieldWidget):
            listBoxField = field
            name = listBoxField.Name
            sb.append("Listbox Name: " + name + "\r")
            # Get the items of the listbox field
            sb.append("Listbox Items: \r")
            items = listBoxField.Values
            for j in range(items.Count):
                item = items.get_Item(j)
                sb.append(item.Value + "\r")
            # Get the selected item of the listbox field
            selectedValue = listBoxField.SelectedValue
            sb.append("Listbox Selected Value: " + selectedValue + "\r\n")

        # Get the name of the combo box field
        if isinstance(field, PdfComboBoxWidgetFieldWidget):
            comBoxField = field
            name = comBoxField.Name
            sb.append("Combobox Name: " + name + "\r")
            # Get the items of the combo box field
            sb.append("Combobox Items: \r")
            items = comBoxField.Values
            for j in range(items.Count):
                item = items.get_Item(j)
                sb.append(item.Value + "\r")
            # Get the selected item of the combo box field
            selectedValue = comBoxField.SelectedValue
            sb.append("Combobox Selected Value: " + selectedValue + "\r\n")

        # Get the name and selected item of the radio button field
        if isinstance(field, PdfRadioButtonListFieldWidget):
            radioBtnField = field
            name = radioBtnField.Name
            selectedValue = radioBtnField.SelectedValue
            sb.append("Radio Button Name: " + name + "\r")
            sb.append("Radio Button Selected Value: " + selectedValue + "\r\n")

        # Get the name and status of the checkbox field
        if isinstance(field, PdfCheckBoxWidgetFieldWidget):
            checkBoxField = field
            name = checkBoxField.Name
            sb.append("Checkbox Name: " + name + "\r")
            state = checkBoxField.Checked
            stateValue = "Yes" if state else "No"
            sb.append("If the checkBox is checked: " + stateValue + "\r\n")

# Write the results to a text file
with open(outputFile, "w", encoding="UTF-8") as f2:
    for item in sb:
        f2.write(item)

pdf.Close()
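If the extracted values are meant for a spreadsheet or a database import rather than a human reader, a CSV layout may be easier to process than free text. The sketch below assumes you collect one (type, name, value) tuple per field inside the loop instead of appending formatted strings. The record layout, the records_to_csv helper and the sample values are all illustrative.

```python
import csv
import io


def records_to_csv(records):
    """Render (type, name, value) tuples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["type", "name", "value"])
    writer.writerows(records)
    return buffer.getvalue()


# Sample records standing in for values read from a real form
records = [
    ("Textbox", "name", "Jackson Green"),
    ("Checkbox", "male", "Yes"),
    ("Combobox", "degree", "bachelor"),
]
csv_text = records_to_csv(records)
print(csv_text)
```

The csv module quotes any value that contains a comma, a quote character or a line break, so free-text answers do not break the column layout.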
Python: Create or Fill in a Form in PDF
Creating a form in PDF not only ensures a professional appearance but also allows users to fill out and submit data electronically, streamlining data entry processes. Whether you are collecting survey responses, gathering client information, or creating employment applications, the ability to generate interactive PDF forms offers a seamless and organized way to capture, store, and manage valuable data. In this article, you will learn how to create a fillable PDF form as well as how to fill in a PDF form using Spire.PDF for Python.
Create a Fillable Form in PDF in Python
Spire.PDF for Python provides a range of helpful classes that enable programmers to generate and modify different types of form fields in PDF files. These include text boxes, check boxes, combo boxes, list boxes, and radio buttons. The table below lists some of the classes involved in this tutorial.
Class | Description |
PdfForm | Represents interactive form of the PDF document. |
PdfField | Represents field of the PDF document's interactive form. |
PdfTextBoxField | Represents text box field in the PDF form. |
PdfCheckBoxField | Represents check box field in the PDF form. |
PdfComboBoxField | Represents combo box field in the PDF Form. |
PdfListBoxField | Represents list box field of the PDF form. |
PdfListFieldItem | Represents an item of a list field. |
PdfRadioButtonListField | Represents radio button field in the PDF form. |
PdfRadioButtonListItem | Represents an item of a radio button list. |
PdfButtonField | Represents button field in the PDF form. |
To generate a PDF form, start by creating an instance of the respective field class. Set the field's size and position in the document using the Bounds property, and finally, add it to the PDF using the PdfFormFieldCollection.Add() method. The following are the main steps to create various types of form fields in a PDF document using Spire.PDF for Python.
- Create a PdfDocument object.
- Add a page using PdfDocument.Pages.Add() method.
- Create a PdfTextBoxField object, set the properties of the field including Bounds, Font and Text, and then add it to the document using PdfFormFieldCollection.Add() method.
- Repeat the previous step to add check boxes, a list box, radio buttons, a combo box, and a button to the document.
- Save the document to a PDF file using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Add a page
page = doc.Pages.Add()

# Initialize x and y coordinates
baseX = 100.0
baseY = 30.0

# Create two brush objects
brush1 = PdfSolidBrush(PdfRGBColor(Color.get_Blue()))
brush2 = PdfSolidBrush(PdfRGBColor(Color.get_Black()))

# Create a font
font = PdfFont(PdfFontFamily.TimesRoman, 12.0, PdfFontStyle.Regular)

# Add a textbox
page.Canvas.DrawString("Name:", font, brush1, PointF(10.0, baseY))
tbxBounds = RectangleF(baseX, baseY, 150.0, 15.0)
textBox = PdfTextBoxField(page, "name")
textBox.Bounds = tbxBounds
textBox.Font = font
doc.Form.Fields.Add(textBox)
baseY += 30.0

# Add two checkboxes
page.Canvas.DrawString("Gender:", font, brush1, PointF(10.0, baseY))
checkboxBound1 = RectangleF(baseX, baseY, 15.0, 15.0)
checkBoxField1 = PdfCheckBoxField(page, "male")
checkBoxField1.Bounds = checkboxBound1
checkBoxField1.Checked = False
page.Canvas.DrawString("Male", font, brush2, PointF(baseX + 20.0, baseY))
checkboxBound2 = RectangleF(baseX + 70.0, baseY, 15.0, 15.0)
checkBoxField2 = PdfCheckBoxField(page, "female")
checkBoxField2.Bounds = checkboxBound2
checkBoxField2.Checked = False
page.Canvas.DrawString("Female", font, brush2, PointF(baseX + 90.0, baseY))
doc.Form.Fields.Add(checkBoxField1)
doc.Form.Fields.Add(checkBoxField2)
baseY += 30.0

# Add a listbox
page.Canvas.DrawString("Country:", font, brush1, PointF(10.0, baseY))
listboxBound = RectangleF(baseX, baseY, 150.0, 50.0)
listBoxField = PdfListBoxField(page, "country")
listBoxField.Items.Add(PdfListFieldItem("USA", "usa"))
listBoxField.Items.Add(PdfListFieldItem("Canada", "canada"))
listBoxField.Items.Add(PdfListFieldItem("Mexico", "mexico"))
listBoxField.Bounds = listboxBound
listBoxField.Font = font
doc.Form.Fields.Add(listBoxField)
baseY += 60.0

# Add two radio buttons
page.Canvas.DrawString("Hobbies:", font, brush1, PointF(10.0, baseY))
radioButtonListField = PdfRadioButtonListField(page, "hobbies")
radioItem1 = PdfRadioButtonListItem("travel")
radioBound1 = RectangleF(baseX, baseY, 15.0, 15.0)
radioItem1.Bounds = radioBound1
page.Canvas.DrawString("Travel", font, brush2, PointF(baseX + 20.0, baseY))
radioItem2 = PdfRadioButtonListItem("movie")
radioBound2 = RectangleF(baseX + 70.0, baseY, 15.0, 15.0)
radioItem2.Bounds = radioBound2
page.Canvas.DrawString("Movie", font, brush2, PointF(baseX + 90.0, baseY))
radioButtonListField.Items.Add(radioItem1)
radioButtonListField.Items.Add(radioItem2)
doc.Form.Fields.Add(radioButtonListField)
baseY += 30.0

# Add a combobox
page.Canvas.DrawString("Degree:", font, brush1, PointF(10.0, baseY))
cmbBounds = RectangleF(baseX, baseY, 150.0, 15.0)
comboBoxField = PdfComboBoxField(page, "degree")
comboBoxField.Bounds = cmbBounds
comboBoxField.Items.Add(PdfListFieldItem("Bachelor", "bachelor"))
comboBoxField.Items.Add(PdfListFieldItem("Master", "master"))
comboBoxField.Items.Add(PdfListFieldItem("Doctor", "doctor"))
comboBoxField.Font = font
doc.Form.Fields.Add(comboBoxField)
baseY += 30.0

# Add a button
page.Canvas.DrawString("Button:", font, brush1, PointF(10.0, baseY))
btnBounds = RectangleF(baseX, baseY, 50.0, 15.0)
buttonField = PdfButtonField(page, "button")
buttonField.Bounds = btnBounds
buttonField.Text = "Submit"
buttonField.Font = font
submitAction = PdfSubmitAction("https://www.e-iceblue.com/getformvalues.php")
buttonField.Actions.MouseDown = submitAction
doc.Form.Fields.Add(buttonField)

# Save to file
doc.SaveToFile("output/Form.pdf", FileFormat.PDF)
doc.Close()
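The example positions each row by adding a fixed amount to baseY after every field. When a form grows, it can help to compute all row positions up front so that the spacing lives in one place. The sketch below does this with itertools.accumulate (Python 3.8 or later for the initial argument). The row_offsets helper and the row labels are illustrative, while the numbers mirror the increments used in the code above.

```python
from itertools import accumulate


def row_offsets(start_y, advances):
    """Return the y coordinate of each row.

    `advances` holds the vertical gap added after each row except the last,
    so the result has len(advances) + 1 entries.
    """
    return list(accumulate(advances, initial=start_y))


# Row labels and gaps taken from the form built above
rows = ["name", "gender", "country", "hobbies", "degree", "button"]
advances = [30.0, 30.0, 60.0, 30.0, 30.0]
layout = dict(zip(rows, row_offsets(30.0, advances)))
print(layout)
```

Each value in layout could then be passed as the y argument of the RectangleF and PointF calls for that row. The list box row gets the larger gap because it is 50 points tall.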
Fill in a PDF Form in Python
Filling in a form involves getting all form fields from the PDF document, locating a specific field by its type and name, and then entering a value or selecting one from a predefined list. The following are the detailed steps.
- Create a PdfDocument object.
- Load a sample PDF document using PdfDocument.LoadFromFile() method.
- Get the form from the document through PdfDocument.Form property.
- Get the form widget collection through PdfFormWidget.FieldsWidget property.
- Get a specific form field by its type and name.
- Enter a value or select a value from the predefined list for the field.
- Save the document to a PDF file using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document containing form fields
doc.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Form.pdf")

# Get form from the document
form = doc.Form
formWidget = PdfFormWidget(form)

# Get form widget collection
formWidgetCollection = formWidget.FieldsWidget

# If the collection is not empty
if formWidgetCollection.Count > 0:

    # Loop through the elements in the form widget collection
    for i in range(formWidgetCollection.Count):

        # Get a specific field
        field = formWidgetCollection.get_Item(i)

        # Determine if a field is a textbox
        if isinstance(field, PdfTextBoxFieldWidget):
            textBoxField = field
            # Determine if the name of the text box is "name"
            if textBoxField.Name == "name":
                # Add text to the text box
                textBoxField.Text = "Jackson Green"

        # Choose an item from the list box
        if isinstance(field, PdfListBoxWidgetFieldWidget):
            listBoxField = field
            if listBoxField.Name == "country":
                index = [1]
                listBoxField.SelectedIndex = index

        # Choose an item from the combo box
        if isinstance(field, PdfComboBoxWidgetFieldWidget):
            comBoxField = field
            if comBoxField.Name == "degree":
                items = [0]
                comBoxField.SelectedIndex = items

        # Select an item in the radio buttons
        if isinstance(field, PdfRadioButtonListFieldWidget):
            radioBtnField = field
            if radioBtnField.Name == "hobbies":
                radioBtnField.SelectedIndex = 1

        # Check the specified check box
        if isinstance(field, PdfCheckBoxWidgetFieldWidget):
            checkBoxField = field
            if checkBoxField.Name == "male":
                checkBoxField.Checked = True

# Save the document
doc.SaveToFile("output/FillForm.pdf")
doc.Close()
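When the values come from a database or a spreadsheet, hard-coding one if block per field becomes tedious. One possible alternative is to describe the desired values in a dictionary and apply them in a loop. The sketch below is an illustration under assumptions: apply_fill_plan and FILL_PLAN are hypothetical names, the SimpleNamespace objects are stand-ins for real form fields, and setting library properties through setattr has not been verified against Spire.PDF. The property names Text, SelectedIndex and Checked are the ones used in the code above.

```python
from types import SimpleNamespace

# Field name -> (property to set, value); names match the form created earlier
FILL_PLAN = {
    "name": ("Text", "Jackson Green"),
    "hobbies": ("SelectedIndex", 1),
    "male": ("Checked", True),
}


def apply_fill_plan(fields, plan):
    """Set the planned property on every field named in `plan`.

    Returns the names of the fields that were filled, in iteration order.
    """
    filled = []
    for field in fields:
        if field.Name in plan:
            attribute, value = plan[field.Name]
            setattr(field, attribute, value)
            filled.append(field.Name)
    return filled


# Stand-in fields so the sketch runs without a PDF file
sample_fields = [
    SimpleNamespace(Name="name", Text=""),
    SimpleNamespace(Name="hobbies", SelectedIndex=0),
    SimpleNamespace(Name="male", Checked=False),
    SimpleNamespace(Name="female", Checked=False),
]
filled = apply_fill_plan(sample_fields, FILL_PLAN)
print(filled)
```

Fields that are absent from the plan, such as "female" here, are left untouched. The returned list makes it easy to log which fields were actually found in the document.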