Hyperlinks in PDF documents are commonly used tools for navigating to internal or external related information. However, these links need to be accurate and up-to-date in order to be effective. Document editors are supposed to have the power to change or delete hyperlinks to update outdated references, rectify errors, comply with evolving web standards, or enhance accessibility. This article will demonstrate how to use Spire.PDF for Python to modify or remove hyperlinks in PDF documents to ensure accurate information dissemination, seamless navigation, and inclusive user experience.
- Update the Target Address of Hyperlinks in PDF Using Python
- Remove Hyperlinks in PDF Documents Using Python
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
Update the Target Address of Hyperlinks in PDF Using Python
In PDF documents, hyperlinks are annotations displayed on the linked content on a page. Therefore, to modify hyperlinks in PDF documents, it is needed to retrieve all annotations on a page through PdfPageBase.AnnotationsWidget property. Then, a specific hyperlink annotation can be obtained from the annotation list and the target address can be updated through PdfTextWebLinkAnnotationWidget.Url property. The detailed steps are as follows:
- Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
- Get a page of the document using PdfDocument.Pages.get_Item() method.
- Get all annotations on the page through PdfPageBase.AnnotationsWidget property.
- Get a hyperlink annotation and cast it to a PdfTextWebLinkAnnotationWidget object.
- Set a new target address for the hyperlink annotation through PdfTextWebLinkAnnotationWidget.Url property.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf.common import * from spire.pdf import * # Create an object of PdfDocument class and load a PDF document pdf = PdfDocument() pdf.LoadFromFile("Sample.pdf") # Get the first page of the document page = pdf.Pages.get_Item(0) # Get all annotations on the page widgetCollection = page.AnnotationsWidget # Get the second hyperlink annotation annotation = widgetCollection.get_Item(1) # Cast the hyperlink annotation to a PdfTextWebLinkAnnotationWidget object link = PdfTextWebLinkAnnotationWidget(annotation) # Set a new target address for the second hyperlink link.Url = "https://www.mcafee.com/learn/understanding-trojan-viruses-and-how-to-get-rid-of-them/" #Save the document pdf.SaveToFile("output/ModifyPDFHyperlink.pdf") pdf.Close()
Remove Hyperlinks in PDF Documents Using Python
Spire.PDF for Python enables developers to effortlessly remove specific hyperlinks on a page using the PdfPageBase.AnnotationsWidget.RemoveAt() method. Additionally, developers can also iterate through each page and its annotations to identify and eliminate all hyperlink annotations in the entire PDF document with the help of Spire.PDF for Python. The detailed steps are as follows:
- Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
- To remove a specific hyperlink, get a page in the document using PdfDocument.Pages.get_Item() method and remove the hyperlink annotation using PdfPageBase.AnnotationsWidget.RemoveAt() method.
- To remove all hyperlinks in the document, loop through the pages in the document to get the annotations on each page through PdfPageBase.AnnotationsWidget property.
- Loop through the annotations to check if each annotation is an instance of PdfTextWebLinkAnnotationWidget class. If it is, remove it using PdfAnnotationCollection.Remove() method.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf import * from spire.pdf.common import * # # Create an object of PdfDocument class and load a PDF document pdf = PdfDocument() pdf.LoadFromFile("Sample.pdf") # Remove the first hyperlink on the first page #page = pdf.Pages.get_Item(0) #page.AnnotationsWidget.RemoveAt(0) # Remove all hyperlinks # Loop through the pages in the document for j in range(pdf.Pages.Count): # Get each page page = pdf.Pages.get_Item(j) # Get the annotations on each page annotations = page.AnnotationsWidget # Check if there is any annotations on a page if annotations.Count > 0: # Loop through the annotations i = annotations.Count - 1 while i >=0: # Get an annotation annotation = annotations.get_Item(i) # Check if each annotation is a hyperlink if isinstance(annotation, PdfTextWebLinkAnnotationWidget): # Remove hyperlink annotations annotations.Remove(annotation) i -= 1 # Save the document pdf.SaveToFile("output/RemovePDFHyperlink.pdf") pdf.Close()
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.