Finding and highlighting text within a PDF document is a crucial task for many individuals and organizations. Whether you're a student conducting research, a professional reviewing contracts, or an archivist organizing digital records, the ability to quickly locate and emphasize specific information is invaluable.
In this article, you will learn how to find and highlight text in a PDF document in Java using the Spire.PDF for Java library.
- Find and Highlight Text in a Specific Page in Java
- Find and Highlight Text in a Rectangular Area in Java
- Find and Highlight Text in an Entire PDF Document in Java
- Find and Highlight Text in PDF Using a Regular Expression in Java
Install Spire.PDF for Java
First of all, you're required to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories> <repository> <id>com.e-iceblue</id> <name>e-iceblue</name> <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url> </repository> </repositories> <dependencies> <dependency> <groupId>e-iceblue</groupId> <artifactId>spire.pdf</artifactId> <version>10.10.7</version> </dependency> </dependencies>
Find and Highlight Text in a Specific Page in Java
In Spire.PDF for Java, you can utilize the PdfTextFinder class to locate specific text within a page. Prior to executing the find operation, you can set the search options such as WholeWord and IgnoreCase by utilizing the PdfTextFinder.getOptions.setTextFindParameter() method. Once the text is located, you can apply highlighting to visually differentiate the text.
The following are the steps to find and highlight text in a specific page in PDF using Java.
- Create a PdfDocument object.
- Load a PDF file from a given path.
- Get a specific page from the document.
- Create a PdfTextFinder object based on the page.
- Specify search options using PdfTextFinder.getOptions().setTextFindParameter() method.
- Find all instance of searched text using PdfTextFinder.find() method.
- Iterate through the find results, and highlight each instance using PdfTextFragment.highlight() method.
- Save the document to a different PDF file.
- Java
import com.spire.ms.System.Collections.Generic.List; import com.spire.pdf.FileFormat; import com.spire.pdf.PdfDocument; import com.spire.pdf.PdfPageBase; import com.spire.pdf.texts.PdfTextFinder; import com.spire.pdf.texts.PdfTextFragment; import com.spire.pdf.texts.TextFindParameter; import java.awt.*; import java.util.EnumSet; public class FindAndHighlightTextInPage { public static void main(String[] args) { // Create a PdfDocument object PdfDocument doc = new PdfDocument(); // Load a PDF file doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf"); // Get a specific page PdfPageBase page = doc.getPages().get(0); // Create a PdfTextFinder object based on the page PdfTextFinder finder = new PdfTextFinder(page); // Specify the find options finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.WholeWord)); finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.IgnoreCase)); // Find the instances of the specified text List<PdfTextFragment> results = finder.find("MySQL"); // Iterate through the find results for (PdfTextFragment textFragment: results) { // Highlight text textFragment.highLight(Color.LIGHT_GRAY); } // Save to a different PDF file doc.saveToFile("output/HighlightTextInPage.pdf", FileFormat.PDF); // Dispose resources doc.dispose(); } }
Find and Highlight Text in a Rectangular Area in Java
To draw attention to a specific section or piece of information within a document, users can find and highlight specified text within a rectangular area of a page. The rectangular region can be defined by utilizing the PdfTextFinder.getOptions().setFindArea() method.
The following are the steps to find and highlight text in a rectangular area of a PDF page using Java.
- Create a PdfDocument object.
- Load a PDF file from a given path.
- Get a specific page from the document.
- Create a PdfTextFinder object based on the page.
- Specify search options using PdfTextFinder.getOptions().setTextFindParameter() method.
- Find all instance of searched text within the rectangular area using PdfTextFinder.find() method.
- Iterate through the find results, and highlight each instance using PdfTextFragment.fighlight() method.
- Save the document to a different PDF file.
- Java
import com.spire.ms.System.Collections.Generic.List; import com.spire.pdf.FileFormat; import com.spire.pdf.PdfDocument; import com.spire.pdf.PdfPageBase; import com.spire.pdf.texts.PdfTextFinder; import com.spire.pdf.texts.PdfTextFragment; import com.spire.pdf.texts.TextFindParameter; import java.awt.*; import java.awt.geom.Rectangle2D; import java.util.EnumSet; public class FindAndHighlightTextInRectangle { public static void main(String[] args) { // Create a PdfDocument object PdfDocument doc = new PdfDocument(); // Load a PDF file doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf"); // Get a specific page PdfPageBase page = doc.getPages().get(0); // Create a PdfTextFinder object based on the page PdfTextFinder finder = new PdfTextFinder(page); // Specify a rectangular area for searching text finder.getOptions().setFindArea(new Rectangle2D.Float(0,0,841,180)); // Specify other options finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.WholeWord)); finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.IgnoreCase)); // Find the instances of the specified text in the rectangular area List<PdfTextFragment> results = finder.find("MySQL"); // Iterate through the find results for (PdfTextFragment textFragment: results) { // Highlight text textFragment.highLight(Color.LIGHT_GRAY); } // Save to a different PDF file doc.saveToFile("output/HighlightTextInRectangle.pdf", FileFormat.PDF); // Dispose resources doc.dispose(); } }
Find and Highlight Text in an Entire PDF Document in Java
The first code example provides a demonstration of how to highlight text on a specific page. To highlight text throughout the entire document, you can traverse each page of the document, perform the search operation, and apply the highlighting to the identified text.
The steps to find and highlight text in an entire PDF document using Java are as follows.
- Create a PdfDocument object.
- Load a PDF file from a given path.
- Iterate through each page in the document.
- Create a PdfTextFinder object based on a certain page.
- Specify search options using PdfTextFinder.getOptions().setTextFindParameter() method.
- Find all instance of searched text using PdfTextFinder.find() method.
- Iterate through the find results, and highlight each instance using PdfTextFragment.fighlight() method.
- Save the document to a different PDF file.
- Java
import com.spire.ms.System.Collections.Generic.List; import com.spire.pdf.FileFormat; import com.spire.pdf.PdfDocument; import com.spire.pdf.PdfPageBase; import com.spire.pdf.texts.PdfTextFinder; import com.spire.pdf.texts.PdfTextFragment; import com.spire.pdf.texts.TextFindParameter; import java.awt.*; import java.util.EnumSet; public class FindAndHighlightTextInDocument { public static void main(String[] args) { // Create a PdfDocument object PdfDocument doc = new PdfDocument(); // Load a PDF file doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf"); // Iterate through the pages in the PDF file for (Object pageObj : doc.getPages()) { // Get a specific page PdfPageBase page = (PdfPageBase) pageObj; // Create a PdfTextFinder object based on the page PdfTextFinder finder = new PdfTextFinder(page); // Specify the find options finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.WholeWord)); finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.IgnoreCase)); // Find the instances of the specified text List<PdfTextFragment> results = finder.find("MySQL"); // Iterate through the find results for (PdfTextFragment textFragment: results) { // Highlight text textFragment.highLight(Color.LIGHT_GRAY); } } // Save to a different PDF file doc.saveToFile("output/HighlightTextInDocument.pdf", FileFormat.PDF); // Dispose resources doc.dispose(); } }
Find and Highlight Text in PDF Using a Regular Expression in Java
When you're looking for specific text within a document, regular expressions offer enhanced flexibility and control over the search criteria. To make use of a regular expression, you'll need to set the TextFindParameter as Regex and supply the desired regular expression pattern as input to the find()method.
The following are the steps to find and highlight text in PDF using a regular expression using Java.
- Create a PdfDocument object.
- Load a PDF file from a given path.
- Iterate through each page in the document.
- Create a PdfTextFinder object based on a certain page.
- Set the TextFindParameter as Regex using PdfTextFinder.getOptions().setTextFindParameter() method.
- Create a regular expression pattern that matches the specific text you are searching for.
- Find all instance of the searched text using PdfTextFinder.find() method.
- Iterate through the find results, and highlight each instance using PdfTextFragment.fighlight() method.
- Save the document to a different PDF file.
- Java
import com.spire.ms.System.Collections.Generic.List; import com.spire.pdf.FileFormat; import com.spire.pdf.PdfDocument; import com.spire.pdf.PdfPageBase; import com.spire.pdf.texts.PdfTextFinder; import com.spire.pdf.texts.PdfTextFragment; import com.spire.pdf.texts.TextFindParameter; import java.awt.*; import java.util.EnumSet; public class FindAndHighlightTextUsingRegex { public static void main(String[] args) { // Create a PdfDocument object PdfDocument doc = new PdfDocument(); // Load a PDF file doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf"); // Iterate through the pages in the PDF file for (Object pageObj : doc.getPages()) { // Get a specific page PdfPageBase page = (PdfPageBase) pageObj; // Create a PdfTextFinder object based on the page PdfTextFinder finder = new PdfTextFinder(page); // Specify the search model as Regex finder.getOptions().setTextFindParameter(EnumSet.of(TextFindParameter.Regex)); // Define a regular expression pattern that matches a letter starting with 'R' and ending with 'S' String pattern = "\\bR\\w*S\\b"; // Find the text that conforms to a regular expression List<PdfTextFragment> results = finder.find(pattern); // Iterate through the find results for (PdfTextFragment textFragment: results) { // Highlight text textFragment.highLight(Color.LIGHT_GRAY); } } // Save to a different PDF file doc.saveToFile("output/HighlightTextUsingRegex.pdf", FileFormat.PDF); // Dispose resources doc.dispose(); } }
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.