Java: Get Text Between Two Comment Marks in Word

Comments are used to provide additional information or draw attention to something in a document. Sometimes the commented text itself is also useful, and you may want to extract it for other purposes. In this article, you will learn how to extract the text between two comment marks in a Word document using Spire.Doc for Java.

Install Spire.Doc for Java

First, add the Spire.Doc.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can import the JAR file into your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>12.11.0</version>
    </dependency>
</dependencies>
    

Get Text Between Two Comment Marks in Word

To get the text between a comment's start mark and end mark (both are CommentMark objects, returned by Comment.getCommentMarkStart() and Comment.getCommentMarkEnd()), you'll need the indexes of the two marks among the child objects of their owner paragraph. The text ranges located between these two indexes make up the commented text. The following are the detailed steps to get the text between two comment marks in a Word document.

  • Create a Document object, and load a sample Word document using the Document.loadFromFile() method.
  • Get the first comment using the Document.getComments().get() method.
  • Get the start mark and end mark of the comment.
  • Get the start mark's index and the end mark's index in the owner paragraph.
  • Get each text range between the two indexes, and then get its text using the TextRange.getText() method.
import com.spire.doc.Document;
import com.spire.doc.documents.CommentMark;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.fields.Comment;
import com.spire.doc.fields.TextRange;

public class GetTextInsideCommentMarkers {

    public static void main(String[] args) {

        //Create a Document object
        Document doc = new Document();

        //Load the sample Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx");

        //Get the first comment
        Comment comment = doc.getComments().get(0);

        //Get the start mark and end mark of the comment
        Paragraph para = comment.getOwnerParagraph();
        CommentMark start = comment.getCommentMarkStart();
        CommentMark end = comment.getCommentMarkEnd();

        //Get the start mark’s index and the end mark’s index respectively
        int indexOfStart = para.getChildObjects().indexOf(start);
        int indexOfEnd = para.getChildObjects().indexOf(end);

        //Collect the marked text in a StringBuilder
        StringBuilder textMarked = new StringBuilder();

        //Loop through the child objects located between the two marks
        for (int i = indexOfStart + 1; i < indexOfEnd; i++) {
            if (para.getChildObjects().get(i) instanceof TextRange) {

                //Get the text range at the current index
                TextRange range = (TextRange) para.getChildObjects().get(i);

                //Append the text of the text range
                textMarked.append(range.getText());
            }
        }

        //Print out the text being marked
        System.out.println(textMarked);
    }
}
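
If you want to see the index logic on its own, the following is a minimal sketch that runs without Spire.Doc. It models a paragraph's child objects as a plain java.util.List; the classes Node, Mark and Text and the method textBetween are hypothetical stand-ins invented for this illustration, not part of the Spire.Doc API. It also adds a guard for the case where a mark is not found in the list (List.indexOf returns -1), which may be worth handling in your own code if a comment's end mark can sit in a different paragraph than its start mark.

```java
import java.util.Arrays;
import java.util.List;

public class CommentSpanSketch {

    // Hypothetical stand-ins for the child objects of a paragraph (NOT Spire.Doc types)
    static class Node {}                 // any child object, e.g. a picture
    static class Mark extends Node {}    // a comment start or end mark
    static class Text extends Node {     // a run of text
        final String value;
        Text(String value) { this.value = value; }
    }

    // Returns the text of all Text children located strictly between the two marks
    static String textBetween(List<Node> children, Node start, Node end) {
        int indexOfStart = children.indexOf(start);
        int indexOfEnd = children.indexOf(end);

        // indexOf returns -1 when a mark is not in this list
        if (indexOfStart < 0 || indexOfEnd < 0) {
            return "";
        }

        StringBuilder textMarked = new StringBuilder();
        for (int i = indexOfStart + 1; i < indexOfEnd; i++) {
            Node child = children.get(i);
            // Skip anything that is not text, as the instanceof check does in the example above
            if (child instanceof Text) {
                textMarked.append(((Text) child).value);
            }
        }
        return textMarked.toString();
    }

    public static void main(String[] args) {
        Mark start = new Mark();
        Mark end = new Mark();
        List<Node> paragraph = Arrays.asList(
                new Text("Please review "), start,
                new Text("the second "), new Node(), new Text("draft"),
                end, new Text(" today."));

        System.out.println(textBetween(paragraph, start, end)); // prints "the second draft"
    }
}
```

The loop starts at indexOfStart + 1 and stops before indexOfEnd, so neither mark is visited, and only the text children in between contribute to the result.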


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, you can request a 30-day trial license.