How to get text from a Word document in C#

2016-11-08 08:05:46 Written by  support iceblue

When working with Word documents that contain a large amount of information, we sometimes need only the text for use elsewhere. We have already demonstrated how to extract text with Spire.Doc by traversing every paragraph in the document and appending its text. This article shows how to use the doc.GetText() method to extract the text directly from a Word document that contains text, images and tables, which is a more convenient way for developers to get the text from code.
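For comparison, the paragraph-traversal approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not the earlier article's exact code: the class name ParagraphTraversal and the file names are assumptions made for illustration. A loop of this shape appears to visit only the paragraphs that sit directly in each section, so text inside tables may need separate handling, which is part of what makes doc.GetText() more convenient.

```csharp
using System.IO;
using System.Text;
using Spire.Doc;
using Spire.Doc.Documents;

namespace GetText
{
    // Hypothetical class name, used only for this illustration.
    class ParagraphTraversal
    {
        static void Main()
        {
            // Load the source document (file name assumed, as in the steps of this article).
            Document doc = new Document();
            doc.LoadFromFile("Sample.docx");

            // Walk every section and append the text of each paragraph in it.
            StringBuilder sb = new StringBuilder();
            foreach (Section section in doc.Sections)
            {
                foreach (Paragraph paragraph in section.Paragraphs)
                {
                    sb.AppendLine(paragraph.Text);
                }
            }

            // Save the collected text to a plain text file.
            File.WriteAllText("Extract.txt", sb.ToString());
        }
    }
}
```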

First, take a look at the sample Word document from which the text will be extracted:

[Screenshot: sample Word document]

Step 1: Create a Document instance and load the source Word document from file.

Document doc = new Document();
doc.LoadFromFile("Sample.docx");

Step 2: Invoke the doc.GetText() method to get all the text from the Word document.

string s = doc.GetText();
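If the extracted string is going to be processed further rather than saved as it is, it may help to split it into lines first. The following is a minimal sketch that uses only the .NET standard library. The ExtractedText class and the NonEmptyLines method are hypothetical names introduced for illustration, and the assumption that the returned text separates paragraphs with line breaks should be checked against your own documents.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Spire.Doc.
static class ExtractedText
{
    // Split the text on line breaks, trim each line,
    // and keep only the lines that still contain characters.
    public static string[] NonEmptyLines(string text)
    {
        return text
            .Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.None)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .ToArray();
    }
}
```

With the string s from Step 2, the helper could be called as ExtractedText.NonEmptyLines(s).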

Step 3: Create a new text file to save the extracted text.

File.WriteAllText("Extract.txt", s);

Screenshot of the result after extracting all the text from the Word document:

[Screenshot: extracted text]

Full code:

using Spire.Doc;
using System.IO;

namespace GetText
{
    class WordText
    {
        public void GetText()
        {
            // Load the source Word document.
            Document doc = new Document();
            doc.LoadFromFile("Sample.docx");

            // Extract all the text from the document.
            string s = doc.GetText();

            // Save the extracted text to a new text file.
            File.WriteAllText("Extract.txt", s);
        }
    }
}
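The class above has no entry point, so it cannot be run on its own. One way to wrap the same three steps in a console program is sketched below. The Program class, the optional command-line arguments and the missing-file check are assumptions added for illustration; only Document, LoadFromFile and GetText come from the steps in this article.

```csharp
using System;
using System.IO;
using Spire.Doc;

namespace GetText
{
    // Hypothetical console wrapper around the three steps above.
    class Program
    {
        static int Main(string[] args)
        {
            // Optional arguments; the defaults are the file names used in this article.
            string input = args.Length > 0 ? args[0] : "Sample.docx";
            string output = args.Length > 1 ? args[1] : "Extract.txt";

            // Report a missing input file instead of letting the load call fail.
            if (!File.Exists(input))
            {
                Console.Error.WriteLine("Input file not found: " + input);
                return 1;
            }

            // Load the document, extract its text and save it.
            Document doc = new Document();
            doc.LoadFromFile(input);
            File.WriteAllText(output, doc.GetText());

            Console.WriteLine("Text written to " + output);
            return 0;
        }
    }
}
```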

Last modified on Thursday, 11 April 2024 01:15