C#: Convert Word to Markdown

2024-10-31 06:06:12 Written by  support iceblue
Rate this item
(0 votes)

Markdown, with its lightweight syntax, offers a streamlined approach to web content creation, collaboration, and document sharing, particularly in environments where tools like Git or Markdown-friendly editors are prevalent. By converting Word documents to Markdown files, users can enhance their productivity, facilitate easier version control, and ensure compatibility across different systems and platforms. In this article, we will explore the process of converting Word documents to Markdown files using Spire.Doc for .NET, providing simple C# code examples.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for.NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Convert Word to Markdown with C#

Using Spire.Doc for .NET, we can convert a Word document to a Markdown file by loading the document using Document.LoadFromFile() method and then convert it to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method. The detailed steps are as follows:

  • Create an instance of Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
  • Release resources.
  • C#
using Spire.Doc;

namespace WordToMarkdown
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdown.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Convert Word to Markdown Without Images

When using Spire.Doc for .NET to convert Word documents to Markdown files, images are stored in Base64 encoding by default, which can increase the file size and affect compatibility. To address this, we can remove the images during conversion, thereby reducing the file size and enhancing compatibility.

The following steps outline how to convert Word documents to Markdown files without images:

  • Create an instance of Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Iterate through the sections and then the paragraphs in the document.
  • Iterate through the document objects in the paragraphs:
    • Get a document object through Paragraph.ChildObjects[] property.
    • Check if it’s an instance of DocPicture class. If it is, remove it using Paragraph.ChildObjects.Remove(DocumentObject) method.
  • Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
  • Release resources.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

namespace WordToMarkdownNoImage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Iterate through the sections in the document
            foreach (Section section in doc.Sections)
            {
                // Iterate through the paragraphs in the sections
                foreach (Paragraph paragraph in section.Paragraphs)
                {
                    // Iterate through the document objects in the paragraphs
                    for (int i = 0; i < paragraph.ChildObjects.Count; i++)
                    {
                        // Get a document object
                        DocumentObject docObj = paragraph.ChildObjects[i];
                        // Check if it is an instance of DocPicture class
                        if (docObj is DocPicture)
                        {
                            // Remove the DocPicture instance
                            paragraph.ChildObjects.Remove(docObj);
                        }
                    }
                }
            }

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdownNoImage.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.