Python: Extract or Update Textboxes in a Word Document
Textboxes in a Word document are containers for text that sit apart from the main body, which makes them useful for layout and design and helps keep a document organized. Extracting their text or updating it programmatically helps keep information current and makes the content available for analysis.
In this article, you will learn how to extract or update textboxes in a Word document using Python and Spire.Doc for Python.
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.Doc
If you are unsure how to install it, refer to this tutorial: How to Install Spire.Doc for Python on Windows
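After installation, a quick check can confirm that both packages are visible to the interpreter you plan to use. The sketch below relies only on the standard library. `installed_version` is a hypothetical helper name, and the distribution names are assumed from the pip command and the requirement stated above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names assumed from the requirement above
for name in ("Spire.Doc", "plum-dispatch"):
    print(name, "->", installed_version(name) or "not installed")
```

If either line reports "not installed", rerun the pip command in the same environment that runs your script.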
Extract Text from a Textbox in Word
Using Spire.Doc for Python, you can access a specific textbox in a document through the Document.TextBoxes[index] property. You can then iterate through its child objects and check whether each one is a paragraph or a table. For a paragraph, read its text from the Paragraph.Text property. For a table, loop through its rows and cells and read the text of each paragraph inside every cell.
The steps to extract text from a text box in a Word document are as follows:
- Create a Document object.
- Load a Word file using Document.LoadFromFile() method.
- Access a specific text box using Document.TextBoxes[index] property.
- Iterate through the child objects within the text box.
- Determine if a child object is a paragraph. If it is, retrieve the text from the paragraph using Paragraph.Text property.
- Check if a child object is a table. If so, iterate through the cells in the table to extract text from each cell.
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Get a specific textbox
textBox = document.TextBoxes[0]

with open('ExtractedText.txt', 'w', encoding='utf-8') as sw:

    # Iterate through the child objects in the textbox
    for i in range(textBox.ChildObjects.Count):

        # Get a specific child object (named "child" to avoid shadowing the built-in "object")
        child = textBox.ChildObjects.get_Item(i)

        # Determine if the child object is a paragraph
        if child.DocumentObjectType == DocumentObjectType.Paragraph:

            # Write the paragraph text to the txt file
            sw.write(child.Text + "\n")

        # Determine if the child object is a table
        elif child.DocumentObjectType == DocumentObjectType.Table:
            table = child

            # Iterate through the rows and cells of the table
            for r in range(table.Rows.Count):
                row = table.Rows[r]
                for c in range(row.Cells.Count):
                    cell = row.Cells[c]
                    for k in range(cell.Paragraphs.Count):
                        paragraph = cell.Paragraphs.get_Item(k)

                        # Write the paragraph text of a specific cell to the txt file
                        sw.write(paragraph.Text + "\n")

# Dispose resources
document.Dispose()
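The same traversal can be wrapped in a function that returns the text as a list instead of writing to a file, which is easier to reuse and to test. This is a minimal sketch: `collect_textbox_text` is a hypothetical helper, and it assumes that child objects expose the same members used in the example above (`DocumentObjectType`, `Text`, `Rows`, `Cells`, `Paragraphs`). The two type values are passed in as arguments so the function itself does not need to import the library.

```python
def collect_textbox_text(textbox, paragraph_type, table_type):
    """Return the text of every paragraph in a textbox, including table cells.

    paragraph_type and table_type are the values to compare against,
    for example DocumentObjectType.Paragraph and DocumentObjectType.Table.
    """
    lines = []
    children = textbox.ChildObjects
    for i in range(children.Count):
        child = children.get_Item(i)
        if child.DocumentObjectType == paragraph_type:
            lines.append(child.Text)
        elif child.DocumentObjectType == table_type:
            # Walk rows, then cells, then the paragraphs inside each cell
            for r in range(child.Rows.Count):
                row = child.Rows[r]
                for c in range(row.Cells.Count):
                    cell = row.Cells[c]
                    for p in range(cell.Paragraphs.Count):
                        lines.append(cell.Paragraphs.get_Item(p).Text)
    return lines
```

With Spire.Doc loaded, a call might look like `collect_textbox_text(document.TextBoxes[0], DocumentObjectType.Paragraph, DocumentObjectType.Table)`, and the result can be joined with newlines before it is written out.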
Update Text in a Textbox in Word
To update a textbox in a Word document, start by clearing its existing content with the TextBox.ChildObjects.Clear() method. This action removes all child objects, including any paragraphs or tables currently contained within the textbox. After clearing the content, you can add a new paragraph to the text box. Once the paragraph is created, set its text to the desired value.
The steps to update a textbox in a Word document are as follows:
- Create a Document object.
- Load a Word file using Document.LoadFromFile() method.
- Get a specific textbox using Document.TextBoxes[index] property.
- Remove existing content of the textbox using TextBox.ChildObjects.Clear() method.
- Add a paragraph to the textbox using TextBox.Body.AddParagraph() method.
- Add text to the paragraph using Paragraph.AppendText() method.
- Save the document to a different Word file.
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx")

# Get a specific textbox
textBox = document.TextBoxes[0]

# Remove child objects of the textbox
textBox.ChildObjects.Clear()

# Add a new paragraph to the textbox
paragraph = textBox.Body.AddParagraph()

# Set line spacing
paragraph.Format.LineSpacing = 15.0

# Add text to the paragraph
textRange = paragraph.AppendText("The text in this textbox has been updated.")

# Set font size
textRange.CharacterFormat.FontSize = 15.0

# Save the document to a different Word file
document.SaveToFile("UpdateTextbox.docx", FileFormat.Docx2019)

# Dispose resources
document.Dispose()
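When several textboxes need new content, the clear-then-add steps can be applied from a mapping of textbox index to replacement text. The following is a sketch under stated assumptions: `update_textboxes` is a hypothetical helper, and it assumes the `TextBoxes` collection exposes a `Count` property like the other collections used in this article. Indexes that do not exist are skipped rather than raising an error.

```python
def update_textboxes(document, new_texts):
    """Replace the content of selected textboxes.

    new_texts maps a textbox index to its replacement text.
    Returns the list of indexes that were actually updated.
    """
    textboxes = document.TextBoxes
    updated = []
    for index, text in sorted(new_texts.items()):
        # Skip indexes that do not exist (assumes the collection exposes Count)
        if not 0 <= index < textboxes.Count:
            continue
        textbox = textboxes[index]
        # Same steps as above: clear the old content, then add one new paragraph
        textbox.ChildObjects.Clear()
        textbox.Body.AddParagraph().AppendText(text)
        updated.append(index)
    return updated
```

A call such as `update_textboxes(document, {0: "First box", 2: "Third box"})` would then be followed by `document.SaveToFile(...)` as in the example above.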
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Python: Add or Remove Textboxes in a Word Document
Textboxes are versatile tools in Microsoft Word, allowing you to insert and position text or other elements anywhere on a page, giving you the power to create eye-catching flyers, brochures, or reports. Whether you're looking to emphasize a particular section of text, place captions near images, or simply add a decorative touch, the capacity to manipulate textboxes offers a practical and aesthetic advantage in document design. In this article, you will learn how to add or remove textboxes in a Word document in Python using Spire.Doc for Python.
Add a Textbox to a Word Document in Python
Spire.Doc for Python provides the Paragraph.AppendTextBox() method to insert a textbox in a specified paragraph. The content and formatting of the textbox can be set through the properties under the TextBox object. The following are the detailed steps.
- Create a Document object.
- Load a Word document using Document.LoadFromFile() method.
- Get the first paragraph of the first section using Document.Sections[0].Paragraphs[0].
- Add a text box to the paragraph using Paragraph.AppendTextBox() method.
- Get the format of the textbox using TextBox.Format property, and then set the textbox's wrapping type, position, border color and fill color using the properties of the TextBoxFormat class.
- Add a paragraph to the textbox using TextBox.Body.AddParagraph() method.
- Add an image to the paragraph using Paragraph.AppendPicture() method.
- Add text to the textbox using Paragraph.AppendText() method.
- Save the document to a different file using Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word document
document.LoadFromFile("C:/Users/Administrator/Desktop/input3.docx")

# Insert a textbox and set its wrapping style
textBox = document.Sections[0].Paragraphs[0].AppendTextBox(135, 300)
textBox.Format.TextWrappingStyle = TextWrappingStyle.Square

# Set the position of the textbox
textBox.Format.HorizontalOrigin = HorizontalOrigin.RightMarginArea
textBox.Format.HorizontalPosition = -145.0
textBox.Format.VerticalOrigin = VerticalOrigin.Page
textBox.Format.VerticalPosition = 120.0

# Set the border style and fill color of the textbox
textBox.Format.LineColor = Color.get_DarkBlue()
textBox.Format.FillColor = Color.get_LightGray()

# Insert an image to textbox as a paragraph
para = textBox.Body.AddParagraph()
picture = para.AppendPicture("C:/Users/Administrator/Desktop/Wikipedia_Logo.png")

# Set alignment for the paragraph
para.Format.HorizontalAlignment = HorizontalAlignment.Center

# Set the size of the inserted image
picture.Height = 90.0
picture.Width = 90.0

# Insert text to textbox as the second paragraph
# (a new paragraph is added here so the text does not share the image's paragraph)
para = textBox.Body.AddParagraph()
textRange = para.AppendText("Wikipedia is a free encyclopedia, written collaboratively by the people who use it. "
    + "Since 2001, it has grown rapidly to become the world's largest reference website, "
    + "with 6.7 million articles in English attracting billions of views every month.")

# Set alignment for the paragraph
para.Format.HorizontalAlignment = HorizontalAlignment.Center

# Set the font of the text
textRange.CharacterFormat.FontName = "Times New Roman"
textRange.CharacterFormat.FontSize = 12.0
textRange.CharacterFormat.Italic = True

# Save the result file
document.SaveToFile("output/AddTextBox.docx", FileFormat.Docx)
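The example passes raw numbers for the size and position of the textbox. Assuming these values are measured in points (1/72 inch), as is usual for Word layout, two small helpers can make the intent easier to read. Both `cm_to_points` and `offset_from_right_margin` are hypothetical names, and reading -145.0 as "box width plus a 10-point gap" is an interpretation of the example rather than something the source states.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm):
    """Convert centimetres to points (1 inch = 72 points = 2.54 cm)."""
    return cm * POINTS_PER_INCH / CM_PER_INCH


def offset_from_right_margin(box_width, gap):
    """Negative horizontal position that leaves `gap` points between the
    right edge of the box and the start of the right margin area
    (assumes the position is measured to the left edge of the box)."""
    return -(box_width + gap)


print(round(cm_to_points(2.54), 2))            # 72.0
print(offset_from_right_margin(135.0, 10.0))   # -145.0
```

Values computed this way could be passed to `AppendTextBox()` and `TextBox.Format.HorizontalPosition` in place of the literals.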
Remove a Textbox from a Word Document in Python
Spire.Doc for Python provides the Document.TextBoxes.RemoveAt(int index) method to delete a specified textbox. To delete all textboxes from a Word document, you can use the Document.TextBoxes.Clear() method. The following example shows how to remove the first textbox from a Word document.
- Create a Document object.
- Load a Word document using Document.LoadFromFile() method.
- Remove the first text box using Document.TextBoxes.RemoveAt(int index) method.
- Save the document to another file using Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word document
document.LoadFromFile("C:/Users/Administrator/Desktop/TextBox.docx")

# Remove the first textbox
document.TextBoxes.RemoveAt(0)

# Remove all textboxes
# document.TextBoxes.Clear()

# Save the result document
document.SaveToFile("output/RemoveTextbox.docx", FileFormat.Docx)
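Removing several textboxes by index needs some care, because each removal is likely to shift the indexes of the textboxes that follow. A common approach is to remove from the highest index down. The sketch below illustrates this. `remove_textboxes` is a hypothetical helper, and it assumes that the `TextBoxes` collection exposes `Count` and renumbers its items after `RemoveAt()`, as list-like collections usually do.

```python
def remove_textboxes(document, indexes):
    """Remove the textboxes at the given indexes and return how many were removed."""
    textboxes = document.TextBoxes
    removed = 0
    # Highest index first, so earlier removals do not renumber later targets
    for index in sorted(set(indexes), reverse=True):
        # Ignore indexes outside the collection (assumes a Count property)
        if 0 <= index < textboxes.Count:
            textboxes.RemoveAt(index)
            removed += 1
    return removed
```

For example, `remove_textboxes(document, [0, 2])` would remove the first and third textboxes of the loaded document before it is saved.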