Comments in Excel files often carry context that the cell values alone do not: text notes, instructions, and sometimes embedded images. Extracting this content makes it available for data analysis and reporting tasks. This article demonstrates how to extract text and images from comments in Excel files in Python using Spire.XLS for Python.
Install Spire.XLS for Python
This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.XLS
If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows
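After running the pip command, it can help to confirm that both packages are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution names are assumed to match the ones mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names are assumed to match the names used in this article
for name in ("Spire.XLS", "plum-dispatch"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If either package is reported as not installed, rerun the pip command in the same environment that runs your script.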
Extract Text from Comments in Excel in Python
You can get the text of comments using the ExcelCommentObject.Text property. The detailed steps are as follows.
- Create an object of the Workbook class.
- Load an Excel file using the Workbook.LoadFromFile() method.
- Get the first worksheet using the Workbook.Worksheets[index] property.
- Create a list to store the extracted comment text.
- Get the comments in the worksheet using the Worksheet.Comments property.
- Iterate through the comments.
- Get the text of each comment using the ExcelCommentObject.Text property and append it to the list.
- Save the content of the list to a text file.
from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile("Comments.xlsx")

# Get the first worksheet
worksheet = workbook.Worksheets[0]

# Create a list to store the comment text
comment_text = []

# Get all the comments in the worksheet
comments = worksheet.Comments

# Extract the text from each comment and add it to the list
for i, comment in enumerate(comments, start=1):
    comment_text.append(f"Comment {i}:")
    text = comment.Text
    comment_text.append(text)
    comment_text.append("")

# Write the comment text to a file
with open("comments.txt", "w", encoding="utf-8") as file:
    file.write("\n".join(comment_text))
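The formatting loop in the example above can be factored into a small helper, which makes the output layout easy to check without opening a workbook. This is only a sketch: `format_comments` is a hypothetical name, and the stub objects stand in for Spire.XLS comment objects by exposing nothing but a `Text` attribute.

```python
from types import SimpleNamespace


def format_comments(comments):
    """Build the same numbered text layout as the example above."""
    lines = []
    for i, comment in enumerate(comments, start=1):
        lines.append(f"Comment {i}:")
        lines.append(comment.Text)
        lines.append("")  # blank line between comments
    return "\n".join(lines)


# Stub objects (an assumption for illustration) mimic the Text property
stub_comments = [
    SimpleNamespace(Text="Check this value"),
    SimpleNamespace(Text="Updated in Q2"),
]
report = format_comments(stub_comments)
print(report)
```

With a real workbook, the same helper could presumably be called as `format_comments(worksheet.Comments)`, since the example above iterates over that collection in the same way.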
Extract Images from Comments in Excel in Python
To get the images embedded in Excel comments, you can use the ExcelCommentObject.Fill.Picture property. The detailed steps are as follows.
- Create an object of the Workbook class.
- Load an Excel file using the Workbook.LoadFromFile() method.
- Get the first worksheet using the Workbook.Worksheets[index] property.
- Get a specific comment in the worksheet using the Worksheet.Comments[index] property.
- Get the embedded image in the comment using the ExcelCommentObject.Fill.Picture property.
- Save the image to an image file.
import os

from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile("ImageComment.xlsx")

# Get the first worksheet
worksheet = workbook.Worksheets[0]

# Get a specific comment in the worksheet
comment = worksheet.Comments[0]

# Make sure the output folder exists before saving
os.makedirs("CommentImage", exist_ok=True)

# Extract the image from the comment and save it to an image file
image = comment.Fill.Picture
image.Save("CommentImage/Comment.png")
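When images are extracted from several comments rather than one, each image needs its own file name. The sketch below uses only the standard library to prepare numbered output paths; `comment_image_path` and the `Comment_<n>.png` naming scheme are hypothetical choices for illustration, not part of Spire.XLS.

```python
import tempfile
from pathlib import Path


def comment_image_path(output_dir, index, extension="png"):
    """Create output_dir if needed and return a numbered image path."""
    folder = Path(output_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"Comment_{index}.{extension}"


# Demonstration in a temporary folder so nothing is left on disk
with tempfile.TemporaryDirectory() as tmp:
    paths = [comment_image_path(Path(tmp) / "CommentImage", i) for i in range(1, 4)]
    names = [p.name for p in paths]
    print(names)

# With a real workbook, each path could be passed to the Save call, e.g.
# comment.Fill.Picture.Save(str(path)). How Picture behaves for a comment
# without an image is not covered in this article, so check for that case.
```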
Apply for a Temporary License
To remove the evaluation message from the generated documents or to lift the functional limitations, request a 30-day trial license.