Python: Read or Remove Document Properties in Excel

Document properties provide additional information about an Excel file, such as author, title, subject, and other metadata associated with the file. Retrieving these properties from Excel can help users gain insight into the file content and history, enabling better organization and management of files. At times, users may also need to remove document properties to protect the privacy and confidentiality of the information contained in the file. In this article, you will learn how to read or remove document properties in Excel in Python using Spire.XLS for Python.

Install Spire.XLS for Python

This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.XLS

If you are unsure how to install, please refer to this tutorial: How to Install Spire.XLS for Python on Windows

Read Standard and Custom Document Properties in Excel in Python

Excel properties are divided into two main categories:

  • Standard Properties: These are predefined properties that are built into Excel files. They typically include basic details about the file such as title, subject, author, keywords, etc.
  • Custom Properties: These are user-defined attributes that can be added to Excel to track additional information about the file based on your specific needs.

Spire.XLS for Python allows to read both the standard and custom document properties of an Excel file. The following are the detailed steps:

  • Create a Workbook instance.
  • Load an Excel file using Workbook.LoadFromFile() method.
  • Create a StringBuilder instance.
  • Get a collection of all standard document properties using Workbook.DocumentProperties property.
  • Get specific standard document properties using the properties of the BuiltInDocumentProperties class and append them to the StringBuilder instance.
  • Get a collection of all custom document properties using Workbook.CustomDocumentProperties property.
  • Iterate through the collection.
  • Get the name, type, and value of each custom document property using ICustomDocumentProperties[].Name, ICustomDocumentProperties[].PropertyType and ICustomDocumentProperties[].Value properties.
  • Determine the specific property type, and then convert the property value to the value of the corresponding data type.
  • Append the property name and converted property value to the StringBuilder instance using StringBuilde.append() method.
  • Write the content of the StringBuilder instance into a txt file.
  • Python
from spire.xls import *
from spire.xls.common import *

def AppendAllText(fname: str, text: List[str]):
    fp = open(fname, "w")
    for s in text:
        fp.write(s + "\n")
    fp.close()

inputFile = "Budget Template.xlsx"
outputFile = "GetExcelProperties.txt"

# Create a Workbook instance
workbook = Workbook()

# Load an Excel document from disk
workbook.LoadFromFile(inputFile)

# Create a StringBuilder instance
builder = []

# Get a collection of all standard document properties
standardProperties = workbook.DocumentProperties

# Get specific standard properties and append them to the StringBuilder instance
builder.append("Standard Document Properties:")
builder.append("Title: " + standardProperties.Title)
builder.append("Subject: " + standardProperties.Subject)
builder.append("Category: " + standardProperties.Category)
builder.append("Keywords: " + standardProperties.Keywords)
builder.append("Comments: " + standardProperties.Comments)
builder.append("")

# Get a collection of all custom document properties
customProperties = workbook.CustomDocumentProperties

builder.append("Custom Properties:")
# Iterate through the collection
for i in range(len(customProperties)):
    
    # Get the name, type, and value of each custom document property
    name = customProperties[i].Name
    type = customProperties[i].PropertyType
    obj = customProperties[i].Value

    # Determine the specific property type, and then convert the property value to the value of the corresponding data type
    value = None
    if type == PropertyType.Double:
        value = Double(obj).Value
    elif type == PropertyType.DateTime:
        value = DateTime(obj).ToShortDateString()
    elif type == PropertyType.Bool:
        value = Boolean(obj).Value
    elif type == PropertyType.Int:
        value = Int32(obj).Value
    elif type == PropertyType.Int32:
        value = Int32(obj).Value
    else:
        value = String(obj).Value

    # Append the property name and converted property value to the StringBuilder instance   
    builder.append(name + ": " + str(value))

# Write the content of the StringBuilder instance into a text file
AppendAllText(outputFile, builder)
workbook.Dispose()

Python: Read or Remove Document Properties in Excel

Remove Standard and Custom Document Properties in Excel in Python

You can easily delete standard document properties from an Excel file by setting their values as empty. For custom document properties, you can use the ICustomDocumentProperties.Remove() method to delete them. The following are the detailed steps:

  • Create a Workbook instance.
  • Load a sample Excel file using Workbook.LoadFromFile() method.
  • Get a collection of all standard document properties using Workbook.DocumentProperties property.
  • Set the values of specific standard document properties as empty through the corresponding properties of the BuiltInDocumentProperties class.
  • Get a collection of all custom document properties using Workbook.CustomDocumentProperties property.
  • Iterate through the collection.
  • Delete each custom property from the collection by its name using ICustomDocumentProperties.Remove() method.
  • Save the result file using Workbook.SaveToFile() method.
  • Python
from spire.xls import *
from spire.xls.common import *

inputFile = "Budget Template.xlsx"
outputFile = "RemoveExcelProperties.xlsx"

# Create a Workbook instance
workbook = Workbook()

# Load an Excel document from disk
workbook.LoadFromFile(inputFile)

# Get a collection of all standard document properties
standardProperties = workbook.DocumentProperties

# Set the value of each standard document property as empty
standardProperties.Title = ""
standardProperties.Subject = ""
standardProperties.Category = ""
standardProperties.Keywords = ""
standardProperties.Comments = ""

# Get a collection of all custom document properties
customProperties = workbook.CustomDocumentProperties

# Iterate through the collection
for i in range(len(customProperties) - 1, -1, -1):
    # Delete each custom document property from the collection by its name
    customProperties.Remove(customProperties[i].Name)

# Save the result file
workbook.SaveToFile(outputFile, ExcelVersion.Version2016)
workbook.Dispose()

Python: Read or Remove Document Properties in Excel

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.