OLE (Object Linking and Embedding) objects in Word are files or data from other applications that can be inserted into a document. These objects can be edited and updated within Word, allowing you to integrate content from other programs, such as Excel spreadsheets, PowerPoint presentations, or multimedia files like images, audio, or video. This article shows how to insert and extract OLE objects in a Word document in Python using Spire.Doc for Python.
Install Spire.Doc for Python
This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.Doc
If you are unsure how to install it, refer to this tutorial: How to Install Spire.Doc for Python on Windows
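After installation, it can be useful to confirm which versions ended up in the active environment. The following is a minimal sketch using only the standard library (Python 3.8+). The helper name installed_version is hypothetical, and the distribution names "Spire.Doc" and "plum-dispatch" are assumed to match the names pip uses.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions based on the pip command above.
    for name in ("Spire.Doc", "plum-dispatch"):
        print(name, installed_version(name) or "not installed")
```

If plum-dispatch is reported at a version other than 1.7.4, reinstalling it at that version may be worth trying before debugging further.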
Insert OLE Objects in Word in Python
Spire.Doc for Python provides the Paragraph.AppendOleObject(pathToFile:str, olePicture:DocPicture, type:OleObjectType) method to embed OLE objects in a Word document. The detailed steps are as follows.
- Create an object of the Document class.
- Load a Word document using the Document.LoadFromFile() method.
- Get a specific section using the Document.Sections.get_Item(index) method.
- Add a paragraph to the section using the Section.AddParagraph() method.
- Create an object of the DocPicture class.
- Load an image that will be used as the icon of the OLE object using the DocPicture.LoadImage() method, and then set the image width and height.
- Append an OLE object to the paragraph using the Paragraph.AppendOleObject(pathToFile:str, olePicture:DocPicture, type:OleObjectType) method.
- Save the result file using the Document.SaveToFile() method.
The following code example shows how to embed an Excel spreadsheet, a PDF file, and a PowerPoint presentation in a Word document using Spire.Doc for Python:
    from spire.doc import *
    from spire.doc.common import *

    # Create an object of the Document class
    doc = Document()
    # Load a Word document
    doc.LoadFromFile("Example.docx")

    # Get the first section
    section = doc.Sections.get_Item(0)

    # Add a paragraph to the section
    para1 = section.AddParagraph()
    para1.AppendText("Excel File: ")
    # Load an image which will be used as the icon of the OLE object
    picture1 = DocPicture(doc)
    picture1.LoadImage("Excel-Icon.png")
    picture1.Width = 50
    picture1.Height = 50
    # Append an OLE object (an Excel spreadsheet) to the paragraph
    para1.AppendOleObject("Budget.xlsx", picture1, OleObjectType.ExcelWorksheet)

    # Add a paragraph to the section
    para2 = section.AddParagraph()
    para2.AppendText("PDF File: ")
    # Load an image which will be used as the icon of the OLE object
    picture2 = DocPicture(doc)
    picture2.LoadImage("PDF-Icon.png")
    picture2.Width = 50
    picture2.Height = 50
    # Append an OLE object (a PDF file) to the paragraph
    para2.AppendOleObject("Report.pdf", picture2, OleObjectType.AdobeAcrobatDocument)

    # Add a paragraph to the section
    para3 = section.AddParagraph()
    para3.AppendText("PPT File: ")
    # Load an image which will be used as the icon of the OLE object
    picture3 = DocPicture(doc)
    picture3.LoadImage("PPT-Icon.png")
    picture3.Width = 50
    picture3.Height = 50
    # Append an OLE object (a PowerPoint presentation) to the paragraph
    para3.AppendOleObject("Plan.pptx", picture3, OleObjectType.PowerPointPresentation)

    doc.SaveToFile("InsertOLE.docx", FileFormat.Docx2013)
    doc.Close()
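The example passes a different OleObjectType member for each file. When many files are embedded, that choice could be driven by the file extension instead. The sketch below is one possible approach and is not part of Spire.Doc: the helper ole_type_name_for and the table OLE_TYPE_NAMES are hypothetical, and the table covers only the three member names used in this article.

```python
from pathlib import Path

# Extension -> name of the OleObjectType member used in this article.
# Only these three pairs come from the example; extend with care.
OLE_TYPE_NAMES = {
    ".xlsx": "ExcelWorksheet",
    ".pdf": "AdobeAcrobatDocument",
    ".pptx": "PowerPointPresentation",
}


def ole_type_name_for(path):
    """Return the OleObjectType member name for a file path (hypothetical helper)."""
    ext = Path(path).suffix.lower()
    try:
        return OLE_TYPE_NAMES[ext]
    except KeyError:
        raise ValueError(f"No OLE type mapped for extension {ext!r}") from None
```

With Spire.Doc imported, the enum member could then be looked up as getattr(OleObjectType, ole_type_name_for("Budget.xlsx")) and passed to Paragraph.AppendOleObject() as in the example. Returning a name rather than the enum member keeps the helper usable without the library installed.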
Extract OLE Objects from Word in Python
To extract OLE objects from a Word document, you first need to locate the OLE objects within the document. Once located, you can determine the file format of each OLE object. Finally, you can save the data of each OLE object to a file in its native file format. The detailed steps are as follows.
- Create an instance of the Document class.
- Load a Word document using the Document.LoadFromFile() method.
- Iterate through all sections of the document.
- Iterate through all child objects in the body of each section.
- Identify the paragraphs within each section.
- Iterate through the child objects in each paragraph.
- Locate the OLE object within the paragraph.
- Determine the file format of the OLE object.
- Save the data of the OLE object to a file in its native file format.
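The format-detection step can be sketched as a small lookup on the OLE object's type string. The three prefixes and extensions are the ones tested in this article's extraction example. The function name extension_for_progid and the table PROGID_EXTENSIONS are hypothetical, and the sketch runs without Spire.Doc.

```python
# Type-string prefix -> output extension, mirroring the three cases in this article.
PROGID_EXTENSIONS = (
    ("AcroExch.Document", ".pdf"),
    ("Excel.Sheet", ".xlsx"),
    ("PowerPoint.Show", ".pptx"),
)


def extension_for_progid(prog_id):
    """Return the file extension for an OLE type string, or None if it is unmapped."""
    for prefix, ext in PROGID_EXTENSIONS:
        if prog_id.startswith(prefix):
            return ext
    return None
```

Prefix matching is used because the type string usually carries a version suffix, for example "Excel.Sheet.12". Note that mapping every "Excel.Sheet" prefix to .xlsx follows this article and may not suit objects created by older Office versions.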
The following code example shows how to extract the embedded Excel spreadsheet, PDF file, and PowerPoint presentation from a Word document using Spire.Doc for Python:
    import os

    from spire.doc import *
    from spire.doc.common import *

    # Create an object of the Document class
    doc = Document()
    # Load a Word document
    doc.LoadFromFile("InsertOLE.docx")

    # Make sure the output folder exists before writing files into it
    os.makedirs("Output", exist_ok=True)

    i = 1
    # Iterate through all sections of the Word document
    for k in range(doc.Sections.Count):
        sec = doc.Sections.get_Item(k)
        # Iterate through all child objects in the body of each section
        for j in range(sec.Body.ChildObjects.Count):
            obj = sec.Body.ChildObjects.get_Item(j)
            # Check if the child object is a paragraph
            if isinstance(obj, Paragraph):
                par = obj
                # Iterate through the child objects in the paragraph
                for m in range(par.ChildObjects.Count):
                    o = par.ChildObjects.get_Item(m)
                    # Check if the child object is an OLE object
                    if o.DocumentObjectType == DocumentObjectType.OleObject:
                        ole = o if isinstance(o, DocOleObject) else None
                        s = ole.ObjectType
                        # Check if the OLE object is a PDF file
                        if s.startswith("AcroExch.Document"):
                            ext = ".pdf"
                        # Check if the OLE object is an Excel spreadsheet
                        elif s.startswith("Excel.Sheet"):
                            ext = ".xlsx"
                        # Check if the OLE object is a PowerPoint presentation
                        elif s.startswith("PowerPoint.Show"):
                            ext = ".pptx"
                        else:
                            continue
                        # Write the data of the OLE object to a file in its native format
                        with open(f"Output/OLE{i}{ext}", "wb") as file:
                            file.write(ole.NativeData)
                        i += 1

    doc.Close()
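Because the extension is chosen from the type string alone, a quick check of the extracted bytes can catch a mismatch before the file is opened. The sketch below is an optional addition, not part of Spire.Doc. It compares the leading bytes against well-known file signatures. The function name sniff_container is hypothetical, and it is an assumption that the extracted data starts directly with the file's own signature.

```python
PDF_MAGIC = b"%PDF-"
ZIP_MAGIC = b"PK\x03\x04"  # .xlsx and .pptx files are ZIP containers
CFB_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls/.ppt compound files


def sniff_container(data):
    """Classify raw bytes by their leading signature (hypothetical helper)."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(CFB_MAGIC):
        return "cfb"
    return "unknown"
```

For example, a result of "cfb" for an object saved as .xlsx would suggest a legacy binary workbook, for which .xls may be the better extension.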
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to lift the feature limitations, you can request a 30-day trial license.