Spire.Office 9.8.0 is released
We are excited to announce the release of Spire.Office 9.8.0. In this version, Spire.PDF supports converting HTML to PDF using the Chrome plugin; Spire.XLS supports enabling revision mode and obtaining custom properties of worksheets; Spire.Presentation supports converting PowerPoint documents to Markdown files; Spire.OCR supports configuring OCR models, languages, and dependency libraries. In addition, many known issues have been fixed in this version. More details are listed below.
In this version, the most recent versions of Spire.Doc, Spire.PDF, Spire.XLS, Spire.Presentation, Spire.Email, Spire.DocViewer, Spire.PDFViewer, Spire.Spreadsheet, Spire.OfficeViewer, Spire.DataExport, and Spire.Barcode are included.
DLL Versions:
- Spire.Doc.dll v12.8.12
- Spire.Pdf.dll v10.8.1
- Spire.XLS.dll v14.8.2
- Spire.Presentation.dll v9.8.3
- Spire.Barcode.dll v7.3.3
- Spire.Email.dll v6.6.0
- Spire.DocViewer.Forms.dll v8.8.1
- Spire.PdfViewer.Asp.dll v7.12.23
- Spire.PdfViewer.Forms.dll v7.12.23
- Spire.Spreadsheet.dll v7.5.2
- Spire.OfficeViewer.Forms.dll v8.7.15
- Spire.DataExport.dll v4.9.0
- Spire.DataExport.ResourceMgr.dll v2.1.0
Here is a list of changes made in this release
Spire.Doc
Category | ID | Description |
Bug | SPIREDOC-10644 | Fixes the issue that the image data failed to be filled during mail merge. |
Bug | SPIREDOC-10219 | Fixes the issue that extra blank pages are generated when converting Word to OFD. |
Bug | SPIREDOC-10365 | Fixes the issue that the VBA was lost after saving Docm. |
Bug | SPIREDOC-10389 | Fixes the issue that the application threw the exception "Wrong Word version" when loading Word documents encrypted using the WPS tool. |
Bug | SPIREDOC-10425 | Fixes the issue that the application threw the exception "System.ArgumentNullException" when converting Word to PDF. |
Bug | SPIREDOC-10456 | Fixes the issue that the borderless text boxes added using the WPS tool were not successfully parsed. |
Bug | SPIREDOC-10554 | Fixes the issue that the images didn't display when adding HTML that contains the image paths starting with https. |
Bug | SPIREDOC-10587 | Fixes the issue that the application threw the exception "Unknown boolean value" when converting Word to PDF. |
Bug | SPIREDOC-10738 | Fixes the issue that the customXml appeared after replacing text. |
Bug | SPIREDOC-10743 | Optimizes the file size for converting Word to OFD. |
Spire.XLS
Category | ID | Description |
New feature | SPIREXLS-5274 | Supports obtaining custom properties of worksheets.
Workbook workbook = new Workbook();
workbook.LoadFromFile("funds-test.xlsx");
ICustomPropertiesCollection customProperties = workbook.Worksheets[0].CustomProperties;
for (int i = 0; i < customProperties.Count; i++)
{
    XlsCustomProperty xcp = customProperties[i];
    string name = xcp.Name;
    string value = xcp.Value;
} |
New feature | SPIREXLS-5306 | Supports obtaining the original document name of the embedded OLE object.
ole.OleOriginName |
New feature | SPIREXLS-5254 | Supports enabling revision mode.
Workbook.TrackedChanges = true; // default value is false |
New feature | SPIREXLS-5348 | Supports setting global custom font folders.
Workbook.SetGlobalCustomFontsFolders(string[] fontPath); |
Bug | SPIREXLS-5271 | Fixes the issue that the cell data was inaccurate when converting Excel to images. |
Bug | SPIREXLS-5298 | Fixes the issue that the cell contents were lost when converting Excel to PDF. |
Bug | SPIREXLS-5304 | Fixes the issue that the row height was incorrect when converting Excel to PDF. |
Bug | SPIREXLS-5309 | Fixes the issue that the pagination was incorrect when converting Excel to PDF. |
Bug | SPIREXLS-5320 | Fixes the issue that the first 10 data entries were incorrect after adding a "FilterTop10" filter in a worksheet. |
Bug | SPIREXLS-5196 | Fixes the issue that the checkbox was not converted to image format when converting Excel to PDF. |
Bug | SPIREXLS-5305 | Fixes the issue that multiple calls to the ApplyStyleToRange method in pivot tables produced incorrect results. |
Bug | SPIREXLS-5308 | Fixes the issue that setting the color transparency of rich text did not take effect. |
Bug | SPIREXLS-5317 | Fixes the issue that the Filter formula is calculated incorrectly. |
Bug | SPIREXLS-5330 | Fixes the issue that pivot table column names are incorrect when converting Excel to images. |
Bug | SPIREXLS-5345 | Fixes the issue that the text location is incorrect when converting Excel to PDF. |
Bug | SPIREXLS-5349 | Fixes the issue that ChartSheet retrieval is incorrect. |
Bug | SPIREXLS-5352 | Fixes the issue that cell content is incorrect when converting Excel to HTML or images. |
Spire.PDF
Category | ID | Description |
New feature | SPIREPDF-5742 | Adds the PreserveAllowedMetadata property to support preserving XMP data when converting PDF to PDFA format documents.
PdfStandardsConverter converter = new PdfStandardsConverter(stream); converter.Options.PreserveAllowedMetadata = true; |
New feature | - | Supports converting HTML to PDF using the Chrome plugin.
Save the resulting PDF document to a file:
string chromeLocation = baseDirectory + "chrome\\Chrome-bin\\chrome.exe";
ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);
ConvertOptions options = new ConvertOptions();
options.Timeout = 10 * 1000;
options.PageSettings = new PageSettings()
{
    PaperWidth = 8.77,
    PaperHeight = 6.20,
    MarginBottom = 0,
    MarginTop = 0,
    MarginLeft = 0,
    MarginRight = 0
};
converter.ConvertToPdf("https://www.e-iceblue.com/", outputFile, options);
Save the resulting PDF document to a stream:
var pdfAsStream = new MemoryStream();
string chromeLocation = baseDirectory + "chrome\\Chrome-bin\\chrome.exe";
ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);
ConvertOptions options = new ConvertOptions();
options.Timeout = 10 * 1000;
options.PageSettings = new PageSettings()
{
    PaperWidth = 8.77,
    PaperHeight = 6.2,
    MarginBottom = 0,
    MarginTop = 0,
    MarginLeft = 0,
    MarginRight = 0
};
converter.ConvertToPdf("https://www.e-iceblue.com/", pdfAsStream, options); |
New feature | SPIREPDF-6820 | Optimizes the layout of Excel documents converted from PDF documents. Use the new code:
// Parameters: bool convertToMultipleSheet, bool rotatedText, bool splitCell
pdfDocument.ConvertOptions.SetPdfToXlsxOptions(new XlsxLineLayoutOptions(true, false, false)); |
Optimization | SPIREPDF-5744 | Optimizes the time consumption of PDF to image conversion. |
Bug | SPIREPDF-6789 | Fixes the issue that the program reported an error when printing PDF documents. |
Bug | SPIREPDF-6806 | Fixes the issue that images in PDF documents converted from XPS documents were not displayed in the PDF.js viewer. |
Bug | SPIREPDF-6848 | Fixes the issue that the border was lost after filling the text field with content and opening the document in Google Chrome or previewing it in Adobe when printing. |
Bug | SPIREPDF-6896 | Fixes the issue that the program threw System.NullReferenceException when converting a PDF document to a PDFA1B document. |
Bug | SPIREPDF-6898, SPIREPDF-6899 | Fixes the issue that the program threw System.NullReferenceException when extracting text from PDF documents. |
Bug | SPIREPDF-6906 | Fixes the issue that the program threw System.ArgumentNullException when converting PDF documents to images. |
Bug | SPIREPDF-6832 | Fixes the issue that verification of signature validity was inaccurate. |
Bug | SPIREPDF-6836 | Fixes the issue that removing the PdfWatermarkAnnotationWidget object failed. |
Bug | SPIREPDF-6875 | Fixes the issue that printing PDF to B5 paper size was incorrect. |
Bug | SPIREPDF-2659, SPIREPDF-4454 | Fixes the issue that importing and exporting form data (in FDF, XFDF, and XML formats) was incorrect. |
Bug | SPIREPDF-6797 | Fixes the issue that highlighting covered the text after converting PDF to images. |
Bug | SPIREPDF-6896 | Fixes the issue that the program threw an exception System.NullReferenceException when converting an OFD document to a PDF document. |
Bug | SPIREPDF-6908 | Fixes the issue that the contents were blank after converting a PDF document to images. |
Bug | SPIREPDF-6910 | Fixes the issue that it failed to get the action script of annotations. |
Bug | SPIREPDF-6922 | Fixes the issue that the program threw an exception System.ArgumentException when importing FDF file data to PDF. |
Bug | SPIREPDF-6925 | Fixes the issue that spaces were lost when copying content to Notepad after adding content to a PDF document using the PdfTaggedContent interface. |
Bug | SPIREPDF-6929, SPIREPDF-6940 | Fixes the issue that the program threw an exception System.OutOfMemoryException when converting a PDF document to images. |
Bug | SPIREPDF-6941 | Fixes the issue that signatures were reversed by mirroring after flattening form fields. |
Bug | SPIREPDF-6949 | Fixes the issue that the program threw an exception System.NullReferenceException when converting a PDF document to an Excel document. |
Bug | SPIREPDF-6968 | Fixes the issue that the program threw an exception System.NullReferenceException when loading a PDF document. |
Spire.Presentation
Category | ID | Description |
New feature | - | Supports converting PowerPoint documents to Markdown files.
Presentation ppt = new Presentation(); ppt.LoadFromFile("1.pptx"); ppt.SaveToFile("1.md", FileFormat.Markdown); ppt.Dispose(); |
Bug | SPIREPPT-2522 | Fixes the issue that the waterfall chart is displayed incorrectly after modifying its data. |
Bug | SPIREPPT-2534 | Fixes the issue that the program threw System.ArgumentException when setting document property "_MarkAsFinal". |
Bug | SPIREPPT-2535 | Fixes the issue that the tilt angle of text was lost after converting slides to pictures. |
Bug | SPIREPPT-2547 | Fixes the issue that the application threw the exception "System.ArgumentOutOfRangeException" when copying slides to a new PowerPoint presentation. |
Bug | SPIREPPT-2549 | Fixes the issue that the message "PowerPoint found a problem with content" appeared when opening the file generated by copying slides to a new PowerPoint presentation. |
Spire.OCR
Category | ID | Description |
New feature | - | Adds the ConfigureOptions class and new method ConfigureDependencies(ConfigureOptions configureOptions), which supports configuring OCR models, languages, and dependency libraries.
OcrScanner scanner = new OcrScanner();
// Create a new instance of the ConfigureOptions class to set up the scanner configuration
ConfigureOptions configureOptions = new ConfigureOptions();
// Set the path to the model
configureOptions.ModelPath = @"D:\Model&Lib\Model\V4.Server";
// Set the path to the dependency libraries
configureOptions.LibPath = @"D:\Model&Lib\Lib\V2.7\Net\X64";
// Specify the language
configureOptions.Language = "Chinese";
// Apply the configuration options to the scanner using the ConfigureDependencies method
scanner.ConfigureDependencies(configureOptions);
scanner.Scan("test.tif"); |
Spire.OfficeViewer
Category | ID | Description |
Bug | SPIREPPT-2502 | Fixes the issue that the warning watermark of presentations cannot be removed after applying a valid license. |
Spire.Presentation for Java 9.8.3 supports setting the format when replacing text
We are pleased to announce the release of Spire.Presentation for Java 9.8.3. This version provides a new method for setting the format when replacing text, and also fixes some known issues that occurred when loading PowerPoint files, adding formulas in table cells, and adding LatexMath. More details are listed below.
Here is a list of changes made in this release
Category | ID | Description |
New feature | SPIREPPT-2579 | Provides the ReplaceAndFormatText(String matchedString, String newValue, PortionFormatEx format) method to support setting format when replacing text.
Presentation ppt = new Presentation();
// Load a PowerPoint presentation from the specified file
ppt.loadFromFile(inputFile);
// Create a new object to store the text range formatting properties
PortionFormatEx format = new PortionFormatEx();
// Make the text bold
format.isBold(TriState.TRUE);
// Use a solid fill for the text
format.getFill().setFillType(FillFormatType.SOLID);
// Set the solid fill color to red
format.getFill().getSolidColor().setColor(Color.red);
// Set the font size to 25
format.setFontHeight(25);
// Replace all occurrences of "Spire.Presentation for .NET" with "Spire.PPT" and apply the specified formatting
ppt.ReplaceAndFormatText("Spire.Presentation for .NET", "Spire.PPT", format);
// Save the modified presentation in the PPTX format compatible with PowerPoint 2016
ppt.saveToFile(outputFile, FileFormat.PPTX_2016);
// Dispose of the Presentation object to free up resources
ppt.dispose(); |
Bug | SPIREPPT-2544 | Fixes the issue that the content did not fit automatically after changing the page orientation. |
Bug | SPIREPPT-2562 | Fixes the issue that the formulas were incorrect when adding LatexMath. |
Bug | SPIREPPT-2577 | Fixes the issue that Chinese characters in formulas were displayed as "x". |
Bug | SPIREPPT-2589 | Fixes the issue that the application threw the exception "java.lang.ClassCastException" when adding formulas in table cells. |
Bug | SPIREPPT-2593 | Fixes the issue that the application threw the exception "OutOfMemory" when loading PowerPoint files. |
Java: Convert HTML to Excel
HTML files often contain valuable datasets embedded within tables. However, analyzing this data directly in HTML can be cumbersome and inefficient. Converting HTML tables to Excel format allows you to take advantage of Excel's powerful data manipulation and analysis tools, making it easier to sort, filter, and visualize the information. Whether you need to analyze data for a report, perform calculations, or simply organize it in a more user-friendly format, converting HTML to Excel streamlines the process. In this article, we will demonstrate how to convert HTML files to Excel format in Java using Spire.XLS for Java.
Install Spire.XLS for Java
First of all, you're required to add the Spire.Xls.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.xls</artifactId>
        <version>14.8.2</version>
    </dependency>
</dependencies>
Convert HTML to Excel in Java
Spire.XLS for Java provides the Workbook.loadFromHtml() method for loading an HTML file. Once the HTML file is loaded, you can convert it to Excel format using the Workbook.saveToFile() method. The detailed steps are as follows.
- Create an object of the Workbook class.
- Load an HTML file using the Workbook.loadFromHtml() method.
- Save the HTML file in Excel format using the Workbook.saveToFile() method.
- Java
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;

public class ConvertHtmlToExcel {
    public static void main(String[] args) {
        // Specify the input HTML file path
        String filePath = "C:\\Users\\Administrator\\Desktop\\Sample.html";

        // Create an object of the Workbook class
        Workbook workbook = new Workbook();

        // Load the HTML file
        workbook.loadFromHtml(filePath);

        // Save the HTML file in Excel XLSX format
        String result = "C:\\Users\\Administrator\\Desktop\\ToExcel.xlsx";
        workbook.saveToFile(result, ExcelVersion.Version2013);

        // Release the resources held by the workbook
        workbook.dispose();
    }
}
Insert HTML String into Excel in Java
In addition to converting HTML files to Excel, Spire.XLS for Java allows you to insert HTML strings directly into Excel cells using the CellRange.setHtmlString() method. The detailed steps are as follows.
- Create an object of the Workbook class.
- Get a specific worksheet by its index (0-based) using the Workbook.getWorksheets().get(index) method.
- Get the cell that you want to add an HTML string to using the Worksheet.getCellRange() method.
- Add an HTML string to the cell using the CellRange.setHtmlString() method.
- Save the resulting workbook to a new file using the Workbook.saveToFile() method.
- Java
import com.spire.xls.CellRange;
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class InsertHtmlStringInExcelCell {
    public static void main(String[] args) {
        // Create an object of the Workbook class
        Workbook workbook = new Workbook();

        // Get the first sheet
        Worksheet sheet = workbook.getWorksheets().get(0);

        // Specify the HTML string
        String htmlCode = "<p><font size='12'>This is a <b>paragraph</b> with <span style='color: red;'>colored text</span>.</font></p>";

        // Get the cell that you want to add the HTML string to
        CellRange range = sheet.getCellRange("A1");

        // Add the HTML string to the cell
        range.setHtmlString(htmlCode);

        // Auto-adjust the width of the first column based on its content
        sheet.autoFitColumn(1);

        // Save the resulting workbook to a new file
        String result = "C:\\Users\\Administrator\\Desktop\\InsertHtmlStringIntoCell.xlsx";
        workbook.saveToFile(result, ExcelVersion.Version2013);

        // Release the resources held by the workbook
        workbook.dispose();
    }
}
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Python: Create Actions in PDF Documents
One powerful feature that enhances the interactivity and utility of PDF documents is the actions in these documents. By embedding actions such as document jumping, navigation controls, or even media playing, users can transform static documents into dynamic tools that streamline workflows, improve user engagement, and automate routine tasks, making the use of PDFs more efficient and versatile than ever before. This article will show how to use Spire.PDF for Python to create actions in PDF documents with Python code effortlessly.
- Create a Navigation Action in PDF with Python
- Create a Sound Action in PDF with Python
- Create a File Open Action in PDF with Python
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to: How to Install Spire.PDF for Python on Windows
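To confirm what pip actually installed, a small check along the following lines may help. It is only a sketch that uses the Python standard library; the distribution names passed to it (`Spire.PDF` and `plum-dispatch`) are taken from the text above, and the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the version pip recorded for each package named in this article
    for name in ("Spire.PDF", "plum-dispatch"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", rerunning the pip command in the same Python environment is the first thing to try.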
Create a Navigation Action in PDF with Python
A navigation button is an action that allows users to jump to a specified position on a designated page within a document. Developers can create a PdfDestination object, use it to create a PdfGoToAction, and then create an annotation based on this object and add it to the page to complete the creation of the navigation button. The following are the detailed steps:
- Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
- Create a PdfDestination object and set its property.
- Create a PdfGoToAction object based on the destination.
- Draw a rectangle on a page using PdfPageBase.Canvas.DrawRectangle() method.
- Create a PdfActionAnnotation object based on the action and add it to the page using PdfPageBase.Annotations.Add() method.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf import *

# Create an instance of PdfDocument class and load a PDF document
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Create a PdfDestination instance and set its properties
destination = PdfDestination(pdf.Pages[0])
destination.Location = PointF(0.0, 0.0)
destination.Mode = PdfDestinationMode.Location
destination.Zoom = 0.8

# Create a rectangle
rect = RectangleF.FromLTRB(70, pdf.PageSettings.Size.Height - 120, 140, pdf.PageSettings.Size.Height - 100)

# Create a PdfGoToAction instance
action = PdfGoToAction(destination)

# Draw a rectangle on the second page
pdf.Pages.get_Item(1).Canvas.DrawRectangle(PdfBrushes.get_LightGray(), rect)

# Draw text of the button
font = PdfFont(PdfFontFamily.TimesRoman, 14.0)
stringFormat = PdfStringFormat(PdfTextAlignment.Center)
pdf.Pages.get_Item(1).Canvas.DrawString("To Page 1", font, PdfBrushes.get_Green(), rect, stringFormat)

# Create a PdfActionAnnotation instance
annotation = PdfActionAnnotation(rect, action)

# Add the annotation to the second page
pdf.Pages.get_Item(1).Annotations.Add(annotation)

# Save the document
pdf.SaveToFile("output/AddPDFNavigationButton.pdf")
pdf.Close()
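The button rectangle in the example above is given by its left, top, right, and bottom edges rather than by a position and size. The following sketch is plain Python and does not use Spire.PDF: the `Rect` type and the `from_ltrb` helper are hypothetical, and the page height of 842 points (A4) is an assumed value. It only shows how those four edges translate into an origin, a width, and a height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle described by its origin corner and its size, in points."""
    x: float
    y: float
    width: float
    height: float


def from_ltrb(left, top, right, bottom):
    """Build a Rect from left, top, right, and bottom edge coordinates."""
    return Rect(left, top, right - left, bottom - top)


# Assumed page height (A4 in points); the real value comes from the document
page_height = 842.0

# Same edge values as the navigation button in the example
button = from_ltrb(70, page_height - 120, 140, page_height - 100)
print(button)
```

With these edges the button is 70 points wide and 20 points high, whatever the page height is.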
Create a Sound Action in PDF with Python
Developers can embed audio as actions in PDF documents, which allows the audio to play when the user performs a specified action, such as playing when the file opens or when a button is clicked. The following are the steps for creating a sound action:
- Create an instance of PdfDocument class.
- Load a PDF document using PdfDocument.LoadFromFile() method.
- Create an instance of PdfSoundAction class with an audio file.
- Set the audio parameters through properties under PdfSound class.
- Set the playing parameters through properties under PdfSoundAction class.
- Get a page using PdfDocument.Pages.get_Item() method.
- Draw an image on the page using PdfPageBase.Canvas.DrawImage() method.
- Create a PdfActionAnnotation object with the sound action at the location of the image.
- Add the annotation to the page using PdfPageBase.Annotations.Add() method.
- Alternatively, set the sound action to run when the document is opened through PdfDocument.AfterOpenAction property. This does not require adding an annotation to a PDF page.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf import *

# Create a PdfDocument instance and load a PDF file
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Get the first page of the document
page = pdf.Pages.get_Item(0)

# Create an instance of PdfSoundAction with the sound file path
soundAction = PdfSoundAction("Wave.wav")

# Set the audio parameters
soundAction.Sound.Bits = 16
soundAction.Sound.Channels = PdfSoundChannels.Stereo
soundAction.Sound.Encoding = PdfSoundEncoding.Signed
soundAction.Sound.Rate = 44100

# Set the playing parameters
soundAction.Volume = 0.5
soundAction.Repeat = True
soundAction.Mix = True
soundAction.Synchronous = False

# Draw an image on the page
image = PdfImage.FromFile("Sound.png")
page.Canvas.DrawImage(image, PointF(30.0, 30.0))

# Create an instance of PdfActionAnnotation with the sound action
rect = RectangleF.FromLTRB(30.0, 30.0, image.GetBounds().Width + 30.0, image.GetBounds().Height + 30.0)
annotation = PdfActionAnnotation(rect, soundAction)

# Add the annotation to the page
page.Annotations.Add(annotation)

# Set the sound action to play after the document is opened
# pdf.AfterOpenAction = soundAction

# Save the document
pdf.SaveToFile("output/AddMusicPDF.pdf")
pdf.Close()
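The Bits, Channels, and Rate values in the example presumably need to describe the audio file that is embedded. One way to find them is to read the WAV header with Python's standard `wave` module, as sketched below. The helper name `describe_wav` is an assumption for illustration, and the sketch does not call Spire.PDF.

```python
import wave


def describe_wav(path):
    """Read the bit depth, channel count, and sample rate from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "bits": wav.getsampwidth() * 8,  # sample width is reported in bytes
            "channels": wav.getnchannels(),  # 1 = mono, 2 = stereo
            "rate": wav.getframerate(),      # samples per second
        }
```

Calling `describe_wav("Wave.wav")` on a 16-bit stereo file sampled at 44.1 kHz would yield the values hard-coded in the example (16, 2, and 44100).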
Create a File Open Action in PDF with Python
The PdfLaunchAction class represents a file open action in PDF that allows users to open the corresponding file by clicking on a button on a PDF page. Developers can specify the absolute or relative path of the file to be opened and whether to open in a new window when creating a file open action. The detailed steps for creating a file open action in a PDF document are as follows:
- Create an object of PdfDocument class and load a PDF document using PdfDocument.LoadFromFile() method.
- Get a page of the document using PdfDocument.Pages.get_Item() method.
- Draw a rectangle on the page using PdfPageBase.Canvas.DrawRectangle() method.
- Create an object of PdfLaunchAction class and specify the file path and path type.
- Set the opening mode to new window through PdfLaunchAction.IsNewWindow property.
- Create an object of PdfActionAnnotation class based on the action and set its color through PdfActionAnnotation.Color property.
- Add the annotation to the page using PdfPageBase.Annotations.Add() method.
- Save the document using PdfDocument.SaveToFile() method.
- Python
from spire.pdf import *

# Create an instance of PdfDocument class
pdf = PdfDocument()

# Load a PDF file
pdf.LoadFromFile("Sample.pdf")

# Get the first page of the document
page = pdf.Pages.get_Item(0)

# Draw a rectangle on the page
rect = RectangleF.FromLTRB(50, pdf.PageSettings.Size.Height - 100, 200, pdf.PageSettings.Size.Height - 80)
page.Canvas.DrawRectangle(PdfPens.get_LightGray(), rect)

# Draw text in the rectangle
page.Canvas.DrawString("Click to open Sample 2", PdfFont(PdfFontFamily.Helvetica, 14.0), PdfBrushes.get_Green(), rect, PdfStringFormat(PdfTextAlignment.Center))

# Create a PdfLaunchAction object
action = PdfLaunchAction("Sample2.pdf", PdfFilePathType.Relative)
action.IsNewWindow = True

# Create a PdfActionAnnotation object based on the action
annotation = PdfActionAnnotation(rect, action)
annotation.Color = PdfRGBColor(Color.get_Blue())

# Add the annotation to the page
page.Annotations.Add(annotation)

# Save the document
pdf.SaveToFile("output/CreatePDFLaunchAction.pdf")
pdf.Close()
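With a relative path type, a viewer will typically resolve the target against the folder that holds the PDF, so it can be worth checking that the target sits next to the output file before distributing it. The sketch below does that with `pathlib`. Note that `resolve_launch_target` is a hypothetical helper rather than part of Spire.PDF, and the file names are the ones used in the example above.

```python
from pathlib import Path


def resolve_launch_target(pdf_path, target):
    """Return the path a relative launch target would point to for a given PDF."""
    target_path = Path(target)
    if target_path.is_absolute():
        # Absolute targets do not depend on where the PDF is stored
        return target_path
    # Relative targets are interpreted from the folder containing the PDF
    return Path(pdf_path).parent / target_path


if __name__ == "__main__":
    resolved = resolve_launch_target("output/CreatePDFLaunchAction.pdf", "Sample2.pdf")
    print(resolved, "exists" if resolved.exists() else "is missing")
```

In this example the check would look for Sample2.pdf inside the output folder, which is where the saved PDF ends up.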
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Java: Copy Content from One Word Document to Another
Transferring content between Microsoft Word documents is a frequent task for many users. Whether you need to consolidate information spread across multiple files or quickly reuse existing text and other elements, the ability to effectively copy and paste between documents can save you time and effort.
In this article, you will learn how to copy content from one Word document to another using Java and Spire.Doc for Java.
- Copy Specified Paragraphs from One Word Document to Another
- Copy a Section from One Word Document to Another
- Copy the Entire Document and Append it to Another
- Create a Copy of a Word Document
Install Spire.Doc for Java
First of all, you're required to add the Spire.Doc.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>12.9.0</version>
    </dependency>
</dependencies>
Copy Specified Paragraphs from One Word Document to Another in Java
Spire.Doc for Java provides a flexible way to copy content between Microsoft Word documents. This is achieved by cloning individual paragraphs and then adding those cloned paragraphs to a different document.
To copy specific paragraphs from one Word document to another, you can follow these steps:
- Load the source document into a Document object.
- Load the target document into a separate Document object.
- Identify the paragraphs you want to copy from the source document.
- Create copies of the selected paragraphs using Paragraph.deepClone() method.
- Add the cloned paragraphs to the target document using ParagraphCollection.add() method.
- Save the updated target document to a new Word file.
- Java
import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.Section;
import com.spire.doc.documents.Paragraph;

public class CopyParagraphs {
    public static void main(String[] args) {
        // Create a Document object
        Document sourceDoc = new Document();

        // Load the source file
        sourceDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\source.docx");

        // Get a specific section
        Section section = sourceDoc.getSections().get(0);

        // Get the specified paragraphs from the source file
        Paragraph p1 = section.getParagraphs().get(2);
        Paragraph p2 = section.getParagraphs().get(3);

        // Create another Document object
        Document targetDoc = new Document();

        // Load the target file
        targetDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\target.docx");

        // Get the last section
        Section lastSection = targetDoc.getLastSection();

        // Add the paragraphs from the source file to the target file
        lastSection.getParagraphs().add((Paragraph)p1.deepClone());
        lastSection.getParagraphs().add((Paragraph)p2.deepClone());

        // Save the target file to a different Word file
        targetDoc.saveToFile("CopyParagraphs.docx", FileFormat.Docx_2019);

        // Dispose resources
        sourceDoc.dispose();
        targetDoc.dispose();
    }
}
Copy a Section from One Word Document to Another in Java
When copying content between Microsoft Word documents, it's important to consider that a section can contain not only paragraphs, but also other elements like tables. To successfully transfer an entire section from one document to another, you need to iterate through all the child objects within the section and add them individually to a specific section in the target document.
The steps to copy a section between different Word documents are as follows:
- Create Document objects to load the source file and the target file, respectively.
- Get the specified section from the source document.
- Iterate through the child objects within the section.
- Clone a specific child object using DocumentObject.deepClone() method.
- Add the cloned child objects to a designated section in the target document using DocumentObjectCollection.add() method.
- Save the updated target document to a new file.
- Java
import com.spire.doc.Document;
import com.spire.doc.DocumentObject;
import com.spire.doc.FileFormat;
import com.spire.doc.Section;

public class CopySection {
    public static void main(String[] args) {
        // Create a Document object
        Document sourceDoc = new Document();

        // Load the source file
        sourceDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\source.docx");

        // Get the specified section from the source file
        Section section = sourceDoc.getSections().get(0);

        // Create another Document object
        Document targetDoc = new Document();

        // Load the target file
        targetDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\target.docx");

        // Get the last section of the target file
        Section lastSection = targetDoc.getLastSection();

        // Iterate through the child objects in the selected section
        for (int i = 0; i < section.getBody().getChildObjects().getCount(); i++) {
            // Get a specific child object
            DocumentObject childObject = section.getBody().getChildObjects().get(i);

            // Add a clone of the child object to the last section of the target file
            lastSection.getBody().getChildObjects().add(childObject.deepClone());
        }

        // Save the target file to a different Word file
        targetDoc.saveToFile("CopySection.docx", FileFormat.Docx_2019);

        // Dispose resources
        sourceDoc.dispose();
        targetDoc.dispose();
    }
}
Copy the Entire Document and Append it to Another in Java
Copying the full contents from one Microsoft Word document into another can be achieved using the Document.insertTextFromFile() method. This method enables you to seamlessly append the contents of a source document to a target document.
The steps to copy an entire document and append it to another are as follows:
- Create a Document object to represent the target file.
- Load the target file from the given file path.
- Insert the content of a different Word document into the target file using Document.insertTextFromFile() method.
- Save the updated target file to a new Word document.
- Java
import com.spire.doc.Document;
import com.spire.doc.FileFormat;

public class CopyEntireDocument {
    public static void main(String[] args) {
        // Specify the path of the source document
        String sourceFile = "C:\\Users\\Administrator\\Desktop\\source.docx";

        // Create a Document object
        Document targetDoc = new Document();

        // Load the target file
        targetDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\target.docx");

        // Insert content of the source file to the target file
        targetDoc.insertTextFromFile(sourceFile, FileFormat.Docx);

        // Save the target file to a different Word file
        targetDoc.saveToFile("CopyEntireDocument.docx", FileFormat.Docx_2019);

        // Dispose resources
        targetDoc.dispose();
    }
}
Create a Copy of a Word Document in Java
Spire.Doc for Java provides a straightforward way to create a duplicate of a Microsoft Word document by using the Document.deepClone() method.
To make a copy of a Word document, follow these steps:
- Create a Document object to represent the source document.
- Load a Word file from the given file path.
- Create a copy of the document using the Document.deepClone() method.
- Save the cloned document to a new Word file.
- Java
import com.spire.doc.Document;
import com.spire.doc.FileFormat;

public class DuplicateDocument {
    public static void main(String[] args) {

        // Create a new Document object
        Document sourceDoc = new Document();

        // Load a Word file
        sourceDoc.loadFromFile("C:\\Users\\Administrator\\Desktop\\target.docx");

        // Clone the document
        Document newDoc = sourceDoc.deepClone();

        // Save the cloned document as a docx file
        newDoc.saveToFile("Copy.docx", FileFormat.Docx);

        // Dispose resources
        sourceDoc.dispose();
        newDoc.dispose();
    }
}
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
Spire.Office for Java 9.8.0 is released
We are pleased to announce the release of Spire.Office for Java 9.8.0. In this new version, Spire.XLS for Java supports embedding images in cells; Spire.Doc for Java optimizes the clarity of images when converting Word documents to HTML. Besides, some known issues are fixed successfully in this update. More details are listed below.
Here is a list of changes made in this release
Spire.XLS for Java
Category | ID | Description |
New feature | SPIREXLS-5147 | Provides the hideCategoryLabels method to support hiding category labels.
Chart chart = sheet.getCharts().get(0); String[] labels = chart.getCategoryLabels(); chart.hideCategoryLabels(new String[] { labels[0], labels[1], ... }); |
New feature | SPIREXLS-5338 | Supports embedding images in cells.
worksheet.getCellRange("B1").insertOrUpdateCellImage("D:\\vs1.png",true); |
Bug | SPIREXLS-5331 | Fixes the issue that caused incorrect content when saving an Excel document and opening it in Microsoft Excel 2016. |
Bug | SPIREXLS-5337 | Fixes the issue that resulted in incorrect pivot table data calculation. |
Spire.PDF for Java
Category | ID | Description |
Bug | SPIREPDF-6851 | Fixes the issue that the program threw "NullPointerException" when converting PDF to SVG. |
Bug | SPIREPDF-6881 | Fixes the issue that some punctuation marks were missing when extracting table text. |
Bug | SPIREPDF-6895 | Fixes the issue that the content was lost when converting OFD to PDF. |
Bug | SPIREPDF-6923 | Fixes the issue that the program threw "doc-0/res/doc-0/res/res7651308984730378845.png cannot be found!" when converting OFD to PDF. |
Bug | SPIREPDF-6924 | Fixes the issue that the program threw "ArrayStoreException" while reading TIFF files. |
Bug | SPIREPDF-6939 | Fixes the issue that the program threw "NullPointerException" when converting PDF to Word. |
Bug | SPIREPDF-6947 | Fixes the issue that the font styles were incorrect when converting PDF to PPTX. |
Spire.Doc for Java
Category | ID | Description |
Optimization | SPIREDOC-10600 | Improves clarity of images when converting Word documents to HTML. |
Bug | SPIREDOC-10546, SPIREDOC-10601 | Fixes the issue that the images were lost when converting HTML to Word. |
Bug | SPIREDOC-10562 | Fixes the problem that the program threw "Unknown boolex value" exception when converting Word to PDF. |
Bug | SPIREDOC-10688 | Fixes the issue that the table borders were lost when the saved Doc document was opened in WPS tool. |
Python: Detect Page Orientation or Rotation Angle in PDF
Proper presentation of a PDF document is critical for maintaining its accuracy and professionalism. By checking the orientation and rotation of each PDF page, you can confirm that all elements, including diagrams and images, are displayed correctly as intended on the viewing device or platform, thus avoiding confusion or misinterpretation of content. In this article, you will learn how to detect the orientation and rotation angle of a PDF page in Python using Spire.PDF for Python.
Install Spire.PDF for Python
This scenario requires Spire.PDF for Python. It can be installed on Windows with the following pip command.
pip install Spire.PDF
If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows
Detect PDF Page Orientation in Python
Page orientation is determined by the relationship between page width and height. Using Spire.PDF for Python, you can compare these two values to detect whether a page is landscape (width greater than height) or portrait (width less than height). The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specified page using PdfDocument.Pages[] property.
- Get the width and height of the PDF page using PdfPageBase.Size.Width and PdfPageBase.Size.Height properties.
- Compare the values of page width and height to detect the page orientation.
- Print out the result.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("SamplePDF.pdf")

# Get the first page
page = pdf.Pages[0]

# Get the width and height of the page
Width = page.Size.Width
Height = page.Size.Height

# Compare the values of page width and height
if Width > Height:
    print("The page orientation is Landscape.")
else:
    print("The page orientation is Portrait.")
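The comparison above inspects a single page and reports a square page as portrait. The following is a minimal, library-free sketch of the same width/height comparison applied to several pages. The helper name classify_orientation and the sample sizes are hypothetical; in practice the (width, height) pairs would be read from page.Size.Width and page.Size.Height for each page.

```python
def classify_orientation(width: float, height: float) -> str:
    """Classify a page by comparing its width and height (same unit for both)."""
    if width > height:
        return "Landscape"
    if width < height:
        return "Portrait"
    # Equal sides: neither landscape nor portrait
    return "Square"


# Hypothetical page sizes in points: A4 portrait, A4 landscape, and a square page
page_sizes = [(595.0, 842.0), (842.0, 595.0), (500.0, 500.0)]

for number, (width, height) in enumerate(page_sizes, start=1):
    print(f"Page {number}: {classify_orientation(width, height)}")
```

Returning a separate "Square" label is a design choice of this sketch; the tutorial code folds that case into portrait.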
Detect PDF Page Rotation Angle in Python
PDF pages can be rotated by a certain angle. To detect the rotation angle of a PDF page, you can use the PdfPageBase.Rotation property. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specified page using PdfDocument.Pages[] property.
- Get the rotation angle of the page using PdfPageBase.Rotation property, and then convert it to text string.
- Print out the result.
- Python
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
pdf = PdfDocument()

# Load a PDF file from disk
pdf.LoadFromFile("Sample.pdf")

# Get the first page
page = pdf.Pages[0]

# Get the rotation angle of the current page
rotationAngle = page.Rotation
rotation = str(rotationAngle)

# Print out the result
print("The rotation angle is: " + rotation)
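A rotation of 90 or 270 degrees turns a page on its side when it is displayed, so the rotation angle is often read together with the page size. The following is a minimal sketch, assuming the width and height describe the unrotated page; whether the size reported by a given library already accounts for rotation should be verified. The helper displayed_size is hypothetical and not part of Spire.PDF.

```python
def displayed_size(width: float, height: float, rotation_degrees: int) -> tuple:
    """Return (width, height) of a page as displayed after applying its rotation."""
    angle = rotation_degrees % 360
    if angle not in (0, 90, 180, 270):
        raise ValueError("Page rotation must be a multiple of 90 degrees")
    # A quarter turn in either direction swaps width and height
    if angle in (90, 270):
        return (height, width)
    return (width, height)


# Hypothetical A4 portrait page rotated by 90 degrees
print(displayed_size(595.0, 842.0, 90))
```

With this helper, a portrait page carrying a 90-degree rotation would be treated as landscape for display purposes.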
Spire.Doc 12.8.12 optimizes the file size for converting Word to OFD
We're pleased to announce the release of Spire.Doc 12.8.12. This version mainly fixes two issues that occurred when replacing text and converting Word to OFD. More details are listed below.
Here is a list of changes made in this release
Category | ID | Description |
Bug | SPIREDOC-10738 | Fixes the issue that the customXml appeared after replacing text. |
Bug | SPIREDOC-10743 | Optimizes the file size for converting Word to OFD. |
C#: Get Page Size, Orientation and Rotation of PDF
In some cases, checking the size, orientation, and rotation of PDF pages can be part of the quality control process. For example, before publishing or distributing a document, you might need to verify this information to ensure that all pages in the document are correctly presented. In this article, you will learn how to get PDF page size, orientation and rotation angle in C# using Spire.PDF for .NET.
Install Spire.PDF for .NET
To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.PDF
Get PDF Page Size in C#
Spire.PDF for .NET offers the PdfPageBase.Size.Width and PdfPageBase.Size.Height properties to get the width and height of a PDF page in points. If you want to convert the default unit of measure to other units, you can use the PdfUnitConvertor class. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specified page using PdfDocument.Pages[] property.
- Get the width and height of the PDF page using PdfPageBase.Size.Width and PdfPageBase.Size.Height properties.
- Create a PdfUnitConvertor instance, and then convert the size units from points to other units of measure using PdfUnitConvertor.ConvertUnits() method.
- Add the page size information to a StringBuilder instance, and then save the result to a TXT file.
- C#
using System.IO;
using System.Text;
using Spire.Pdf;
using Spire.Pdf.Graphics;

namespace GetPDFPageSize
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            //Load a PDF file from disk
            pdf.LoadFromFile("SamplePDF.pdf");

            //Get the first page
            PdfPageBase page = pdf.Pages[0];

            //Get the width and height of the page in "point"
            float pointWidth = page.Size.Width;
            float pointHeight = page.Size.Height;

            //Create PdfUnitConvertor to convert the unit
            PdfUnitConvertor unitCvtr = new PdfUnitConvertor();

            //Convert size units from points to pixels
            float pixelWidth = unitCvtr.ConvertUnits(pointWidth, PdfGraphicsUnit.Point, PdfGraphicsUnit.Pixel);
            float pixelHeight = unitCvtr.ConvertUnits(pointHeight, PdfGraphicsUnit.Point, PdfGraphicsUnit.Pixel);

            //Convert size units from points to inches
            float inchWidth = unitCvtr.ConvertUnits(pointWidth, PdfGraphicsUnit.Point, PdfGraphicsUnit.Inch);
            float inchHeight = unitCvtr.ConvertUnits(pointHeight, PdfGraphicsUnit.Point, PdfGraphicsUnit.Inch);

            //Convert size units from points to centimeters
            float centimeterWidth = unitCvtr.ConvertUnits(pointWidth, PdfGraphicsUnit.Point, PdfGraphicsUnit.Centimeter);
            float centimeterHeight = unitCvtr.ConvertUnits(pointHeight, PdfGraphicsUnit.Point, PdfGraphicsUnit.Centimeter);

            //Create a StringBuilder instance
            StringBuilder content = new StringBuilder();

            //Add the page size information to the StringBuilder instance
            content.AppendLine("The page size of the file in points is (width: " + pointWidth + "pt, height: " + pointHeight + "pt).");
            content.AppendLine("The page size of the file in pixels is (width: " + pixelWidth + "pixel, height: " + pixelHeight + "pixel).");
            content.AppendLine("The page size of the file in inches is (width: " + inchWidth + "inch, height: " + inchHeight + "inch).");
            content.AppendLine("The page size of the file in centimeters is (width: " + centimeterWidth + "cm, height: " + centimeterHeight + "cm).");

            //Save to a txt file (File is defined in System.IO)
            File.WriteAllText("GetPageSize.txt", content.ToString());
        }
    }
}
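The unit conversions in this example follow fixed ratios: one point is 1/72 inch, and one inch is 2.54 centimeters. The sketch below shows that arithmetic in Python without any library, as a way to sanity-check converted values. The function names are hypothetical, and the 96 DPI default used for pixels is an assumption of this sketch; the resolution used by PdfUnitConvertor may differ.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def points_to_inches(points: float) -> float:
    """Convert a length in points to inches (72 points per inch)."""
    return points / POINTS_PER_INCH


def points_to_centimeters(points: float) -> float:
    """Convert a length in points to centimeters (2.54 cm per inch)."""
    return points_to_inches(points) * CM_PER_INCH


def points_to_pixels(points: float, dpi: float = 96.0) -> float:
    """Convert a length in points to pixels at the given resolution (assumed 96 DPI)."""
    return points_to_inches(points) * dpi


# Hypothetical page width in points (roughly A4)
width_pt = 595.0
print(f"{width_pt} pt = {points_to_inches(width_pt):.2f} in")
print(f"{width_pt} pt = {points_to_centimeters(width_pt):.2f} cm")
print(f"{width_pt} pt = {points_to_pixels(width_pt):.0f} px at 96 DPI")
```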
Detect PDF Page Orientation in C#
To detect the orientation of a PDF page, you can compare the width and height of the page. If the page width is greater than the height, then the page orientation is landscape, otherwise it is portrait. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specified page using PdfDocument.Pages[] property.
- Get the width and height of the PDF page using PdfPageBase.Size.Width and PdfPageBase.Size.Height properties.
- Compare the values of page width and height to detect the page orientation.
- Output the result using Console.WriteLine() method.
- C#
using System;
using Spire.Pdf;

namespace GetPDFPageOrientation
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            //Load a PDF file from disk
            pdf.LoadFromFile("SamplePDF.pdf");

            //Get the first page
            PdfPageBase page = pdf.Pages[0];

            //Get the width and height of the page
            float width = page.Size.Width;
            float height = page.Size.Height;

            //Compare the values of page width and height
            if (width > height)
            {
                Console.WriteLine("The page orientation is Landscape.");
            }
            else
            {
                Console.WriteLine("The page orientation is Portrait.");
            }
        }
    }
}
Detect PDF Page Rotation Angle in C#
The rotation angle of a PDF page can be obtained through the PdfPageBase.Rotation property. The following are the detailed steps.
- Create a PdfDocument instance.
- Load a PDF file using PdfDocument.LoadFromFile() method.
- Get a specified page using PdfDocument.Pages[] property.
- Get the rotation angle of the page using PdfPageBase.Rotation property, and then convert it to text string.
- Output the result using Console.WriteLine() method.
- C#
using System;
using Spire.Pdf;

namespace GetPDFPageRotationAngle
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            //Load a PDF file from disk
            pdf.LoadFromFile("E:\\PythonPDF\\Sample.pdf");

            //Get the first page
            PdfPageBase page = pdf.Pages[0];

            //Get the rotation angle of the current page
            PdfPageRotateAngle rotationAngle = page.Rotation;
            string rotation = rotationAngle.ToString();

            //Output the page rotation angle information
            Console.WriteLine("The rotation angle of the current page is: " + rotation);
        }
    }
}
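The string obtained from the rotation value is an enumeration member name rather than a number. The following is a small Python sketch for turning such a name into degrees, assuming the names end with the angle (for example RotateAngle90). The helper rotation_name_to_degrees is hypothetical, and the name format should be checked against the actual output.

```python
import re


def rotation_name_to_degrees(name: str) -> int:
    """Extract the trailing angle from a rotation name such as 'RotateAngle90'."""
    match = re.search(r"(\d+)$", name.strip())
    if match is None:
        raise ValueError(f"No angle found in {name!r}")
    return int(match.group(1))


# Assumed example names; a qualified name is handled the same way
for name in ("RotateAngle0", "RotateAngle90", "PdfPageRotateAngle.RotateAngle270"):
    print(name, "->", rotation_name_to_degrees(name), "degrees")
```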
Spire.XLS for Java 14.8.2 supports embedding images in cells
We are excited to announce the release of Spire.XLS for Java 14.8.2. This version supports embedding images in cells. Besides, some known bugs were fixed in this update, such as the issue that caused incorrect content when saving an Excel document and opening it in Microsoft Excel 2016. More details are listed below.
Here is a list of changes made in this release
Category | ID | Description |
New feature | SPIREXLS-5147 | Provides the hideCategoryLabels method to support hiding category labels.
Chart chart = sheet.getCharts().get(0); String[] labels = chart.getCategoryLabels(); chart.hideCategoryLabels(new String[] { labels[0], labels[1], ... }); |
New feature | SPIREXLS-5338 | Supports embedding images in cells.
worksheet.getCellRange("B1").insertOrUpdateCellImage("D:\\vs1.png",true); |
Bug | SPIREXLS-5331 | Fixes the issue that caused incorrect content when saving an Excel document and opening it in Microsoft Excel 2016. |
Bug | SPIREXLS-5337 | Fixes the issue that resulted in incorrect pivot table data calculation. |