Word and Excel serve different purposes. Word is used primarily for text documents such as essays, emails, letters, books, resumes, and academic papers, where text formatting is essential. Excel is used to store data, build tables and charts, and perform complex calculations.
Converting a complex Word file to an Excel spreadsheet is not recommended, because Excel cannot render content the way Word does. However, if your Word document consists mainly of tables and you want to analyze the table data in Excel, you can use Spire.Office for Java to convert Word to Excel while keeping the result readable.
Install Spire.Office for Java
First, add the Spire.Office.jar file as a dependency in your Java program. The JAR file is available for download from the e-iceblue website. If you use Maven, you can import the JAR file into your application by adding the following code to your project's pom.xml file.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.office</artifactId>
        <version>9.10.0</version>
    </dependency>
</dependencies>
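Once the dependency is declared, it can be useful to confirm that the library classes are actually visible at run time before running the full conversion. The following is a minimal sketch that uses only the JDK. DependencyCheck and isOnClasspath() are hypothetical names introduced here for illustration and are not part of Spire.Office. The two class names it checks are the ones imported by the conversion example in this article.

```java
// Hypothetical helper (not part of Spire.Office): reports whether a class can be loaded.
final class DependencyCheck {

    /** Returns true if the named class is visible to this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the imports of the conversion example
        String[] required = {"com.spire.doc.Document", "com.spire.xls.Workbook"};
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "missing"));
        }
    }
}
```

If either class is reported as missing, the repository or dependency entry in pom.xml is the first thing to check.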
Convert Word to Excel in Java
This scenario uses two libraries from the Spire.Office package: Spire.Doc for Java and Spire.XLS for Java. The former reads and extracts content from a Word document, and the latter creates an Excel document and writes data to specific cells. To make the code example easy to follow, we created the following three custom methods, each of which performs a specific function.
- exportTableInExcel() - Export data from a Word table to specified Excel cells.
- copyContentInTable() - Copy content from a table cell in Word to an Excel cell.
- copyTextAndStyle() - Copy text with formatting from a Word paragraph to an Excel cell.
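A table cell in Word can hold several paragraphs, whereas the target Excel cell receives a single value. In the listing later in this article, copyContentInTable() handles this by collecting the paragraphs of a cell into one paragraph and adding a line break between them, but not after the last one. The sketch below shows only that joining rule on plain strings. CellTextSketch and mergeParagraphs() are hypothetical names, and each String stands in for one Word paragraph.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, JDK-only stand-in: each String plays the role of one paragraph in a Word table cell.
final class CellTextSketch {

    /** Joins the paragraphs of one table cell into a single cell value. */
    static String mergeParagraphs(List<String> paragraphs) {
        StringBuilder merged = new StringBuilder();
        for (int i = 0; i < paragraphs.size(); i++) {
            merged.append(paragraphs.get(i));
            // Add a line break between paragraphs, but not after the last one
            if (i < paragraphs.size() - 1) {
                merged.append('\n');
            }
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        String value = mergeParagraphs(Arrays.asList("Acme Ltd.", "12 Main Street", "Springfield"));
        System.out.println(value);
    }
}
```

Because the merged value contains line breaks, the full example also turns on text wrapping for the worksheet so that each paragraph appears on its own line inside the cell.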
The following steps demonstrate how to export data from a Word document to a worksheet using Spire.Office for Java.
- Create a Document object to load a Word file.
- Create a Workbook object and add a worksheet named "WordToExcel" to it.
- Traverse through all the sections in the Word document, traverse through all the document objects under each section, and then determine whether each document object is a paragraph or a table.
- If the document object is a paragraph, write the paragraph to a specified cell in Excel using the copyTextAndStyle() method.
- If the document object is a table, export the table data from Word to Excel cells using the exportTableInExcel() method.
- Auto fit the row height and column width in Excel so that the data in a cell does not overflow the cell's boundaries.
- Save the workbook to an Excel file using the Workbook.saveToFile() method.
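The row bookkeeping behind these steps can be sketched without the Spire libraries: a paragraph occupies one worksheet row, a table occupies one row per table row, and the next object continues on the row after that. The code below is a simplified stand-in under that assumption. RowCursorSketch, Block, and layOut() are hypothetical names, and plain strings replace the Word and Excel objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, JDK-only model of the row bookkeeping described in the steps above.
final class RowCursorSketch {

    /** A body element: a paragraph (table == null) or a table (text == null). */
    static final class Block {
        final String text;
        final String[][] table;

        private Block(String text, String[][] table) {
            this.text = text;
            this.table = table;
        }

        static Block paragraph(String text) {
            return new Block(text, null);
        }

        static Block table(String[][] cells) {
            return new Block(null, cells);
        }
    }

    /** Lays the blocks out top to bottom and returns one String[] per worksheet row. */
    static List<String[]> layOut(List<Block> body) {
        List<String[]> rows = new ArrayList<>();
        for (Block block : body) {
            if (block.table == null) {
                // A paragraph takes a single row and is written to the first column
                rows.add(new String[] {block.text});
            } else {
                // Each table row takes its own worksheet row, one column per table cell
                for (String[] tableRow : block.table) {
                    rows.add(tableRow.clone());
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Block> body = Arrays.asList(
                Block.paragraph("Invoice"),
                Block.table(new String[][] {{"Item", "Qty"}, {"Pen", "3"}}),
                Block.paragraph("Thank you"));
        List<String[]> rows = layOut(body);
        for (int r = 0; r < rows.size(); r++) {
            System.out.println("row " + (r + 1) + ": " + String.join(" | ", rows.get(r)));
        }
    }
}
```

In the full example the same idea appears as the row variable, which is incremented after each paragraph and set to the value returned by exportTableInExcel() after each table.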
import com.spire.doc.*;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.fields.DocPicture;
import com.spire.doc.fields.TextRange;
import com.spire.xls.*;

import java.awt.*;

public class ConvertWordToExcel {

    public static void main(String[] args) {

        //Create a Document object
        Document doc = new Document();

        //Load a Word file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Invoice.docx");

        //Create a Workbook object
        Workbook wb = new Workbook();

        //Remove the default worksheets
        wb.getWorksheets().clear();

        //Create a worksheet named "WordToExcel"
        Worksheet worksheet = wb.createEmptySheet("WordToExcel");
        int row = 1;
        int column = 1;

        //Loop through the sections in the Word document
        for (int i = 0; i < doc.getSections().getCount(); i++) {

            //Get a specific section
            Section section = doc.getSections().get(i);

            //Loop through the document object under a certain section
            for (int j = 0; j < section.getBody().getChildObjects().getCount(); j++) {

                //Get a specific document object
                DocumentObject documentObject = section.getBody().getChildObjects().get(j);

                //Determine if the object is a paragraph
                if (documentObject instanceof Paragraph) {
                    CellRange cell = worksheet.getCellRange(row, column);
                    Paragraph paragraph = (Paragraph) documentObject;

                    //Copy paragraph from Word to a specific cell
                    copyTextAndStyle(cell, paragraph);
                    row++;
                }

                //Determine if the object is a table
                if (documentObject instanceof Table) {
                    Table table = (Table) documentObject;

                    //Export table data from Word to Excel
                    int currentRow = exportTableInExcel(worksheet, row, table);
                    row = currentRow;
                }
            }
        }

        //Wrap text in cells
        worksheet.getAllocatedRange().isWrapText(true);

        //Auto fit row height and column width
        worksheet.getAllocatedRange().autoFitRows();
        worksheet.getAllocatedRange().autoFitColumns();

        //Save the workbook to an Excel file
        wb.saveToFile("output/WordToExcel.xlsx", ExcelVersion.Version2013);
    }

    //Export data from Word table to Excel cells
    private static int exportTableInExcel(Worksheet worksheet, int row, Table table) {
        CellRange cell;
        int column;
        for (int i = 0; i < table.getRows().getCount(); i++) {
            column = 1;
            TableRow tbRow = table.getRows().get(i);
            for (int j = 0; j < tbRow.getCells().getCount(); j++) {
                TableCell tbCell = tbRow.getCells().get(j);
                cell = worksheet.getCellRange(row, column);
                cell.borderAround(LineStyleType.Thin, Color.BLACK);
                copyContentInTable(tbCell, cell);
                column++;
            }
            row++;
        }
        return row;
    }

    //Copy content from a Word table cell to an Excel cell
    private static void copyContentInTable(TableCell tbCell, CellRange cell) {
        Paragraph newPara = new Paragraph(tbCell.getDocument());
        for (int i = 0; i < tbCell.getChildObjects().getCount(); i++) {
            DocumentObject documentObject = tbCell.getChildObjects().get(i);
            if (documentObject instanceof Paragraph) {
                Paragraph paragraph = (Paragraph) documentObject;
                for (int j = 0; j < paragraph.getChildObjects().getCount(); j++) {
                    DocumentObject cObj = paragraph.getChildObjects().get(j);
                    newPara.getChildObjects().add(cObj.deepClone());
                }
                if (i < tbCell.getChildObjects().getCount() - 1) {
                    newPara.appendText("\n");
                }
            }
        }
        copyTextAndStyle(cell, newPara);
    }

    //Copy text and style of a paragraph to a cell
    private static void copyTextAndStyle(CellRange cell, Paragraph paragraph) {
        RichText richText = cell.getRichText();
        richText.setText(paragraph.getText());
        int startIndex = 0;
        for (int i = 0; i < paragraph.getChildObjects().getCount(); i++) {
            DocumentObject documentObject = paragraph.getChildObjects().get(i);
            if (documentObject instanceof TextRange) {
                TextRange textRange = (TextRange) documentObject;
                String fontName = textRange.getCharacterFormat().getFontName();
                boolean isBold = textRange.getCharacterFormat().getBold();
                Color textColor = textRange.getCharacterFormat().getTextColor();
                float fontSize = textRange.getCharacterFormat().getFontSize();
                String textRangeText = textRange.getText();
                int strLength = textRangeText.length();
                ExcelFont font = new ExcelFont(cell.getWorksheet().getWorkbook().createFont());
                font.setColor(textColor);
                font.isBold(isBold);
                font.setSize(fontSize);
                font.setFontName(fontName);
                int endIndex = startIndex + strLength;
                richText.setFont(startIndex, endIndex, font);
                startIndex += strLength;
            }
            if (documentObject instanceof DocPicture) {
                DocPicture picture = (DocPicture) documentObject;
                cell.getWorksheet().getPictures().add(cell.getRow(), cell.getColumn(), picture.getImage());
                cell.getWorksheet().setRowHeightInPixels(cell.getRow(), 1, picture.getImage().getHeight());
            }
        }
        switch (paragraph.getFormat().getHorizontalAlignment()) {
            case Left:
                cell.setHorizontalAlignment(HorizontalAlignType.Left);
                break;
            case Center:
                cell.setHorizontalAlignment(HorizontalAlignType.Center);
                break;
            case Right:
                cell.setHorizontalAlignment(HorizontalAlignType.Right);
                break;
        }
    }
}
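The part of copyTextAndStyle() that is easiest to get wrong is the index arithmetic: each text run in the paragraph is mapped to a start and end offset in the combined cell text, and the next run starts where the previous one ended. The sketch below isolates that arithmetic on plain strings so it can be checked on its own. RunRangeSketch and runRanges() are hypothetical names, and the sketch makes no claim about how Spire.XLS interprets the end index passed to setFont(); it only mirrors the startIndex and endIndex computation in the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, JDK-only sketch of the offset arithmetic used when formatting text runs.
final class RunRangeSketch {

    /** Returns {start, start + length} for each run within the concatenated text. */
    static List<int[]> runRanges(List<String> runTexts) {
        List<int[]> ranges = new ArrayList<>();
        int startIndex = 0;
        for (String text : runTexts) {
            int endIndex = startIndex + text.length();
            ranges.add(new int[] {startIndex, endIndex});
            // The next run begins where this one ends
            startIndex = endIndex;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three runs that would carry different formatting in the Word paragraph
        List<String> runs = Arrays.asList("Total: ", "42.00", " USD");
        for (int[] range : runRanges(runs)) {
            System.out.println(range[0] + " to " + range[1]);
        }
        // prints:
        // 0 to 7
        // 7 to 12
        // 12 to 16
    }
}
```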
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to lift the functional limitations, you can request a 30-day trial license.