Displaying items by tag: ocr java

Saturday, 14 September 2024 00:59

Java: Extract Text from Images Using the New Model of Spire.OCR for Java

Starting from version 1.9.15, Spire.OCR for Java provides a new model for extracting text from images. In this article, we will demonstrate how to use this new model to extract text from images in Java.

The detailed steps are as follows.

Step 1: Create a Java Project in IntelliJ IDEA.

Extract Text from Images Using the New Model of Spire.OCR for Java

Step 2: Add Spire.OCR.jar to Your Project.

Option 1: Install Spire.OCR for Java via Maven.

If you're using Maven, you can install Spire.OCR for Java by adding the following code to your project's pom.xml file:

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.ocr</artifactId>
        <version>1.9.15</version>
    </dependency>
</dependencies>

Option 2: Manually Import Spire.OCR.jar.

First, download Spire.OCR for Java from the following link and extract it to a specific directory:

https://www.e-iceblue.com/Download/ocr-for-java.html

Next, in IntelliJ IDEA, go to File > Project Structure > Modules > Dependencies. In the Dependencies pane, click the "+" button and select JARs or Directories. Navigate to the directory where Spire.OCR for Java is located, open the lib folder and select the Spire.OCR.jar file, then click OK to add it as the project’s dependency.

Extract Text from Images Using the New Model of Spire.OCR for Java

Step 3: Download the New Model and Associated Dependencies of Spire.OCR for Java.

Download the new model and associated dependencies (Model&Lib.zip) from the following link and extract the package to a specific directory, such as D:\.

https://www.e-iceblue.com/resource/ocr_java/Model&Lib.zip

Extract Text from Images Using the New Model of Spire.OCR for Java

Step 4: Implement Text Extraction from Images Using the New Model of Spire.OCR for Java.

Use the following code to extract text from images with the new OCR model of Spire.OCR for Java:

Java

import com.spire.ocr.*;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class Main {
    public static void main(String[] args) {
        try {
            // Initialize the OcrScanner instance
            OcrScanner scanner = new OcrScanner();

            // Configure OCR options
            // Set the path to the new model and the language for text recognition. Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
            ConfigureOptions configureOptions = new ConfigureOptions("D:\\Model&Lib\\Model", "English");
            // Set the path to the associated dependencies
            configureOptions.setLibPath("D:\\Model&Lib\\Lib\\win-x64"); 
            scanner.ConfigureDependencies(configureOptions);

            // Perform text extraction from the image
            scanner.scan("Sample.png");

            // Save the extracted text to a file
            saveTextToFile(scanner, "output.txt");

        } catch (OcrException e) {
            e.printStackTrace();
        }
    }

    private static void saveTextToFile(OcrScanner scanner, String filePath) {
        try {
            String text = scanner.getText().toString();
            try (BufferedWriter writer = new BufferedWriter(new FileWriter(filePath))) {
                writer.write(text);
            }
        } catch (IOException | OcrException e) {
            e.printStackTrace();
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Recognize Text

How to Scan and Recognize Text from Images in Java Projects

OCR (Optical Character Recognition) technology is the primary method to extract text from images. Spire.OCR for Java provides developers with a quick and efficient solution to scan and extract text from images in Java projects. This article will guide you on how to use Spire.OCR for Java to recognize and extract text from images in Java projects.

Obtaining Spire.OCR for Java

To scan and recognize text in images using Spire.OCR for Java, you need to first import the Spire.OCR.jar file along with other relevant dependencies into your Java project.

You can download Spire.OCR for Java from our website. If you are using Maven, you can add the following code to your project's pom.xml file to import the JAR file into your application.

Package Manager

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.ocr</artifactId>
        <version>1.9.0</version>
    </dependency>
</dependencies>

Please download the other dependencies based on your operating system:

Install Dependencies

Step 1: Create a Java project in IntelliJ IDEA.

How to Scan and Recognize Text from Images in Java Projects

Step 2: Go to File > Project Structure > Modules > Dependencies in the menu and add Spire.OCR.jar as a project dependency.

How to Scan and Recognize Text from Images in Java Projects

Step 3: Download and extract the other dependency files. Copy all the files from the extracted "dependencies" folder to your project directory.

How to Scan and Recognize Text from Images in Java Projects

Scanning and Recognizing Text from a Local Image

Java

import com.spire.ocr.OcrScanner;
import java.io.*;

public class ScanLocalImage {
    public static void main(String[] args) throws Exception {
        // Specify the path to the dependency files
        String dependencies = "dependencies/";
        // Specify the path to the image file to be scanned
        String imageFile = "data/Sample.png";
        // Specify the path to the output file
        String outputFile = "ScanLocalImage_out.txt";
        
        // Create an OcrScanner object
        OcrScanner scanner = new OcrScanner();
        // Set the dependency file path for the OcrScanner object
        scanner.setDependencies(dependencies);
        
        // Use the OcrScanner object to scan the specified image file
        scanner.scan(imageFile);
        
        // Get the scanned text content
        String scannedText = scanner.getText().toString();
        
        // Create an output file object
        File output = new File(outputFile);
        // If the output file already exists, delete it
        if (output.exists()) {
            output.delete();
        }
        // Create a BufferedWriter object to write content to the output file
        BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile));
        // Write the scanned text content to the output file
        writer.write(scannedText);
        // Close the BufferedWriter object to release resources
        writer.close();
    }
}

Specify the Language File to Scan and Recognize Text from an Image

Java

import com.spire.ocr.OcrScanner;
import java.io.*;

public class ScanImageWithLanguageSelection {
    public static void main(String[] args) throws Exception {
        // Specify the path to the dependency files
        String dependencies = "dependencies/";
        // Specify the path to the language file
        String languageFile = "data/japandata";
        // Specify the path to the image file to be scanned
        String imageFile = "data/JapaneseSample.png";
        // Specify the path to the output file
        String outputFile = "ScanImageWithLanguageSelection_out.txt";
        
        // Create an OcrScanner object
        OcrScanner scanner = new OcrScanner();
        // Set the dependency file path for the OcrScanner object
        scanner.setDependencies(dependencies);
        // Load the specified language file
        scanner.loadLanguageFile(languageFile);
        
        // Use the OcrScanner object to scan the specified image file
        scanner.scan(imageFile);
        // Get the scanned text content
        String scannedText = scanner.getText().toString();

        // Create an output file object
        File output = new File(outputFile);
        // If the output file already exists, delete it
        if (output.exists()) {
            output.delete();
        }

        // Create a BufferedWriter object to write content to the output file
        BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile));
        // Write the scanned text content to the output file
        writer.write(scannedText);
        // Close the BufferedWriter object to release resources
        writer.close();
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Recognize Text