Spire.PDF is a professional PDF library for creating, writing, editing, handling and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, Python.

Fri Oct 13, 2023 2:31 pm

I have attached a PDF document. When I try to convert it to text, the conversion breaks at a random place.

YordanY1
 
Posts: 2
Joined: Fri Oct 13, 2023 2:18 pm

Mon Oct 16, 2023 3:31 am

Hi,

Thank you for your inquiry.
I created a console project (.NET Framework 4.8) to test the file you provided with the latest version, Spire.PDF for .NET 9.10.2, but I could not reproduce your issue. Are you using the latest version of Spire.PDF for .NET? If not, please download it from this link (https://www.e-iceblue.com/Download/download-pdf-for-net-now.html) and retest.
I have put the complete code below for your reference:
Code:
//Namespaces required at the top of the file
using System.IO;
using Spire.Pdf;
using Spire.Pdf.Texts;

//Create a PdfDocument object
PdfDocument doc = new PdfDocument();

//Load PDF file
doc.LoadFromFile(@"E:\34831\metro\metro.pdf");

//Select a page (the index is zero-based, so Pages[1] is the second page)
PdfPageBase page = doc.Pages[1];

//Create a PdfTextExtractor object
PdfTextExtractor textExtractor = new PdfTextExtractor(page);

//Create a PdfTextExtractOptions object
PdfTextExtractOptions extractOptions = new PdfTextExtractOptions();

//Set IsExtractAllText to true
extractOptions.IsExtractAllText = true;

//Extract text from the selected page
string text = textExtractor.ExtractText(extractOptions);

//Write the extracted text to a TXT file
File.WriteAllText(@"E:\34831\metro\ToText.txt", text);
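
The snippet above reads a single page (doc.Pages[1], which is the second page because the index is zero-based). If text from the whole document is wanted, looping over every page is one option. The following is only a minimal sketch under that assumption: it reuses the same PdfTextExtractor and PdfTextExtractOptions calls shown above, and the class name and both file names are placeholders that do not come from this thread.

```csharp
using System.IO;
using System.Text;
using Spire.Pdf;
using Spire.Pdf.Texts;

class ExtractAllPages
{
    static void Main()
    {
        // Placeholder paths (assumptions for illustration only)
        const string inputPath = @"input.pdf";
        const string outputPath = @"AllPages.txt";

        // Load the PDF file
        PdfDocument doc = new PdfDocument();
        doc.LoadFromFile(inputPath);

        // Same extraction options as in the single-page code
        PdfTextExtractOptions extractOptions = new PdfTextExtractOptions();
        extractOptions.IsExtractAllText = true;

        // Extract each page in order and collect the text
        StringBuilder allText = new StringBuilder();
        foreach (PdfPageBase page in doc.Pages)
        {
            PdfTextExtractor textExtractor = new PdfTextExtractor(page);
            allText.AppendLine(textExtractor.ExtractText(extractOptions));
        }

        // Write the combined text to a TXT file
        File.WriteAllText(outputPath, allText.ToString());
        doc.Close();
    }
}
```

If the output still looks incomplete with this approach, comparing it page by page against the original PDF should show which page is losing text.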

If the above doesn't help, please provide the following information so that we can investigate further. Thank you in advance.
1) Your full test code that can reproduce your issue.
2) Application type, such as Console App, .NET Framework 4.8.
3) Your test environment, such as OS info (e.g. Windows 7, 64-bit).

Sincerely,
Ula
E-iceblue support team

Ula.wang
 
Posts: 282
Joined: Mon Aug 07, 2023 1:38 am

Mon Oct 16, 2023 12:28 pm

Hello. Well, it does extract text, but not the whole thing.

YordanY1
 
Posts: 2
Joined: Fri Oct 13, 2023 2:18 pm

Tue Oct 17, 2023 1:44 am

Hi,

Thanks for your feedback.
I carefully compared my output document with the original document you provided and found that all the text was extracted. I have attached my result document for your reference. Could you provide your output document and point out which specific information was not extracted? You can attach it here or send it to us via email (support@e-iceblue.com). Thank you in advance.

Sincerely,
Ula
E-iceblue support team

Ula.wang
 
Posts: 282
Joined: Mon Aug 07, 2023 1:38 am

Tue Nov 07, 2023 3:13 am

Hello,

Have you solved your problem? Please let us know the outcome; your feedback is very important to us.
Please feel free to contact us if you have any further problems.

Best Regards,
Ula
E-iceblue support team

Ula.wang
 
Posts: 282
Joined: Mon Aug 07, 2023 1:38 am
