Hi
I am reading the text of PDF invoices to extract various pieces of information (e.g. customer, ID, amount) and am currently trying out different variants.
What exactly does PdfTextExtractOptions.IsExtractAllText do?
Regards,
Peter
using System;
using System.IO;
using Spire.Pdf;
using Spire.Pdf.Texts;

String input = @"..\..\..\..\..\..\Data\PDFTemplate-Az.pdf";
PdfDocument doc = new PdfDocument();
// Read a pdf file
doc.LoadFromFile(input);
// Get the first page
PdfPageBase page = doc.Pages[0];
// Extract text from page keeping white space
PdfTextExtractOptions options = new PdfTextExtractOptions();
options.IsExtractAllText = true; // false -> extract text from page without keeping white space
PdfTextExtractor pdfTextExtractor = new PdfTextExtractor(page);
String text = pdfTextExtractor.ExtractText(options);
String result = Path.GetFullPath("ExtractTextFromParticularPage_out.txt");
// Create a writer for the extracted text; the using block flushes and closes the file
using (TextWriter tw = new StreamWriter(result))
{
    // Write the extracted text to the file
    tw.WriteLine(text);
}