Spire.PDF is a professional PDF library applied to creating, writing, editing, handling and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, Python.

Thu Aug 22, 2024 12:34 pm

Hello,
I am currently testing Spire.PDF and am wondering whether it is possible to copy text from one PDF page to another PDF, keeping the exact coordinates, the same font, etc.

I imagine using the information from “textfrag” to create the text in the other PDF:

Code:
List<PdfTextFragment> frags = finder.FindAllText();
foreach (PdfTextFragment textfrag in frags)
{
    // add textfrag to another pdf
}

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Fri Aug 23, 2024 8:47 am

Hello,

Thank you for your inquiry. You can refer to the following sample code to copy the text into another PDF document. If any issue occurs when testing your files, please provide us with your input document and the expected result for further investigation. You can upload them here or send them to us via email (support@e-iceblue.com). Thank you in advance.
Code:
using System.Collections.Generic;
using System.Drawing;
using Spire.Pdf;
using Spire.Pdf.Graphics;
using Spire.Pdf.Texts;

// Load the source document and get its first page
PdfDocument oldPdf = new PdfDocument();
oldPdf.LoadFromFile(@"in.pdf");
PdfPageBase page = oldPdf.Pages[0];

// Create the target document with a margin-free page of the same size
PdfDocument newPdf = new PdfDocument();
SizeF size = page.Size;
PdfPageBase newPage = newPdf.Pages.Insert(0, size, new PdfMargins(0, 0));

// Create a PdfTextFinder that matches whole words only
PdfTextFindOptions findOptions = new PdfTextFindOptions();
findOptions.Parameter = TextFindParameter.WholeWord;
PdfTextFinder finder = new PdfTextFinder(page);
finder.Options = findOptions;

// Find the specified text
List<PdfTextFragment> frags = finder.Find("Sample test");
foreach (PdfTextFragment textfrag in frags)
{
    // Read the text, position and font properties of the fragment
    String content = textfrag.Text;
    RectangleF rectangleF = textfrag.Bounds[0];
    String fontName = textfrag.TextStates[0].FontFamily;
    Boolean isBold = textfrag.TextStates[0].IsBold;
    float fontSize = textfrag.TextStates[0].FontSize;
    Color color = textfrag.TextStates[0].ForegroundColor;
    FontStyle style = isBold ? FontStyle.Bold : FontStyle.Regular;

    // Draw the text at the same position with the same font and color
    newPage.Canvas.DrawString(content, new PdfTrueTypeFont(new Font(fontName, fontSize, style), true), new PdfSolidBrush(color), new RectangleF(rectangleF.X, rectangleF.Y, page.Canvas.ClientSize.Width, page.Canvas.ClientSize.Height));
}
newPdf.SaveToFile(@"out.pdf");


Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am

Fri Aug 23, 2024 9:57 am

That looks very promising, thank you very much!

One last small question: How can you make the new text invisible?

Edit: Oh, I think "newPage.Canvas.SetTransparency(0);" should work :D

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Mon Aug 26, 2024 9:54 am

Hello,

Thank you for your inquiry. I tested the 'newPage.Canvas.SetTransparency(0f)' call you mentioned and can confirm that it works: add this setting before drawing the text, and the content you draw will be invisible. If you have any other questions, please feel free to write to me.

Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am

Tue Aug 27, 2024 9:37 am

Hello,
I have now tested with several documents. In the attached document, the copied text is placed in portrait orientation rather than in landscape orientation like the document itself.
With other landscape documents the text is displayed correctly, so the problem seems to be specific to this document. Can this be handled in the code?

Code:
 string org_ocr = "ocr.pdf";
 string org_non_ocr = "original.pdf";
 string final = "spire_copy.pdf";

 PdfDocument oldPdf = new PdfDocument();
 oldPdf.LoadFromFile(org_ocr);

 PdfDocument nonOCRPdf = new PdfDocument();
 nonOCRPdf.LoadFromFile(org_non_ocr);

 //Get the first pages
 PdfPageBase page = oldPdf.Pages[0];       
 PdfPageBase newPage = nonOCRPdf.Pages[0];       

 PdfTextFindOptions findOptions = new PdfTextFindOptions();
 findOptions.Parameter = TextFindParameter.WholeWord;

 //Create an object for PdfTextFinder
 PdfTextFinder finder = new PdfTextFinder(page);
 finder.Options = findOptions;
 //Finding the specified text
 List<PdfTextFragment> frags = finder.FindAllText();
 if (frags.Count > 0)
 {
     foreach (PdfTextFragment textfrag in frags)
     {
         String content = textfrag.Text;
         RectangleF rectangleF = textfrag.Bounds[0];
         String fontName = textfrag.TextStates[0].FontFamily;
         Boolean isBold = textfrag.TextStates[0].IsBold;
         float fontSize = textfrag.TextStates[0].FontSize;
         //Color color = textfrag.TextStates[0].ForegroundColor;
         //Color color = Color.FromArgb(255, Color.Empty);
         newPage.Canvas.SetTransparency(0);
         newPage.Canvas.DrawString(content, new PdfTrueTypeFont(new Font(fontName, fontSize, FontStyle.Regular), true), PdfBrushes.Transparent, new RectangleF(rectangleF.X, rectangleF.Y, page.Canvas.ClientSize.Width, page.Canvas.ClientSize.Height));
 
     }             
 }
 nonOCRPdf.SaveToFile(final);

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Wed Aug 28, 2024 9:03 am

Hello,

Thank you for your inquiry. I have reviewed your documents and found that 'original.pdf' has a 90-degree rotation. I have adjusted your code to make it work properly. If you have any questions, please feel free to contact us.

Code:
PdfDocument oldPdf = new PdfDocument();
oldPdf.LoadFromFile(org_ocr);

PdfDocument nonOCRPdf = new PdfDocument();
nonOCRPdf.LoadFromFile(org_non_ocr);

//Get the first pages
PdfPageBase page = oldPdf.Pages[0];
PdfPageBase newPage = nonOCRPdf.Pages[0];
if (newPage.Rotation == PdfPageRotateAngle.RotateAngle90)
{
    newPage.Canvas.RotateTransform(-90);
    newPage.Canvas.TranslateTransform(-page.Canvas.Size.Width, 0);
    newPage.Canvas.Restore();
}
PdfTextFindOptions findOptions = new PdfTextFindOptions();
findOptions.Parameter = TextFindParameter.WholeWord;

//Create an object for PdfTextFinder
PdfTextFinder finder = new PdfTextFinder(page);
finder.Options = findOptions;
//Finding the specified text
List<PdfTextFragment> frags = finder.FindAllText();
if (frags.Count > 0)
{
    foreach (PdfTextFragment textfrag in frags)
    {
        String content = textfrag.Text;
        RectangleF rectangleF = textfrag.Bounds[0];
        String fontName = textfrag.TextStates[0].FontFamily;
        Boolean isBold = textfrag.TextStates[0].IsBold;
        float fontSize = textfrag.TextStates[0].FontSize;
        //Color color = textfrag.TextStates[0].ForegroundColor;
        //Color color = Color.FromArgb(255, Color.Empty);
        //  newPage.Canvas.SetTransparency(0);
        newPage.Canvas.DrawString(content, new PdfTrueTypeFont(new Font(fontName, fontSize, FontStyle.Regular), true), PdfBrushes.Red, new RectangleF(rectangleF.X, rectangleF.Y, page.Canvas.ClientSize.Width, page.Canvas.ClientSize.Height));
    }
}
nonOCRPdf.SaveToFile(final);


Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am

Wed Sep 11, 2024 10:57 am

Thanks, that works :D

Another question about the free version of Spire.PDF: I wanted to test the code with the free version, but it seems that
Code:
  textfrag.TextStates[0].FontFamily;
  textfrag.TextStates[0].IsBold;
  textfrag.TextStates[0].FontSize;

...does not exist here, is this correct?

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Thu Sep 12, 2024 7:28 am

Hello,

Thanks for your inquiry. Please note that the properties for obtaining the font format of the found text were added in our commercial version (from 10.4.7). The free version is not updated or maintained by us on a regular basis, so we recommend using the commercial version, as it includes more features and fixes. If you have any other questions, please feel free to write to me.

Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am

Tue Nov 05, 2024 2:20 pm

Hello,
We are still testing the feature of copying text from one PDF to another.

We still have two open points:

1. In some PDFs, words are split by single blanks, for example in the attached example PDFs (example.zip):
original (original_ocr.pdf): "laufender Betrieb der virtuellen Clients/Desktops"
copied text (spire_copy_ocr.pdf): "l auf ender Bet rieb der virt uell en Client s/Deskt ops"

As a workaround, I can increase the font size slightly so that the letters move a little closer together. The result is better, but as a side effect some of the “real” blank spaces disappear. Is there a better solution?

Code used for the workaround: fontSize = fontSize + 1.5f;

2. We still have problems with page rotations. An example PDF (unfortunately a customer PDF that I cannot provide) has a rotation of 270. The copied text is not displayed correctly in this PDF. What should the code look like (based on the “RotateAngle90” example above, which works well)?


Another question:
Code:
PdfTextFinder finder = new PdfTextFinder(page);
findOptions.Parameter = TextFindParameter.WholeWord;
finder.Options = findOptions;
 List<PdfTextFragment> frags = finder.FindAllText();


Here, “frags” seems to contain only single letters. I would have expected whole words to be found (parameter “WholeWord”). Or am I misunderstanding the parameter?

Thank you!

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Tue Nov 05, 2024 2:22 pm

Here are the files.

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Wed Nov 06, 2024 10:29 am

Hello,

Thanks for your inquiry. Regarding question 1, I reproduced the problem using the PDF you provided. The issue has been logged in our bug tracking system under the number SPIREPDF-7174. Our Dev team will investigate it further, and we will let you know once there is an update.
Regarding question 2, do you mean copying content into an original PDF document with a rotation angle of 270? If you provide us with an example PDF document and tell us which text you want to copy, we can provide you with a corresponding example.
Finally, the FindAllText() method finds all the text on the page and does not need the additional WholeWord parameter.
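
If FindAllText() returns fragments at letter granularity, one option is to group neighbouring fragments into words before drawing them. The following is only a minimal sketch in plain C# and not part of the Spire.PDF API: FragmentMerger, MergeIntoWords and the gapFactor threshold are hypothetical, and the input is assumed to be (Text, Bounds[0]) pairs in reading order. Whether drawing whole words avoids the spacing problem tracked as SPIREPDF-7174 has not been verified.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper, not part of Spire.PDF.
public static class FragmentMerger
{
    // Joins consecutive letter-level fragments into words.
    // Assumption: fragments arrive in reading order, and a horizontal gap of
    // more than gapFactor * glyph height marks a word boundary.
    public static List<(string Text, RectangleF Bounds)> MergeIntoWords(
        IEnumerable<(string Text, RectangleF Bounds)> glyphs, float gapFactor = 0.25f)
    {
        var words = new List<(string Text, RectangleF Bounds)>();
        string current = "";
        RectangleF box = RectangleF.Empty;

        foreach (var (text, bounds) in glyphs)
        {
            bool isBlank = string.IsNullOrWhiteSpace(text);
            // Same line: the top edges differ by less than half a glyph height
            bool sameLine = current.Length > 0
                && Math.Abs(bounds.Y - box.Y) < bounds.Height * 0.5f;
            // Close: the fragment starts right after the word collected so far
            bool close = sameLine
                && bounds.Left - box.Right <= bounds.Height * gapFactor;

            // A blank, a line change or a wide gap ends the current word
            if (current.Length > 0 && (isBlank || !close))
            {
                words.Add((current, box));
                current = "";
            }
            if (isBlank) continue;

            // Start a new word or extend the current one
            box = current.Length == 0 ? bounds : RectangleF.Union(box, bounds);
            current += text;
        }
        if (current.Length > 0) words.Add((current, box));
        return words;
    }
}
```

Each returned word carries the union of its letters' rectangles, so it could be passed to a single DrawString call in place of the individual letters. The gapFactor value would need tuning per document.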

Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am

Thu Nov 07, 2024 9:04 am

Thank you for your answers.

1. Can you estimate how long it will take to fix the bug?

2. I was able to create an example PDF (rotate.zip).
rotation_org.pdf was “scanned” in landscape format and then rotated by 270 degrees. The copied text is placed incorrectly here (see rotation_spire.pdf).
Is there perhaps a flexible solution that handles all possible rotations of the source document (90/180/270)?

BennyVAL
 
Posts: 7
Joined: Thu Aug 22, 2024 12:27 pm

Fri Nov 08, 2024 10:12 am

Hello,

Thanks for your feedback. There is currently no exact fix date for SPIREPDF-7174. However, I have raised the priority of this issue and will notify you as soon as there is an update.
For documents with a rotation angle of 270 degrees, please refer to the following code, which rotates and translates the coordinate axes. I still need time to study PDF documents with a rotation angle of 180 degrees, and I will notify you as soon as I have results. Thank you for your understanding.
Code:
PdfDocument oldPdf = new PdfDocument();
oldPdf.LoadFromFile(@"F:\forum-attachment\11.8\rotate\in.pdf");

PdfDocument newPdf = new PdfDocument();
newPdf.LoadFromFile(@"F:\forum-attachment\11.8\rotate\rotation_spire.pdf");

//Get the first pages
PdfPageBase page = oldPdf.Pages[0];
PdfPageBase newPage = newPdf.Pages[0];
Console.WriteLine(newPage.Rotation);
if (newPage.Rotation == PdfPageRotateAngle.RotateAngle270)
{
    newPage.Canvas.RotateTransform(90);
    newPage.Canvas.TranslateTransform(0, -newPage.Size.Width);
    newPage.Canvas.Restore();
}
// ...
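
For a variant that covers 90, 180 and 270 degrees in one place, an alternative to transforming the canvas is to convert each fragment rectangle. The sketch below is pure geometry and does not call Spire.PDF: RotationMapper and ToUnrotated are hypothetical names. It assumes the input rectangle is given in the page as displayed (after the clockwise page rotation, origin at the top left) and the result is wanted in the unrotated page (also origin at the top left). Which of the two spaces Spire.PDF actually reports and draws in has not been verified here, and the drawn text itself would still need to be rotated to match.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of Spire.PDF.
public static class RotationMapper
{
    // Maps a rectangle from the displayed (rotated) page to the unrotated page.
    // rotation is the clockwise page rotation in degrees (0, 90, 180 or 270);
    // unrotatedPage is the page size before the rotation is applied.
    public static RectangleF ToUnrotated(RectangleF r, SizeF unrotatedPage, int rotation)
    {
        float w = unrotatedPage.Width;
        float h = unrotatedPage.Height;
        switch (((rotation % 360) + 360) % 360)
        {
            case 0:
                return r;
            case 90:
                // Displayed x runs along the unrotated y axis, reversed
                return new RectangleF(r.Y, h - r.X - r.Width, r.Height, r.Width);
            case 180:
                // Both axes are reversed, width and height keep their roles
                return new RectangleF(w - r.X - r.Width, h - r.Y - r.Height, r.Width, r.Height);
            case 270:
                // Displayed y runs along the unrotated x axis, reversed
                return new RectangleF(w - r.Y - r.Height, r.X, r.Height, r.Width);
            default:
                throw new ArgumentException("Rotation must be a multiple of 90 degrees.", nameof(rotation));
        }
    }
}
```

For example, on an unrotated 100 x 200 page with a rotation of 270, a 10 x 20 rectangle at the top-left corner of the displayed page maps to a 20 x 10 rectangle at the top-right corner of the unrotated page.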

Sincerely,
Amin
E-iceblue support team

Amin.Gan
 
Posts: 277
Joined: Mon Jul 15, 2024 5:40 am
