Spire.XLS is a professional Excel API that enables developers to create, manage, manipulate, convert and print Excel worksheets. Get free and professional technical support for Spire.XLS for .NET, Java, Android, C++, Python.

Tue May 02, 2023 2:11 pm

Hi team,

I have an Excel file with 6 data sheets, in each sheet, I wanna remove rows based on data of one column. About 30K data and approximately 15k (half) data are to be deleted within each sheet.

I used tips from the following topic to delete rows deleting-a-row-that-contains-specific-string-t9156.html which works fine functionally. However, performance is a problem for me.
I put a stopwatch to see the time consumption of the DeleteRow method, which took ~100 milliseconds on average.
Therefore for 1 sheet, 15k data, is about 25 mins.

Do you have any recommendations that can speed up this operation? Or is there another good idea to achieve the goal?

Thanks in advance

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Wed May 03, 2023 2:27 am

Hi,

Thanks for your feedback.
I simulated a 1k rows Excel file and deleted them, using the code you mentioned, the 1048 rows data are deleted within 300 milliseconds, I have attached the screenshots for your reference.
test.jpg
test.jpg (93.84 KiB) Viewed 1844 times


Are you using the latest Spire.XLS (13.4.0)? if not, you can upgrade and have a test. If the problem still exists, please share us with the following messages. Thanks for your assistance, you can send them to us via email (support@e-iceblue.com) or attach them here.
1) Your test document.
2) Your application type, such as Console App, .NET Framework 4.8.
3) Your device info, such as (Windows 10 64 bit, 16G memory).

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Wed May 03, 2023 10:37 am

Hello,

Thanks for your prompt reply!
I'm using the following code logic to filter and remove rows.
The filtering logic spent 76ms which gathered ~16.8k rows out of ~26k rows, after that it took ~1736345 ms (almost 29 mins) to delete rows.
To answer your questions:
1. I will share a sample file through email
2. It's a .NET 7 based console application
3. My laptop configuration: 11th Gen Intel i7-1185G7 with 32GB memory

Code: Select all
public void RemoveDataOfOtherOptimizers(Workbook workbook, string targetOptimizer, string sheetName, int optimizerColumnRangeId)
        {
            var sheet = workbook.Worksheets[sheetName];
            var rowCount = sheet.Rows.Length;

            if (rowCount <= 1)
                return;

            IList<CellRange> celList = new List<CellRange>();

            var sw = new Stopwatch();
            sw.Start();

            for (var i = 2; i <= rowCount; i++)
            {
                var cell = sheet.Range[i, optimizerColumnRangeId];
                if (cell == null)
                    continue;
                var optimizer = cell.HasFormula ? cell.FormulaStringValue : cell.Value;
                if (!optimizer.Equals(targetOptimizer, StringComparison.InvariantCultureIgnoreCase))
                {
                    celList.Add(cell);
                }
            }

            var rowIndex = celList.Select(range => range.Row).ToList();
            var time = sw.ElapsedMilliseconds;

            sw.Restart();
            for (var i = 0; i < rowIndex.Count; i++)
            {               
                var itemIndex = rowIndex[i] - i;               
                sheet.DeleteRow(itemIndex);
               
            }
            time = sw.ElapsedMilliseconds;
            sw.Stop();
        }

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Wed May 03, 2023 10:41 am

BTW, above result was based on Spire.XLS (13.4.0)

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Thu May 04, 2023 9:43 am

Hi,

Thank you for reaching out to us regarding the slow deletion of rows when using our Spire.XLS. I tested this with the document and code you provided and reproduced your issue, the speed is slow.

We have recorded this issue in our issue tracking system with the ticket number SPIREXLS-4647, and our developers will investigate the root cause of the slow deletion. We will keep you updated on any progress made.

Thank you for bringing this to our attention. If you have any further concerns or questions, please don't hesitate to let us know.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Thu May 04, 2023 1:45 pm

Hi Triste,

Thanks for your quick confirmation.
I eagerly look forward to the fix as it's a blocker issue in front of me.

Thanks & Bes regards,
Jun

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Fri May 05, 2023 3:02 am

Hi,

Thanks for your feedback.
Our developers have started investigating this issue, please rest assured that we are working on resolving this problem as soon as possible. We will keep you updated on any progress made. Thank you for your patience.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Mon May 08, 2023 3:44 am

Hi,

Thanks for your patience.
Our development team has investigated and found that the cause of the slow speed is due to the update of all cell formulas, shape positions, conditional formats, etc. for each row deleted.
We recommend replacing the "sheet.DeleteRow(itemIndex);" code in your application with the following code snippet:

Code: Select all
    for (var i = 0; i < rowIndex.Count; i++)
    {
        var itemIndex = rowIndex[i] - i;
        int count = 1;
        for (int j = i + 1; j < rowIndex.Count; j++)
        {
            if (itemIndex == rowIndex[j] - j)
                count++;
            else
            {
                i = j - 1;
                break;
            }
        }
        sheet.DeleteRow(itemIndex, count);

    }

This code will merge adjacent rows that need to be deleted and call the DeleteRow method with the parameters to delete multiple rows at once, reducing the number of updates needed and improving the speed of execution.
Thank you for using our product. Please let us know if you have any further questions or concerns.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Thu May 11, 2023 12:54 pm

Hi Triste,

Sorry for my late response, just getting better from sick leave.
Thanks for your feedback with a detailed solution.

I used it to test the sample file "20230502 - DeleteRow - Sample.xlsx" which I shared in the previous session.
My case is to keep only "王二博" in all sheets and remove the data of other members.

The result comparison:

Before
>"大航海T+1收入" (16881 of 28624 rows to be deleted) - I tried only for the first sheet "大航海T+1收入", which spent around 25 mins
>"大航海T+8收入" (23871 of 27633 rows to be deleted)- I did not try as it's too time-consuming

After (with the new logic)
>"大航海T+1收入"- It spent about 4.5 mins. Impressive. :lol:
>"大航海T+8收入" - I spent about 30 mins. Still slow :shock:

It seems the new logic can improve a bit the performance but has not eradicated the problem.
I guess it's due to the adjacent rows' location and the row count.

BTW, as you mentioned
The cause of the slow speed is due to the update of all cell formulas

I tried to set the CalculationMode mode to ExcelCalculationMode.Manual. Unfortunately, it's getting worse.

Thanks.
BR,
Jun

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Fri May 12, 2023 3:15 am

Hi,

We appreciate your feedback and have investigated the matter with our development team.
Please be assured that we have forwarded this information to our development team for further investigation. They will review our internal source code to determine if any further optimization is necessary.

We apologize for any inconvenience caused and assure you that we are committed to resolving this issue as soon as possible. We will keep you informed of any progress made in this regard.

Thank you for your patience and understanding.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Tue May 23, 2023 1:20 am

Hi Team,

Any update on this topic?
Thanks in advance

BR,
Jun

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Tue May 23, 2023 5:36 am

Hi,

Thank you for your inquiry regarding the progress of your issue. We are pleased to inform you that our development team has provided a new interface for deleting rows, and the issue has been submitted for further action.
However, we would like to inform you that the speed of the deletion process has not yet been tested. Rest assured that we are fully committed to resolving your issue as quickly and efficiently as possible.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Tue May 23, 2023 6:36 am

Hi Triste,

Thanks for your quick feedback.
I'm delighted to hear that and cannot wait to integrate with the new interface. :lol:
Look forward to your further update.

BR,
Jun

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Tue May 23, 2023 8:57 am

Hi,

Thanks for your feedback.
If the speed of deleting rows is significantly optimized, we will inform you of the new version. Thanks for your patience and understanding.
Please feel free to contact us if you have any further questions or concerns.

Best regards,
Triste
E-iceblue support team
User avatar

Triste.Dai
 
Posts: 1000
Joined: Tue Nov 15, 2022 3:59 am

Thu Jun 01, 2023 9:50 am

Hi Triste,

I just tested the newly added interface public void DeleteRows(int[] rows) in Spire.XLS 13.5.6.
Unfortunately, I met an issue that it did not remove all the target rows in int[] rows.

I was still using the sample file 20230502 - DeleteRow - Sample.xlsx I shared with you before.
The test case is keeping only "王二博" and removes data of other members in sheet "大航海T+1收入".

Thanks to the new interface, code logic becomes much more straightforward as below.

Code: Select all
            IList<int> rowList = new List<int>();

            //Skip the header row
            for (var i = 2; i <= rowCount; i++)
            {
                var cell = sheet.Range[i, optimizerColumnRangeId];
                if (cell == null)
                    continue;
                var optimizer = cell.HasFormula ? cell.FormulaStringValue : cell.Value;
                if (!optimizer.Equals(targetOptimizer, StringComparison.InvariantCultureIgnoreCase) &&
                    !optimizer.Equals(targetOptimizer + ReportConsts.MergedPostfix, StringComparison.InvariantCultureIgnoreCase))
                {
                    rowList.Add(cell.Row);
                }
            }

            var rowsToBeDeleted = rowList.ToArray();

            sheet.DeleteRows(rowsToBeDeleted);



Thanks,
BR,
Jun

Superego
 
Posts: 13
Joined: Fri Oct 09, 2020 5:18 am

Return to Spire.XLS