SoFunction
Updated on 2025-05-09

C# real-life guide to implementing high-performance Excel million-dollar data export optimization

Excel data export is a common requirement in daily work.

However, when the amount of data is large, performance and memory problems often become bottlenecks that limit export efficiency.

When a user clicks the "Export" button, the background system often falls into a triple dilemma:

‌Memory Black Hole‌: When an e-commerce platform exported millions of orders, the heap memory exceeded 4GB due to the traditional POI solution, which frequently triggered Full GC, and eventually triggered a service avalanche;

‌Time vortex‌: It took 45 minutes for a certain logistics system to export 500,000 waybills, and the user tried many times, resulting in the exhaustion of the database connection pool;

‌Disk Storm‌: A financial platform exports transaction records to generate 1.2GB files, and the server disk IO soars to 100%;

We compare and implement C# high-performance Excel export solutions of EPPlus, MiniExcel and NPOI to study how to improve export efficiency.

1. Comparison of technical solutions

characteristic ‌EPPlus‌ ‌MiniExcel‌ ‌NPOI‌
Processing Model DOM SAX Streaming DOM/streaming mix
Memory usage (1 million rows) 1.2GB 180MB 850MB
File format support .xlsx .xlsx/.csv .xls/.xlsx
Formula calculation support Not supported Partial support
Template Engine built-in Template syntax Need to expand
Asynchronous support limited Fully support Not supported
NuGet installation volume 120 million+ 8 million+ 230 million+

2. Selection suggestions for each plan

‌Scene‌ Recommended plan ‌Example Code Features‌
Simple data export MiniExcel Streaming Write useSaveAsAsync+ Block Generator
Complex format reports EPPlus template engine Style predefined + segmented save
Old version of Excel compatible NPOI Streaming Write useSXSSFWorkbook
Mixed demand MiniExcel + EPPlus combination Template separation + data streaming fill
Extra large data volume (10 million) Shard writing + parallel processing Multi-task shards + final merge

3. Performance comparison data

Test items‌ EPPlus MiniExcel NPOI
1 million lines of writing time 42s 18s 65s
Memory peak value 1.1GB 190MB 820MB
File size 86MB 68MB 105MB
GC Pause Time 1.4s 0.2s 2.1s
Thread resource usage high Low middle

4. Core code implementation

1. MiniExcel Streaming Write (recommended solution)

// Configure optimization parametersvar config = new OpenXmlConfiguration
{
    EnableSharedStrings = false, // Close the shared string table    AutoFilterMode = , // Disable automatic filtering    FillMergedCells = false // No merged cells are processed};

// Pagination streaming writingawait ("", GetDataChunks(), configuration: config);

IEnumerable<IDictionary<string, object>> GetDataChunks()
{
    var pageSize = 50000;
    for (int page = 0; ; page++)
    {
        var data = QueryDatabase(page * pageSize, pageSize);
        if (!()) yield break;
        
        foreach (var item in data)
        {
            yield return new Dictionary<string, object>
            {
                ["ID"] = ,
                ["Name"] = ,
                ["CreateTime"] = ("yyyy-MM-dd")
            };
        }
    }
}

Optimization points:

  • Loading database data on pages
  • Delay loading data generator
  • Turn off non-essential features

2. EPPlus hybrid writing scheme

using (var package = new ExcelPackage())
{
    var sheet = ("Data");
    int row = 1;

    // Batch write header information    ["A1:C1"].LoadFromArrays(new[] { new[] { "ID", "Name", "CreateTime" } });

    // Block write (save once every 50,000 lines)    foreach (var chunk in GetDataChunks(50000))
    {
        [row+1, 1].LoadFromCollection(chunk);
        row += ;
        
        if (row % 50000 == 0)
        {
            (); // Save in segments            ();
        }
    }
    
    (new FileInfo("output_epplus.xlsx"));
}

3. Performance comparison test code

[]
public class ExcelBenchmarks
{
    private List<DataModel> _testData = GenerateTestData(1_000_000);

    [Benchmark]
    public void MiniExcelExport() => ("", _testData);

    [Benchmark]
    public void EPPlusExport() 
    {
        using var pkg = new ExcelPackage();
        var sheet = ("Data");
        (_testData);
        ("");
    }

    [Benchmark]
    public void NPOIExport()
    {
        var workbook = new XSSFWorkbook();
        var sheet = ("Data");
        for (int i = 0; i < _testData.Count; i++)
        {
            var row = (i);
            (0).SetCellValue(_testData[i].Id);
            (1).SetCellValue(_testData[i].Name);
        }
        using var fs = new FileStream("", );
        (fs);
    }
}

5. Implementation of hybrid solutions

1. EPPlus + MiniExcel Combination Solution

// Create a styled template with EPPlus firstusing (var pkg = new ExcelPackage(new FileInfo("")))
{
    var sheet = [0];
    ["A1"].Value = "Dynamic Report";
    ();
}

// Use MiniExcel to fill in large data volumevar data = GetBigData();
("", "", data);

2. Sharding asynchronous export scheme

public async Task ExportShardedDataAsync()
{
    var totalRecords = 5_000_000;
    var shardSize = 100_000;
    var shards = totalRecords / shardSize;

    var tasks = new List<Task>();
    for (int i = 0; i < shards; i++)
    {
        var start = i * shardSize;
        ((async () => 
        {
            using var stream = new FileStream($"shard_{i}.xlsx", );
            await (stream, QueryData(start, shardSize));
        }));
    }

    await (tasks);
    MergeShardFiles(shards);
}

private void MergeShardFiles(int shardCount)
{
    using var merger = new ExcelPackage();
    var mergedSheet = ("Data");
    
    int row = 1;
    for (int i = 0; i < shardCount; i++)
    {
        var shardData = ($"shard_{i}.xlsx");
        [row, 1].LoadFromDictionaries(shardData);
        row += ();
    }
    
    (new FileInfo(""));
}

6. Advanced optimization strategy

1. Memory management configuration

// Global configuration("", true); // Enable shared array pool
// Runtime configuration (){
  "configProperties": {
    "": "0x100000000", // 4GB memory limit    "": "70",
    "": true
  }
}

2. Database optimization

// Dapper pagination optimizationpublic IEnumerable&lt;DataModel&gt; GetPagedData(long checkpoint, int size)
{
    return _conn.Query&lt;DataModel&gt;(
        @"SELECT Id, Name, CreateTime 
        FROM BigTable 
        WHERE Id &gt; @Checkpoint 
        ORDER BY Id 
        OFFSET 0 ROWS 
        FETCH NEXT @Size ROWS ONLY 
        OPTION (RECOMPILE)", // Force recompile execution plan        new { checkpoint, size });
}

3. Exception handling enhancement

try
{
    await ExportDataAsync();
}
catch (MiniExcelException ex) when ( == "DISK_FULL")
{
    await CleanTempFilesAsync();
    await RetryExportAsync();
}
catch (SqlException ex) when ( == 1205) // Retry deadlock{
    await (1000);
    await RetryExportAsync();
}
finally
{
    _semaphore.Release(); // Release the semaphore}

7. Summary of best practices

‌1. Data paging strategy

Use Ordered ID Paging to Avoid OFFSET Performance Attenuation

// Optimize pagination queryvar lastId = 0;
while (true)
{
    var data = Query($"SELECT * FROM Table WHERE Id &gt; {lastId} ORDER BY Id FETCH NEXT 50000 ROWS ONLY");
    if (!()) break;
    lastId = ().Id;
}

‌2. Memory control trinity‌

  • Enable server GC mode
  • Configure a shared array pool
  • Use object pool multiplexing DTO

3‌, Exception handling pyramid

try {
    // Core logic} 
catch (IOException ex) when (("Disk Space")) {
    // Disk exception handling}
catch (SqlException ex) when ( == 1205) {
    // Database deadlock processing}
catch (Exception ex) {
    // General exception handling}

8. Guide to avoid pits

Common Traps

‌EPPlus memory leak

// Error example: ExcelPackage not releasedvar pkg = new ExcelPackage(); // Must be wrapped in using("");

// Correct usageusing (var pkg = new ExcelPackage())
{
    // Operation code}

File locking for NPOI

// Error example: Resource not released correctlyvar workbook = new XSSFWorkbook();
// Correct usageusing (var fs = new FileStream("", ))
{
    (fs);
}

Exception handling best practices

try
{
    await ExportAsync();
}
catch (MiniExcelException ex) when ( == "DISK_FULL")
{
    _logger.LogError("Insufficient disk space: {Message}", );
    await CleanTempFilesAsync();
    throw new UserFriendlyException("Export failed, please contact the administrator");
}
catch (DbException ex) when ()
{
    _logger.LogWarning("Database temporary error, try again");
    await (1000);
    await RetryExportAsync();
}
finally
{
    _exportSemaphore.Release();
}

9. Typical scenario suggestions‌

  • ‌Financial Statements‌ → EPPlus (Complex Formula + Chart)
  • ‌Log Export‌ → MiniExcel (10 million streaming processing)
  • ‌Old System Migration‌ → NPOI (xls compatible)
  • ‌Dynamic Template‌ → MiniExcel Template Engine

Through reasonable solution selection and optimized configuration, it can be achieved:

  • ‌Memory consumption‌ reduced by more than 80%
  • ‌Export speed‌ is increased by 3-5 times
  • System stability is significantly enhanced

This is the article about C#’s practical guide to implementing high-performance Excel Million Data Export Optimization. For more related C# Excel data export content, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!