Write Large Array to CSV File in Node.js

In this article, I cover how to write a large array to a CSV file in Node.js. When writing large arrays to CSV files, it's crucial to prioritize memory efficiency and performance. The function below has been crafted for exactly this purpose, and we'll walk through several practical examples of its usage.

Function for Writing Large Array to CSV

const fs = require('fs');
const { Readable, Transform } = require('stream');

function writeToCsvFile(data, filename, headers = []) {
    return new Promise((resolve, reject) => {
        // Convert each object in the array into a CSV row string.
        const transform = new Transform({
            objectMode: true,
            transform(chunk, encoding, callback) {
                this.push(`${Object.values(chunk).join(',')}\n`);
                callback();
            }
        });

        const writeStream = fs.createWriteStream(filename);
        writeStream.on('finish', resolve);
        writeStream.on('error', reject);
        transform.on('error', reject);

        // Write the optional header row before the data starts flowing.
        if (headers.length > 0) {
            writeStream.write(`${headers.join(',')}\n`);
        }

        // Readable.from() turns the array into a readable stream, so
        // pipe() handles backpressure instead of buffering every item
        // in memory at once.
        Readable.from(data).pipe(transform).pipe(writeStream);
    });
}

async function exportToCsv() {
    const largeArray = [...Array(100000)].map((_, i) => ({ id: i, value: `Item ${i}` }));
    
    try {
        await writeToCsvFile(largeArray, 'largeData.csv', ['ID', 'Value']);
        console.log('The CSV file was written successfully.');
    } catch (e) {
        console.error('An error occurred while writing the CSV file.', e);
    }
}

exportToCsv();

In the code above, the writeToCsvFile function takes three arguments:

  1. data: The large array you want to write to the CSV file.
  2. filename: The path where the CSV file will be created.
  3. headers (optional): An array of strings representing the column headers; if omitted, no header row is written (see the short example after this list).
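
For instance, calling the function without the optional headers argument simply omits the header row, so the file starts with the first data row. A quick sketch:

// No headers argument: only data rows are written to the file.
const rows = [{ id: 1, value: 'alpha' }, { id: 2, value: 'beta' }];

writeToCsvFile(rows, 'rows-without-header.csv')
    .then(() => console.log('Wrote rows-without-header.csv'))
    .catch(err => console.error(err));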

This function handles the array efficiently by using Node.js's stream API. Readable.from() converts the array into a readable stream, and the Transform stream turns each chunk into a string formatted as a CSV row. We then create a writable stream to the target filename and listen for both the finish and error events.
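
One caveat worth noting: the Transform above joins the raw values with commas, so a field that itself contains a comma, a double quote, or a newline would break the row. If your data may contain such values, you can swap in an escaping helper; escapeCsvField below is a hypothetical name for a minimal sketch, not a complete CSV implementation:

// Quote a field per RFC 4180 if it contains a comma, a double quote,
// or a newline, doubling any embedded quotes.
function escapeCsvField(value) {
    const str = String(value == null ? '' : value);
    return /[",\n]/.test(str) ? `"${str.replace(/"/g, '""')}"` : str;
}

// Drop-in replacement for the row formatting inside the Transform:
// this.push(`${Object.values(chunk).map(escapeCsvField).join(',')}\n`);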

Finally, if any headers are provided, we write them to the file first, and then pipe every item in the array through the Transform stream directly into the file stream. Piping breaks the file writing into manageable chunks and respects backpressure, which prevents loading an excessive amount of data into memory at once. This is especially important when dealing with large datasets.
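
If you would rather let Node.js wire the streams together and propagate errors for you, stream.pipeline (promisified via stream/promises in Node 15 and later) is a common alternative to chaining pipe() calls by hand. The following is a sketch of the same idea under that assumption, with writeToCsvFileWithPipeline as an illustrative name:

const fs = require('fs');
const { Readable, Transform } = require('stream');
const { pipeline } = require('stream/promises');

// Same CSV conversion as above, but pipeline() handles backpressure,
// error propagation, and cleanup for all three streams.
async function writeToCsvFileWithPipeline(data, filename, headers = []) {
    const toCsvRow = new Transform({
        objectMode: true,
        transform(chunk, encoding, callback) {
            callback(null, `${Object.values(chunk).join(',')}\n`);
        }
    });

    const writeStream = fs.createWriteStream(filename);
    if (headers.length > 0) {
        writeStream.write(`${headers.join(',')}\n`);
    }

    await pipeline(Readable.from(data), toCsvRow, writeStream);
}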

Now, let’s see the writeToCsvFile function in action with a different example:

async function exportUsersToCsv(users) {
    try {
        await writeToCsvFile(users, 'users.csv', ['UserID', 'Username', 'Email']);
        console.log('The users CSV file was written successfully.');
    } catch (e) {
        console.error('An error occurred while writing the users CSV file.', e);
    }
}

// Suppose you have a large array of user objects
const usersArray = [
    { UserID: 1, Username: 'john_doe', Email: '[email protected]' },
    // ... many more user objects
];

exportUsersToCsv(usersArray);

In this second example, assume that usersArray represents a large dataset of user information. By invoking exportUsersToCsv, you write the array to a CSV file without excessive memory consumption, even when it contains thousands of user records.
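
The same pattern also works when the records never fit in memory as a single array in the first place, because Readable.from() accepts async iterables as well as arrays. A rough sketch, where fetchUserPage is a hypothetical function standing in for whatever paginated source you actually have (database cursor, API client, and so on):

// Stream user records page by page instead of building one giant array.
async function* userRows() {
    let page = 0;
    let users;
    while ((users = await fetchUserPage(page++)).length > 0) {
        yield* users; // yield one user object at a time
    }
}

// Because writeToCsvFile passes its data through Readable.from(), the
// generator can be handed in directly in place of an array:
// await writeToCsvFile(userRows(), 'users.csv', ['UserID', 'Username', 'Email']);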

Writing substantial arrays to CSV files is a common data processing task, and Node.js gives developers the tools to do it efficiently. By using streams as demonstrated in the code above, you can keep memory usage under control and handle file I/O even for very large datasets.